Debugging

Logs

There are a few different logs that may be useful when debugging:

  • API, Server and Worker logs: By default these logs are in LOG_DIR/<gke-pod-name>.<gke-node-name>.log for GKE installations and LOG_DIR/<hostname>.log for local installations. For GKE the pod names will contain api, server or worker depending on their role in the cluster. Note that these paths may be overridden by the TURBINIA_LOG_FILE environment variable, so depending on your installation configuration the logs may be at a different location.

  • Task logs: These are logs that are generated by Tasks running in the Worker. These logs are in the Task output directory, which is OUTPUT_DIR/<request_id>/<epoch>-<task_id>-<task_name>/. The task logs will also show up as saved output for the task, and the paths can be determined by running turbinia-client status task <task-id> -a (note that -a is required to show all data associated with the tasks). To retrieve all files associated with a given task, including the task logs, you can run turbinia-client result task <task id> and it will download a compressed file with all of the output.

  • Task dependency execution logs: These are the logs that are generated by binaries executed from a Task. For example, Plaso generates its own log file, and that gets saved by the Task. As with the Task logs, these are stored in the Task output directory. See the above Task logs section for how to determine the paths and retrieve the output.

  • Google Cloud Error Reporting / Stackdriver Logs: If you have deployed Turbinia in GKE and/or have STACKDRIVER_TRACEBACK enabled in the config, the logs and/or traceback exceptions will also be logged to GCP Monitoring for the same project that Turbinia is in.
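As a rough illustration of the path rules for a local installation, the following shell sketch works out which log file to look at. It assumes that TURBINIA_LOG_FILE, when set, holds the full path to the log file and that LOG_DIR is available in the environment. The function name turbinia_log_path is hypothetical and not part of Turbinia.

```shell
# Hypothetical helper, not part of Turbinia: print the log file path for a
# local installation.
# Assumption: TURBINIA_LOG_FILE, when set, is the full path to the log file.
turbinia_log_path() {
  # Optional first argument overrides the hostname (handy for testing).
  host="${1:-$(hostname)}"
  if [ -n "${TURBINIA_LOG_FILE:-}" ]; then
    printf '%s\n' "$TURBINIA_LOG_FILE"
  else
    printf '%s/%s.log\n' "${LOG_DIR:?LOG_DIR must be set}" "$host"
  fi
}
```

With the function defined, something like tail -f "$(turbinia_log_path)" would follow the local log as it is written.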

Request Debugging

Finding previous request data

When a processing request is submitted via the client (e.g. turbinia-client submit <evidence type>), the client prints the request ID for that request. This ID can be used with turbinia-client status request <request id> to get the current state of the processing request. Alternatively, you can use the web UI to get details and download the output. For more details on using the client see the documentation here. If you do not have your request ID, you can list a summary of all recent requests and their request IDs with: turbinia-client status summary.
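When the summary is long, it can help to pull out only the requests that had failures. The sketch below is one way to do that, assuming the summary output uses the "## Request ID:" and "* Failed tasks:" lines shown in the example scenario later on this page. The function name requests_with_failures is hypothetical and not part of Turbinia.

```shell
# Hypothetical helper, not part of Turbinia: read `turbinia-client status
# summary` output on stdin and print the IDs of requests with failed Tasks.
# Assumption: the output uses the "## Request ID:" and "* Failed tasks:"
# lines shown in the example scenario on this page.
requests_with_failures() {
  awk '
    /^## Request ID: /   { id = $4 }
    /^\* Failed tasks: / { if ($4 + 0 > 0) print id }
  '
}
```

It would be used as a filter: turbinia-client status summary | requests_with_failures.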

Task Debugging

Full Task report data

To see the request output, run turbinia-client status request <request id>. By default this output is filtered to show only high priority Task report output (as determined by the Task at runtime) in order to hide uninteresting report info. You can specify -p <prio num> to set which priority to show in the output report. Priorities are defined in the Priority class here and range from 0 to 100, where 0 is the highest priority and 100 is the lowest. To see all report output, specify -p 100. To also show all saved files, specify -a. Putting these together shows all output, which can be quite a large amount of data: turbinia-client status request <request id> -a -p 100.
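The flags described here can be wrapped in a small shell function that rejects priorities outside the 0 to 100 range before calling the client. This is only a sketch: request_report and the TURBINIA_CLIENT override variable are hypothetical names introduced for illustration, and the default of 100 is a choice made by this sketch, not the client's own default.

```shell
# Hypothetical wrapper, not part of Turbinia: show a request report at a
# chosen priority cutoff, rejecting values outside the documented 0-100 range.
# TURBINIA_CLIENT is an assumed override variable so the command can be
# swapped out (for example with `echo` for a dry run).
request_report() {
  request_id="$1"
  priority="${2:-100}"   # this sketch defaults to 100 (lowest), showing everything
  case "$priority" in
    ''|*[!0-9]*)
      echo "priority must be an integer from 0 to 100" >&2
      return 2
      ;;
  esac
  if [ "$priority" -gt 100 ]; then
    echo "priority must be an integer from 0 to 100" >&2
    return 2
  fi
  # -a also lists all saved files for each Task.
  "${TURBINIA_CLIENT:-turbinia-client}" status request "$request_id" -a -p "$priority"
}
```

For example, request_report <request id> 50 would show report output for priorities 0 through 50 along with all saved files.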

Determining Worker status

If you want to get a view of the Workers and what Tasks they are running, you can run turbinia-client status workers.

Writing debug logs for binary dependencies

Turbinia Tasks have a debug tasks option that, when enabled, turns on debug output for any binary dependencies that support a debug flag. This is useful when debugging various task and dependency issues. This can be expensive, so it is turned off by default. You can specify a Recipe with debug_tasks = True when submitting a request to enable this for a single request, or you can set DEBUG_TASKS = True in the config file to turn on these debug logs for all Tasks. See here for information on creating and using Recipes.

Tasks stuck in Pending state

If you create a processing request and it hangs with all Tasks in a pending state, this is usually because there are no Workers available to accept the Tasks for execution. This can be because either all the Workers are executing other Tasks or because the Workers are failing to come up for some reason. You can use turbinia-client status workers to see what Turbinia thinks the states of the Workers are. If the Workers are crashing, though, this will not show their true state, so you may need to tail the Worker logs to see what is going on (see the API, Server and Worker logs entry in the Logs section above for where to find them).

Example Debugging Scenario

The following is an example of using turbinia-client to track down recent processing requests and debug a related Task failure.

Get the summary detail of the recent requests:

$ turbinia-client status summary

## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Last Update: 2024-04-13T01:42:32.821413Z
* Requester: turbiniauser
* Reason:
* Status: successful
* Failed tasks: 0
* Running tasks: 0
* Successful tasks: 16
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name1
* Evidence ID: None

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23

The above output shows that two requests were recently executed, but it doesn’t show anything about the Tasks associated with those requests. Now that we have the request IDs, we can use them to find more info. turbinia-client status request <req id> will get the details for the given request ID and the associated Tasks that were generated to process the input Evidence.

$ turbinia-client status request 63de1268000f97d85198949e3fdb8dk5

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23

[...snip other Task details...]

### PsortTask (MEDIUM PRIORITY): Execution failed with status 1
### PlasoParserTask (MEDIUM PRIORITY): Completed successfully in 0:00:14.178579 on turbinia-worker-1

In the example output above we can see that the PsortTask failed, but we don’t have any further details. By default the client only shows those details for Tasks with high priority findings in order to keep the output manageable.

To show more details for Tasks with lower priorities, we can set the priority filter with --priority_filter or -p. To see all Tasks, we can set -p 100, which is the lowest priority.

In order to get the Task IDs and output file paths, we can specify -a to get all the info we have about each Task. This will include the request ID, Task ID, and the other logs associated with the Task.

$ turbinia-client status request -p 100 -a 63de1268000f97d85198949e3fdb8dk5

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23

[...snip other request details...]

## PsortTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Execution failed with status 1
* Task Id: 5159b8915b5b4443bcbf3b6c2b9e7cf2
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-hv6i_eb3.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stdout-2zgr_mal.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-p1g6kwoh.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/worker-log.txt`

## PlasoParserTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Completed successfully in 0:00:14.178579 on turbinia-worker-1
* Task Id: a34a3d37c4365d383c3f617169d04327
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-x59665r7.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stdout-0uqvpe6d.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-j71qrl_u.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso.metadata.json`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/worker-log.txt`

The above output gives us several paths that are useful for debugging:

  • The worker log (worker-log.txt) will show a subset of the logs from the Task code. This will have anything that is logged from the Task using TurbiniaResult.log.

  • The stderr and stdout for all executions of binary dependencies (e.g. Plaso) are available in the filenames prefixed with stderr- or stdout- respectively.

  • Paths in /tmp/ are the temporary locations for output and are local to the Worker that executed that Task. The hostname for the Worker this Task was executed on is also listed in the output above.
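Each saved file path encodes the request ID, the Task start epoch, the Task ID and the Task name in its directory components (<request_id>/<epoch>-<task_id>-<task_name>/). The following sketch splits a path into those fields using plain shell parameter expansion. The function name parse_task_path is hypothetical, and it assumes that neither the epoch nor the Task ID contains a "-".

```shell
# Hypothetical helper, not part of Turbinia: split a saved Task file path
# into the fields encoded in its directory name,
# <request_id>/<epoch>-<task_id>-<task_name>/.
# Assumption: neither the epoch nor the Task ID contains a "-".
parse_task_path() {
  task_dir="$(dirname "$1")"     # .../<request_id>/<epoch>-<task_id>-<task_name>
  request_id="$(basename "$(dirname "$task_dir")")"
  leaf="$(basename "$task_dir")"
  epoch="${leaf%%-*}"            # text before the first "-"
  rest="${leaf#*-}"              # drop the "<epoch>-" prefix
  task_id="${rest%%-*}"          # text before the next "-"
  task_name="${rest#*-}"         # everything after "<task_id>-"
  printf 'request_id=%s\nepoch=%s\ntask_id=%s\ntask_name=%s\n' \
    "$request_id" "$epoch" "$task_id" "$task_name"
}
```

Passing the worker-log.txt path of the PsortTask from the output above would print its request ID, the epoch 1713143967, the Task ID 5159b8915b5b4443bcbf3b6c2b9e7cf2 and the Task name PsortTask, one per line.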

To get all of the output files that have been saved to the OUTPUT_DIR you can run the following command:

$ turbinia-client result task 5159b8915b5b4443bcbf3b6c2b9e7cf2
Saving output for task 5159b8915b5b4443bcbf3b6c2b9e7cf2 to: 5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz

Not all files will be copied into the OUTPUT_DIR, and some temporary files will remain in TMP_DIR on the Worker. In the example above, these are the files in the Saved Task Files list that start with /tmp. The method to get these files from the Worker depends on your instance configuration. As an example, here is how you might gather the 4148a8915b5b4443bcbf3b6c2b9e7cf2.csv file from a Worker running in a GKE/Kubernetes cluster.

$ gcloud container clusters get-credentials turbinia-cluster --zone us-central1-f --project turbinia-project-name

$ kubectl exec turbinia-worker-1 -- cat /tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv

Note: In this example case we wouldn’t actually need to do this because that same file was available in the output we were able to get with turbinia-client result task <task id>.
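To see at a glance which saved files live in the Worker's temporary directory, the Saved Task Files list can be filtered by path prefix. The sketch below assumes that saved files are printed as "* `<path>`" lines, as in the example output above, and that the temporary directory is /tmp unless another one is passed as the first argument. The function name worker_local_files is hypothetical and not part of Turbinia.

```shell
# Hypothetical helper, not part of Turbinia: read `turbinia-client status
# request ... -a` output on stdin and print the saved file paths that live
# under the Worker's temporary directory.
# Assumptions: saved files appear as "* `<path>`" lines, and the temporary
# directory is /tmp unless another directory is passed as the first argument.
worker_local_files() {
  awk -v dir="${1:-/tmp}" '
    /^\* `.*`$/ {
      # Drop the leading "* `" (3 characters) and the trailing "`".
      path = substr($0, 4, length($0) - 4)
      # Keep only paths that start with the temporary directory.
      if (index(path, dir "/") == 1) print path
    }
  '
}
```

It would be used as a filter: turbinia-client status request <request id> -a -p 100 | worker_local_files. Any path it prints that has no counterpart under OUTPUT_DIR has to be fetched from the Worker directly.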