Debugging
Logs
There are a few different logs that may be useful when debugging:
API, Server and Worker logs: By default these logs are in LOG_DIR/<gke-pod-name>.<gke-node-name>.log for GKE installations and LOG_DIR/<hostname>.log for local installations. For GKE, the pod names will contain api, server or worker depending on their role in the cluster. Note that these paths may be overridden by the TURBINIA_LOG_FILE environment variable, so depending on your installation configuration the logs may be at a different location.
Task logs: These are logs that are generated by Tasks running in the Worker. These logs are in the Task output directory, which is in OUTPUT_DIR/<request_id>/<epoch>-<task_id>-<task_name>/. The task logs will also show up as saved output for the task, and the paths can be determined by running turbinia-client status task <task-id> -a (note that -a is required to show all data associated with the tasks). To retrieve all files associated with a given task, including the task logs, you can run turbinia-client result task <task id> and it will download a compressed file with all of the output.
Task dependency execution logs: These are the logs that are generated by binaries executed from a Task. For example, Plaso generates its own log file, and that gets saved by the Task. As with the Task logs, these are stored in the Task output directory. See the Task logs entry above for how to determine the paths and retrieve the output.
Google Cloud Error Reporting / Stackdriver Logs: If you have deployed Turbinia in GKE and/or have STACKDRIVER_TRACEBACK enabled in the config, the logs and/or traceback exceptions will also be logged to GCP Monitoring for the same project that Turbinia is in.
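As a rough illustration of the path patterns above, the following Python sketch builds the default log path and the Task output directory. The function names are hypothetical helpers and not part of Turbinia, and the example LOG_DIR value is only an assumption; the patterns themselves are the ones listed above.

```python
import os
import posixpath
from typing import Optional


def default_log_path(log_dir: str, hostname: Optional[str] = None,
                     pod_name: Optional[str] = None,
                     node_name: Optional[str] = None) -> str:
    """Hypothetical helper: guess where the API/Server/Worker log lives."""
    # TURBINIA_LOG_FILE, when set, overrides the default location.
    override = os.environ.get("TURBINIA_LOG_FILE")
    if override:
        return override
    if pod_name and node_name:
        # GKE: LOG_DIR/<gke-pod-name>.<gke-node-name>.log
        return posixpath.join(log_dir, f"{pod_name}.{node_name}.log")
    if hostname:
        # Local: LOG_DIR/<hostname>.log
        return posixpath.join(log_dir, f"{hostname}.log")
    raise ValueError("need either hostname or pod_name and node_name")


def task_output_dir(output_dir: str, request_id: str, epoch: int,
                    task_id: str, task_name: str) -> str:
    """Hypothetical helper: OUTPUT_DIR/<request_id>/<epoch>-<task_id>-<task_name>."""
    return posixpath.join(output_dir, request_id,
                          f"{epoch}-{task_id}-{task_name}")


if __name__ == "__main__":
    # "/var/log/turbinia" is an assumed LOG_DIR used only for illustration.
    print(default_log_path("/var/log/turbinia", hostname="myhost"))
```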
Request Debugging
Finding previous request data
When a processing request is submitted via the client (e.g.
turbinia-client submit <evidence type>), the client prints the request ID for
that request. This ID can be used with
turbinia-client status request <request id> to get the current state of the
processing request. Alternatively, you can use the web
UI to get details and download the output. For
more details on using the client, see the documentation
here. If you do not have your request ID, you can list a
summary of all recent requests and their request IDs with
turbinia-client status summary.
Task Debugging
Full Task report data
To see the request output you can run
turbinia-client status request <request id>. By default this output is
filtered to show only high priority Task report output (as determined by the
Task at runtime) in order to leave out uninteresting report info. You can
specify -p <prio num> to set the priority you want to show in the output
report. Priorities are defined in the Priority class
here
and range from 0 to 100, where 0 is the highest priority and 100 is
the lowest, so to see all report output you can specify -p 100.
To also show all saved files you can specify -a. Putting these together
shows all output, which can be a large amount of data:
turbinia-client status request <request id> -a -p 100.
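The effect of the priority filter can be pictured with a small Python sketch. This is not Turbinia's implementation: it assumes the filter is an inclusive threshold on the priority number (lower numbers are more important), and the TaskReport type and the priority values 20, 50 and 80 are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskReport:
    """Illustrative stand-in for one Task's report output."""
    task_name: str
    priority: int  # 0 is the highest priority, 100 is the lowest
    text: str


def filter_reports(reports: List[TaskReport],
                   priority_filter: int) -> List[TaskReport]:
    """Keep reports whose priority number is at or below the filter value.

    Assumption: the -p filter behaves as an inclusive threshold.
    """
    return [r for r in reports if r.priority <= priority_filter]


# Illustrative priority values only, not taken from the Priority class.
REPORTS = [
    TaskReport("TaskA", 20, "important finding"),
    TaskReport("TaskB", 50, "medium priority detail"),
    TaskReport("TaskC", 80, "low priority detail"),
]

if __name__ == "__main__":
    # With a filter of 100 every report passes, matching "-p 100".
    for report in filter_reports(REPORTS, 100):
        print(report.task_name, report.priority)
```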
Determining Worker status
If you want to get a view of the Workers and what Tasks they are running, you
can run turbinia-client status workers.
Writing debug logs for binary dependencies
Turbinia Tasks have a debug tasks option that can be enabled so that the task
will turn on debug output for any binary dependencies that support a debug flag.
This is useful when debugging various task and dependency issues. This can be
expensive so it is turned off by default. You can specify a Recipe with
debug_tasks = True when submitting a request to enable this for a single
request or you can set DEBUG_TASKS = True in the config file to turn on these
debug logs for all Tasks. See here for information on creating and using
Recipes.
Tasks stuck in Pending state
If you create a processing request and it hangs with all Tasks in a pending
state, this is usually because there are no Workers available to accept the
Tasks for execution. This can be because either all the Workers are executing
other Tasks or because the Workers are failing to come up for some reason. You
can use turbinia-client status workers to see what Turbinia thinks the states
of the workers are. If the Workers are crashing, though, this will not show the
true state of the Workers, so you may need to tail the Worker logs to see what is
going on (see the API, Server and Worker logs entry in the Logs section above
for where to find them).
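If the Worker log is readable as a plain file (for example in a local installation), tailing it can be sketched in Python as below. The tail_log helper is hypothetical, and the path in the usage example is an assumed location that depends on your LOG_DIR and hostname.

```python
import os
from collections import deque
from typing import List


def tail_log(path: str, lines: int = 20) -> List[str]:
    """Return the last `lines` lines of a log file (hypothetical helper)."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


if __name__ == "__main__":
    # Assumed path for illustration: LOG_DIR/<hostname>.log
    example = "/var/log/turbinia/myhost.log"
    if os.path.exists(example):
        for entry in tail_log(example, lines=50):
            print(entry)
```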
Example Debugging Scenario
The following is an example of using turbinia-client to track down
recent processing requests and debug a related Task failure.
Get the summary detail of the recent requests:
$ turbinia-client status summary
## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Last Update: 2024-04-13T01:42:32.821413Z
* Requester: turbiniauser
* Reason:
* Status: successful
* Failed tasks: 0
* Running tasks: 0
* Successful tasks: 16
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name1
* Evidence ID: None
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
The above output shows that two requests were recently executed, but it
doesn’t show anything about the Tasks associated with those requests. Now
that we have the request IDs, we can use them to find more info.
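When there are many recent requests, picking out the ones with failed Tasks can be scripted. The Python sketch below parses summary text in the format shown above and returns the request IDs that report failed Tasks. The helper names are hypothetical, and the parsing assumes the output keeps the "## Request ID:" and "* key: value" layout shown in this example, which may differ between client versions.

```python
from typing import Dict, List

# Abbreviated copy of the summary output shown above.
SAMPLE_SUMMARY = """\
## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Last Update: 2024-04-13T01:42:32.821413Z
* Reason:
* Status: successful
* Failed tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name1
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Evidence Name: GoogleCloudDisk:processed-disk-name2
"""


def parse_summary(text: str) -> List[Dict[str, str]]:
    """Split summary text into one dict of fields per request."""
    requests: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## Request ID:"):
            requests.append({"Request ID": line.split(":", 1)[1].strip()})
        elif line.startswith("* ") and requests:
            # Split on the first colon only, since values may contain colons.
            key, _, value = line[2:].partition(":")
            requests[-1][key.strip()] = value.strip()
    return requests


def requests_with_failures(text: str) -> List[str]:
    """Return the request IDs whose "Failed tasks" count is above zero."""
    return [r["Request ID"] for r in parse_summary(text)
            if int(r.get("Failed tasks") or 0) > 0]


if __name__ == "__main__":
    print(requests_with_failures(SAMPLE_SUMMARY))
```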
turbinia-client status request <req id> will get the details for the given
request ID and the associated Tasks that were generated to process the input
Evidence.
$ turbinia-client status request 63de1268000f97d85198949e3fdb8dk5
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
[...snip other Task details...]
### PsortTask (MEDIUM PRIORITY): Execution failed with status 1
### PlasoParserTask (MEDIUM PRIORITY): Completed successfully in 0:00:14.178579 on turbinia-worker-1
In the example output above we can see that the PsortTask failed, but we
don’t have any further details because by default the client only shows
details for Tasks with high priority findings in order to
keep the output manageable.
To show more details for Tasks with lower priorities we can set the priority
filter with --priority_filter or -p, and to see all Tasks we can set -p 100,
which is the lowest priority.
In order to get the Task IDs and output file paths we can specify -a to get
all the info we have about each Task, including the request ID, Task ID,
and the other logs associated with the Task.
$ turbinia-client status request -p 100 -a 63de1268000f97d85198949e3fdb8dk5
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
[...snip other request details...]
## PsortTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Execution failed with status 1
* Task Id: 5159b8915b5b4443bcbf3b6c2b9e7cf2
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-hv6i_eb3.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stdout-2zgr_mal.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-p1g6kwoh.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/worker-log.txt`
## PlasoParserTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Completed successfully in 0:00:14.178579 on turbinia-worker-1
* Task Id: a34a3d37c4365d383c3f617169d04327
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-x59665r7.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stdout-0uqvpe6d.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-j71qrl_u.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso.metadata.json`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/worker-log.txt`
The above output gives us several paths that are useful for debugging:
The worker log (worker-log.txt) will show a subset of the logs from the Task code. This will have anything that is logged from the Task using TurbiniaResult.log.
The stderr and stdout for all executions of binary dependencies (e.g. Plaso) are available in the filenames prefixed with stderr- or stdout- respectively.
Paths in /tmp/ are the temporary locations for output and are local to the Worker that executed that Task. The hostname for the Worker this Task was executed on is also listed in the output above.
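The three kinds of paths described above can be separated mechanically. The following Python sketch groups a list of saved Task file paths by the naming conventions just listed. The classify_saved_files helper is hypothetical, and it assumes that worker-local temporary files live under /tmp as in this example.

```python
import posixpath
from typing import Dict, List


def classify_saved_files(paths: List[str],
                         tmp_dir: str = "/tmp") -> Dict[str, List[str]]:
    """Group saved Task file paths by the conventions described above."""
    groups: Dict[str, List[str]] = {
        "worker_local": [], "worker_log": [],
        "stderr": [], "stdout": [], "other": [],
    }
    tmp_prefix = tmp_dir.rstrip("/") + "/"
    for path in paths:
        name = posixpath.basename(path)
        if path.startswith(tmp_prefix):
            # Only present on the Worker that executed the Task.
            groups["worker_local"].append(path)
        elif name == "worker-log.txt":
            groups["worker_log"].append(path)
        elif name.startswith("stderr-"):
            groups["stderr"].append(path)
        elif name.startswith("stdout-"):
            groups["stdout"].append(path)
        else:
            groups["other"].append(path)
    return groups


# A few of the PsortTask paths from the output above.
TASK_DIR = ("63de1268000f97d85198949e3fdb8dk5/"
            "1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask")
EXAMPLE_PATHS = [
    f"/mnt/turbiniavolume/output/{TASK_DIR}/stderr-hv6i_eb3.txt",
    f"/mnt/turbiniavolume/output/{TASK_DIR}/stdout-2zgr_mal.txt",
    f"/tmp/{TASK_DIR}/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv",
    f"/mnt/turbiniavolume/output/{TASK_DIR}/worker-log.txt",
]

if __name__ == "__main__":
    for kind, members in classify_saved_files(EXAMPLE_PATHS).items():
        print(kind, len(members))
```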
To get all of the output files that have been saved to the OUTPUT_DIR you can run the following command:
$ turbinia-client result task 5159b8915b5b4443bcbf3b6c2b9e7cf2
Saving output for task 5159b8915b5b4443bcbf3b6c2b9e7cf2 to: 5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz
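Once the archive has been downloaded, its contents can be inspected without unpacking it. The Python sketch below lists the files in the archive and picks out any worker logs. The helper names are hypothetical, and it assumes the .tgz file is a tar archive (opened here with transparent compression detection).

```python
import os
import tarfile
from typing import List


def list_task_archive(archive_path: str) -> List[str]:
    """Return the sorted names of regular files inside the archive."""
    # "r:*" lets tarfile detect the compression instead of assuming gzip.
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def find_worker_logs(archive_path: str) -> List[str]:
    """Return archive members that look like the worker log file."""
    return [name for name in list_task_archive(archive_path)
            if name.endswith("worker-log.txt")]


if __name__ == "__main__":
    # File name taken from the example output above.
    archive_name = "5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz"
    if os.path.exists(archive_name):
        for member in list_task_archive(archive_name):
            print(member)
```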
Not all files will be copied into the OUTPUT_DIR and some temporary files will
remain in TMP_DIR on the worker. In the example above, these are the files in
the Saved Task Files list that start with /tmp. The method to get these
files from the worker depends on your instance configuration. As an
example, here is how you might gather the 4148a8915b5b4443bcbf3b6c2b9e7cf2.csv
file from a worker running in a GKE/Kubernetes cluster.
$ gcloud container clusters get-credentials turbinia-cluster --zone us-central1-f --project turbinia-project-name
$ kubectl exec turbinia-worker-1 -- cat /tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv
Note: In this example case we wouldn’t actually need to do this because that
same file was available in the output we were able to get with
turbinia-client result task <task id>.
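For cases where a worker-local file really is needed, the kubectl step above can be wrapped in a short Python sketch that saves the file locally. Both helpers are hypothetical; the sketch assumes kubectl is installed and already has credentials for the cluster (the gcloud step above), and the optional namespace argument is an assumption for clusters where the worker does not run in the default namespace.

```python
import subprocess
from typing import List, Optional


def kubectl_cat_command(pod: str, remote_path: str,
                        namespace: Optional[str] = None) -> List[str]:
    """Build the argv for `kubectl exec <pod> -- cat <remote_path>`."""
    command = ["kubectl"]
    if namespace:
        command += ["--namespace", namespace]
    command += ["exec", pod, "--", "cat", remote_path]
    return command


def fetch_worker_file(pod: str, remote_path: str, local_path: str,
                      namespace: Optional[str] = None) -> None:
    """Copy a worker-local file to local_path by capturing `cat` output."""
    with open(local_path, "wb") as handle:
        # check=True raises CalledProcessError if kubectl exits non-zero.
        subprocess.run(kubectl_cat_command(pod, remote_path, namespace),
                       stdout=handle, check=True)


if __name__ == "__main__":
    # Pod name and path are taken from the example above.
    print(" ".join(kubectl_cat_command(
        "turbinia-worker-1",
        "/tmp/63de1268000f97d85198949e3fdb8dk5/"
        "1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/"
        "4148a8915b5b4443bcbf3b6c2b9e7cf2.csv")))
```

Passing the command as a list rather than a shell string avoids quoting problems if a path contains spaces or shell metacharacters.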