# Debugging

## Logs

There are a few different logs that may be useful when debugging:

* **API, Server and Worker logs**: By default these logs are in `LOG_DIR/..log` for GKE installations and `LOG_DIR/.log` for local installations. For GKE the pod names will contain `api`, `server` or `worker` depending on their role in the cluster. These paths may be overridden by the `TURBINIA_LOG_FILE` environment variable, so depending on your installation configuration the logs may be at a different location.
* **Task logs**: These logs are generated by Tasks running in the Worker. They are stored in the Task output directory, which is in `OUTPUT_DIR//--/`. The Task logs also show up as saved output for the Task, and the paths can be determined by running `turbinia-client status task -a` (note that `-a` is required to show all data associated with the Tasks). To retrieve all files associated with a given Task, including the Task logs, you can run `turbinia-client result task ` and it will download a compressed file with all of the output.
* **Task dependency execution logs**: These logs are generated by binaries executed from a Task. For example, Plaso generates its own log file, and that gets saved by the Task. As with the Task logs, these are stored in the Task output directory. See the `Task logs` entry above for how to determine the paths and retrieve the output.
* **Google Cloud Error Reporting / Stackdriver Logs**: If you have deployed Turbinia in GKE and/or have `STACKDRIVER_TRACEBACK` enabled in the config, the logs and/or traceback exceptions will also be logged to GCP Monitoring for the same project that Turbinia is in.

## Request Debugging

### Finding previous request data

When processing requests are submitted via the client (e.g. `turbinia-client submit `), the client prints the request ID for that request. This can be used with `turbinia-client status request ` to get the current state of the processing request.
Alternatively, you can use the [web ui](turbinia-web-ui.md) to get details and download the output. For more details on using the client [see the documentation here](turbinia-client.md).

If you do not have your request ID, you can list a summary of all recent requests and their request IDs with `turbinia-client status summary`.

## Task Debugging

### Full Task report data

To see the request output you can run `turbinia-client status request `. By default this output is filtered to show only high priority Task report output (as determined by the Task at runtime) in order to leave out uninteresting report info. You can specify `-p ` to set the priority you want to show in the output report. Priorities [are defined in the Priority class here](https://github.com/google/turbinia/blob/0e575693eedf363468ee2ca666e8e7e9643e7ef8/turbinia/workers/__init__.py) and range from `0` to `100`, where `0` is the highest priority and `100` is the lowest, so to see all report output you can specify `-p 100`. To also show all saved files you can specify `-a`. Putting these together shows all output, which can be quite a large amount of data: `turbinia-client status request -a -p 100`.

### Determining Worker status

To get a view of the Workers and what Tasks they are running, you can run `turbinia-client status workers`.

### Writing debug logs for binary dependencies

Turbinia Tasks have a debug tasks option that makes the Task turn on debug output for any binary dependencies that support a debug flag. This is useful when debugging Task and dependency issues. It can be expensive, so it is turned off by default. You can specify a Recipe with `debug_tasks = True` when submitting a request to enable this for a single request, or you can set `DEBUG_TASKS = True` in the config file to turn on these debug logs for all Tasks. See [here for information on creating and using Recipes](recipes.md).

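The `status request` options described under "Full Task report data" above can be assembled in a script. The following is a minimal sketch, not part of Turbinia: the helper name `build_status_command` is hypothetical, the request ID is a sample value, and the only flags used are `-p` and `-a` as described above. It prints the command instead of running it, so it works without a Turbinia installation.

```python
import shlex


def build_status_command(request_id, priority=None, show_all=False):
    """Build the argument list for `turbinia-client status request`.

    priority: optional value for -p (0 is highest priority, 100 is lowest).
    show_all: when True, add -a to include all saved files and Task data.
    """
    if priority is not None and not 0 <= priority <= 100:
        raise ValueError("priority must be between 0 and 100")
    command = ["turbinia-client", "status", "request"]
    if priority is not None:
        command += ["-p", str(priority)]
    if show_all:
        command.append("-a")
    command.append(request_id)
    return command


if __name__ == "__main__":
    # Sample request ID; replace it with one printed by the client.
    argv = build_status_command(
        "63de1268000f97d85198949e3fdb8dk5", priority=100, show_all=True)
    # Print the command rather than executing it. To run it, pass the
    # list to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids shell quoting problems if the list is later passed to `subprocess.run`.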
### Tasks stuck in Pending state

If you create a processing request and it hangs with all Tasks in a pending state, this is usually because there are no Workers available to accept the Tasks for execution. Either all the Workers are executing other Tasks, or the Workers are failing to come up for some reason. You can use `turbinia-client status workers` to see what Turbinia thinks the states of the Workers are. If the Workers are crashing, however, this will not show their true state, so you may need to tail the Worker logs to see what is going on (see the `API, Server and Worker logs` entry in the Logs section above for where to find them).

## Example Debugging Scenario

The following is an example of using `turbinia-client` to track down recent processing requests and debug a related Task failure.

Get the summary detail of the recent requests:

```
$ turbinia-client status summary
## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Last Update: 2024-04-13T01:42:32.821413Z
* Requester: turbiniauser
* Reason:
* Status: successful
* Failed tasks: 0
* Running tasks: 0
* Successful tasks: 16
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name1
* Evidence ID: None
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
```

The above output shows that two requests were recently executed, but it doesn't show anything about the Tasks associated with those requests. Now that we have the request IDs, we can use them to find more info. `turbinia-client status request ` will get the details for the given request ID and the associated Tasks that were generated to process the input Evidence.

```
$ turbinia-client status request 63de1268000f97d85198949e3fdb8dk5
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
[...snip other Task details...]
### PsortTask (MEDIUM PRIORITY): Execution failed with status 1
### PlasoParserTask (MEDIUM PRIORITY): Completed successfully in 0:00:14.178579 on turbinia-worker-1
```

In the example output above we can see that the `PsortTask` failed, but we don't have any further details, because by default the client only shows details for Tasks with high priority findings in order to keep the output manageable. To show more details for Tasks with lower priorities we can set the priority filter with `--priority_filter` or `-p`, and to see all Tasks we can set `-p 100`, which is the lowest priority. To get the Task IDs and output file paths we can specify `-a`, which shows all the info we have about each Task, including the request ID, Task ID, and the other logs associated with the Task.

```
$ turbinia-client status request -p 100 -a 63de1268000f97d85198949e3fdb8dk5
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
[...snip other request details...]
## PsortTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Execution failed with status 1
* Task Id: 5159b8915b5b4443bcbf3b6c2b9e7cf2
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-hv6i_eb3.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stdout-2zgr_mal.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-p1g6kwoh.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/worker-log.txt`
## PlasoParserTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Completed successfully in 0:00:14.178579 on turbinia-worker-1
* Task Id: a34a3d37c4365d383c3f617169d04327
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-x59665r7.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stdout-0uqvpe6d.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-j71qrl_u.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso.metadata.json`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/worker-log.txt`
```

The above output gives us several paths that are useful for debugging:

* The worker log (`worker-log.txt`) shows a subset of the logs from the Task code. It contains anything that is logged from the Task using `TurbiniaResult.log`.
* The stderr and stdout for all executions of binary dependencies (e.g. Plaso) are available in the files prefixed with `stderr-` and `stdout-` respectively.
* Paths in `/tmp/` are the temporary locations for output and are local to the Worker that executed the Task. The hostname of the Worker the Task was executed on is also listed in the output above.

To get all of the output files that have been saved to the `OUTPUT_DIR` you can run the following command:

```
$ turbinia-client result task 5159b8915b5b4443bcbf3b6c2b9e7cf2
Saving output for task 5159b8915b5b4443bcbf3b6c2b9e7cf2 to: 5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz
```

Not all files will be copied into the `OUTPUT_DIR`, and some temporary files will remain in `TMP_DIR` on the Worker. In the example above, these are the files in the `Saved Task Files` list that start with `/tmp`. The method to get these files from the Worker depends on your instance configuration.
As an example, here is how you might gather the `4148a8915b5b4443bcbf3b6c2b9e7cf2.csv` file from a Worker running in a GKE/Kubernetes cluster.

```
$ gcloud container clusters get-credentials turbinia-cluster --zone us-central1-f --project turbinia-project-name
$ kubectl exec turbinia-worker-1 -- cat /tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv
```

*Note:* In this example we wouldn't actually need to do this, because the same file was available in the output we retrieved with `turbinia-client result task `.
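
When there are many recent requests, the first step of the scenario above (finding the requests that had failures) can be scripted. The following is a minimal sketch, not part of Turbinia. It assumes the text layout of `turbinia-client status summary` matches the example output shown earlier, that is, a `## Request ID:` heading followed by `* Key: value` lines. The function names `parse_summary` and `requests_with_failures` are hypothetical, and the output format may differ between Turbinia versions.

```python
import re

# Sample text taken from the `turbinia-client status summary` example above,
# trimmed to a few fields to keep the sketch short.
SAMPLE_SUMMARY = """\
## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Reason:
* Status: successful
* Failed tasks: 0
* Task Count: 16
## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Task Count: 16
"""


def parse_summary(text):
    """Split summary output into one dict of fields per request."""
    requests = []
    for line in text.splitlines():
        line = line.strip()
        heading = re.match(r"^## Request ID:\s*(\S+)$", line)
        if heading:
            # A new request section starts here.
            requests.append({"Request ID": heading.group(1)})
        elif line.startswith("* ") and requests:
            # Split on the first colon only, since values such as
            # timestamps and evidence names contain colons themselves.
            key, _, value = line[2:].partition(":")
            requests[-1][key.strip()] = value.strip()
    return requests


def requests_with_failures(text):
    """Return the IDs of requests that report at least one failed Task."""
    return [
        req["Request ID"]
        for req in parse_summary(text)
        if int(req.get("Failed tasks") or 0) > 0
    ]


if __name__ == "__main__":
    for request_id in requests_with_failures(SAMPLE_SUMMARY):
        print(request_id)
```

Each returned ID can then be passed to `turbinia-client status request -p 100 -a` as in the scenario above to see the Task details and saved file paths.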