# Debugging

## Logs

There are a few different logs that may be useful when debugging:

*   **API, Server and Worker logs**: By default these logs are in
    `LOG_DIR/<gke-pod-name>.<gke-node-name>.log` for GKE installations and
    `LOG_DIR/<hostname>.log` for local installations.  For GKE the pod names
    will contain `api`, `server` or `worker` depending on their role in the
    cluster.  Note that these paths may be overridden by the `TURBINIA_LOG_FILE`
    environment variable, so depending on your installation configuration the
    logs may be at a different location.
*   **Task logs**: These are logs that are generated by Tasks running in the
    Worker. These logs are in the Task output directory, which is
    `OUTPUT_DIR/<request_id>/<epoch>-<task_id>-<task_name>/`.  The task logs
    also show up as saved output for the task, and the paths can be
    determined by running `turbinia-client status task <task id> -a` (note that
    `-a` is required to show all data associated with the task).  To retrieve
    all files associated with a given task, including the task logs, you can
    run `turbinia-client result task <task id>`, which downloads a
    compressed file with all of the output.
*   **Task dependency execution logs**: These are the logs that are generated by
    binaries executed from a Task. For example, Plaso generates its own log
    file, and that gets saved by the Task. As with the Task logs, these are
    stored in the Task output directory.  See the above `Task logs` section for
    how to determine the paths and retrieve the output.
*   **Google Cloud Error Reporting / Stackdriver Logs**: If you have deployed
    Turbinia in GKE and/or have `STACKDRIVER_TRACEBACK` enabled in the config,
    the logs and/or traceback exceptions will also be logged to GCP Monitoring
    for the same project that Turbinia is in.
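
The default log locations described above can be sketched as a small helper. This is only an illustration: the function name `expected_log_path` is hypothetical, and it assumes that `TURBINIA_LOG_FILE`, when set, holds the full path of the log file.

```python
import os
import socket
from typing import Mapping, Optional


def expected_log_path(
    log_dir: str,
    pod_name: Optional[str] = None,
    node_name: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return where the API, Server or Worker log is expected to be."""
    # Assumption: the override variable contains a complete file path.
    override = env.get("TURBINIA_LOG_FILE")
    if override:
        return override
    base = log_dir.rstrip("/")
    if pod_name and node_name:
        # GKE layout: LOG_DIR/<gke-pod-name>.<gke-node-name>.log
        return f"{base}/{pod_name}.{node_name}.log"
    # Local layout: LOG_DIR/<hostname>.log
    return f"{base}/{socket.gethostname()}.log"
```

Passing the pod and node names selects the GKE layout; leaving them out falls back to the local hostname.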

## Request Debugging

### Finding previous request data
When a processing request is submitted via the client (e.g.
`turbinia-client submit <evidence type>`), the client prints the request ID for
that request. This ID can be used with
`turbinia-client status request <request id>` to get the current state of the
processing request.  Alternatively, you can use the [web
UI](turbinia-web-ui.md) to get details and download the output.  For
more details on using the client, [see the documentation
here](turbinia-client.md).  If you do not have your request ID, you can list a
summary of all recent requests and their request IDs with
`turbinia-client status summary`.
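
If there are many recent requests, it can help to pick out the ones with failed Tasks programmatically. The following is a minimal sketch that parses saved summary text. It assumes the text layout shown in the example scenario later in this document, which may differ between versions, and the helper names `parse_summary` and `requests_with_failures` are hypothetical.

```python
import re
from typing import Dict, List

# Sample text in the layout printed by `turbinia-client status summary`
# in the example scenario of this document (trimmed to a few fields).
SAMPLE = """
## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Status: successful
* Failed tasks: 0

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Status: completed_with_errors
* Failed tasks: 2
"""


def parse_summary(text: str) -> List[Dict[str, str]]:
    """Split summary text into one dict of fields per request."""
    requests: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"^## Request ID:\s*(\S+)$", line)
        if header:
            requests.append({"Request ID": header.group(1)})
            continue
        field = re.match(r"^\* ([^:]+):\s*(.*)$", line)
        if field and requests:
            # Attach "* Key: value" lines to the most recent request.
            requests[-1][field.group(1)] = field.group(2)
    return requests


def requests_with_failures(text: str) -> List[str]:
    """Return the IDs of requests that report at least one failed Task."""
    return [
        req["Request ID"]
        for req in parse_summary(text)
        if int(req.get("Failed tasks") or "0") > 0
    ]


print(requests_with_failures(SAMPLE))
```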


## Task Debugging

### Full Task report data
To see the request output you can run
`turbinia-client status request <request id>`. By default this output is
filtered to only show high priority Task report output (as determined by the
Task at runtime) in order to filter out uninteresting report info. You can
specify `-p <prio num>` to set what priority you want to show in the output
report. Priorities [are defined in the Priority class
here](https://github.com/google/turbinia/blob/0e575693eedf363468ee2ca666e8e7e9643e7ef8/turbinia/workers/__init__.py)
and can range from `0` to `100`, where `0` is the highest priority and `100` is
the lowest, so to see all report output you can specify `-p 100`.
To also show all saved files you can specify `-a`.  Putting these together
shows all output, which can be quite a large amount of data:
`turbinia-client status request <request id> -a -p 100`.
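
Because lower numbers mean higher priority, the filter direction is easy to get backwards. The sketch below illustrates the idea only: the task names and priority numbers are made up, and the inclusive "priority number less than or equal to the filter" comparison is an assumption inferred from `-p 100` showing everything, not taken from the Turbinia source.

```python
from typing import Iterable, List, Tuple


def filter_reports(
    reports: Iterable[Tuple[str, int]], priority_filter: int
) -> List[str]:
    """Keep reports at or above the requested priority (0 is the highest)."""
    return [name for name, priority in reports if priority <= priority_filter]


# Hypothetical (task name, priority number) pairs for illustration.
REPORTS = [("TaskA", 20), ("TaskB", 50), ("TaskC", 80)]

# A filter of 100 (the lowest priority) keeps every report.
print(filter_reports(REPORTS, 100))
# A stricter filter keeps only the more important reports.
print(filter_reports(REPORTS, 20))
```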

### Determining Worker status
If you want to get a view of the Workers and what Tasks they are running, you
can run `turbinia-client status workers`.


### Writing debug logs for binary dependencies
Turbinia Tasks have a `debug_tasks` option that, when enabled, makes the Task
turn on debug output for any binary dependencies that support a debug flag.
This is useful when debugging various Task and dependency issues.  It can be
expensive, so it is turned off by default. You can specify a Recipe with
`debug_tasks = True` when submitting a request to enable this for a single
request, or you can set `DEBUG_TASKS = True` in the config file to turn on these
debug logs for all Tasks.  See [here for information on creating and using
Recipes](recipes.md).

### Tasks stuck in Pending state

If you create a processing request and it hangs with all Tasks in a pending
state, this is usually because there are no Workers available to accept the
Tasks for execution.  This can be because either all the Workers are executing
other Tasks or because the Workers are failing to come up for some reason.  You
can use `turbinia-client status workers` to see what Turbinia thinks the states
of the Workers are.  If the Workers are crashing, though, this will not show the
true state of the Workers, so you may need to tail the Worker logs to see what
is going on (see the **API, Server and Worker logs** entry in the Logs section
above for where to find them).
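
For a local installation, looking at the end of a Worker log can be as simple as the following sketch. The helper name `tail_lines` is hypothetical, and the log path has to be supplied by you based on the locations described in the Logs section.

```python
from collections import deque
from typing import List


def tail_lines(path: str, count: int = 50) -> List[str]:
    """Return the last `count` lines of a log file without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, `tail_lines("/path/to/LOG_DIR/<hostname>.log", 100)` would return the last 100 lines of that file, with the placeholder path replaced by your real log location.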


## Example Debugging Scenario

The following is an example of using `turbinia-client` to track down recent
processing requests and debug a related Task failure.


Get the summary detail of the recent requests:
```
$ turbinia-client status summary

## Request ID: 5198949e3fdb8dk569de1268000f97d8
* Last Update: 2024-04-13T01:42:32.821413Z
* Requester: turbiniauser
* Reason:
* Status: successful
* Failed tasks: 0
* Running tasks: 0
* Successful tasks: 16
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name1
* Evidence ID: None

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23
```

The above output shows that two requests were recently executed, but it
doesn't show anything about the Tasks associated with those requests. Now
that we have the request IDs, we can use them to find more info.
`turbinia-client status request <req id>` will get the details for the given
request ID and the associated Tasks that were generated to process the input
Evidence.
```
$ turbinia-client status request 63de1268000f97d85198949e3fdb8dk5

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23

[...snip other Task details...]

### PsortTask (MEDIUM PRIORITY): Execution failed with status 1
### PlasoParserTask (MEDIUM PRIORITY): Completed successfully in 0:00:14.178579 on turbinia-worker-1
```

In the example output above we can see that the `PsortTask` failed, but we
don't have any further details because by default the client only shows those
details for Tasks with high priority findings in order to keep the output
manageable.

To show more details for Tasks with lower priorities we can set the priority
filter with `--priority_filter` or `-p`, and to see all Tasks we can set
`-p 100`, which is the lowest priority.


In order to get the Task IDs and output file paths we can specify `-a` to get
all the info we have about each Task. This includes the request ID, the Task ID,
and the other logs associated with the Task.

```
$ turbinia-client status request -p 100 -a 63de1268000f97d85198949e3fdb8dk5

## Request ID: 63de1268000f97d85198949e3fdb8dk5
* Last Update: 2024-04-13T01:45:33.141508Z
* Requester: turbiniauser
* Reason:
* Status: completed_with_errors
* Failed tasks: 2
* Running tasks: 0
* Successful tasks: 14
* Task Count: 16
* Queued tasks: 0
* Evidence Name: GoogleCloudDisk:processed-disk-name2
* Evidence ID: 79fa702fe21a476f8a560dab4fcdae23

[...snip other request details...]

## PsortTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Execution failed with status 1
* Task Id: 5159b8915b5b4443bcbf3b6c2b9e7cf2
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-hv6i_eb3.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stdout-2zgr_mal.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/stderr-p1g6kwoh.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/worker-log.txt`

## PlasoParserTask (MEDIUM PRIORITY)
* **Evidence:** GoogleCloudDisk:processed-disk-name2
* **Status:** Completed successfully in 0:00:14.178579 on turbinia-worker-1
* Task Id: a34a3d37c4365d383c3f617169d04327
* Executed on worker turbinia-worker-1
### Saved Task Files:
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-x59665r7.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stdout-0uqvpe6d.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/stderr-j71qrl_u.txt`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.log`
* `/tmp/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/a34a3d37c4365d383c3f617169d04327.plaso.metadata.json`
* `/mnt/turbiniavolume/output/63de1268000f97d85198949e3fdb8dk5/1713143969-a34a3d37c4365d383c3f617169d04327-PlasoParserTask/worker-log.txt`
```

The above output gives us several paths that are useful for debugging:
* The worker log (`worker-log.txt`) will show a subset of the logs from the Task
  code.  This will have anything that is logged from the Task using
  `TurbiniaResult.log`.
* The stderr and stdout for all executions of binary dependencies (e.g. Plaso)
  are available in the filenames prefixed with `stderr-` or `stdout-`
  respectively.
* Paths in `/tmp/` are the temporary locations for
  output and are local to the Worker that executed that Task.  The hostname
  for the Worker this Task was executed on is also listed in the output above.
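
The saved file paths follow the `<request_id>/<epoch>-<task_id>-<task_name>/` layout described in the Logs section, so they can be classified mechanically. The sketch below is an illustration only: `describe_saved_file` is a hypothetical helper, and treating `/tmp/` as the worker-local temporary location is an assumption taken from the example output above.

```python
import re
from typing import Dict, Optional

# .../<request_id>/<epoch>-<task_id>-<task_name>/<filename>
TASK_FILE = re.compile(
    r"/(?P<request_id>[^/]+)"
    r"/(?P<epoch>\d+)-(?P<task_id>[^/-]+)-(?P<task_name>[^/]+)"
    r"/(?P<filename>[^/]+)$"
)


def describe_saved_file(path: str) -> Optional[Dict[str, str]]:
    """Split a saved Task file path into its parts and label what it holds."""
    match = TASK_FILE.search(path)
    if not match:
        return None
    info = match.groupdict()
    name = info["filename"]
    if name.startswith("stderr-"):
        info["kind"] = "stderr"
    elif name.startswith("stdout-"):
        info["kind"] = "stdout"
    elif name == "worker-log.txt":
        info["kind"] = "worker log"
    else:
        info["kind"] = "task output"
    # Assumption: temporary files live under /tmp/ as in the example output.
    info["location"] = "worker-local" if path.startswith("/tmp/") else "saved"
    return info
```

Files labelled `worker-local` are the ones that have to be collected from the Worker itself.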


To get all of the output files that have been saved to the `OUTPUT_DIR` you can run the following command:
```
$ turbinia-client result task 5159b8915b5b4443bcbf3b6c2b9e7cf2
Saving output for task 5159b8915b5b4443bcbf3b6c2b9e7cf2 to: 5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz
```
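
Once the archive is downloaded you may want to check what it contains before unpacking it. This is a minimal sketch that assumes the downloaded file is a tar archive, as the `.tgz` extension suggests. The helper name `list_archive` is hypothetical.

```python
import tarfile
from typing import List


def list_archive(path: str) -> List[str]:
    """Return the member names stored in a downloaded Task output archive."""
    # "r:*" lets tarfile detect the compression (for example gzip) itself.
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()
```

For the download above, `list_archive("5159b8915b5b4443bcbf3b6c2b9e7cf2.tgz")` would return the names of the files inside the archive.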

Not all files will be copied into the `OUTPUT_DIR`, and some temporary files
will remain in `TMP_DIR` on the Worker.  In the example above, these are the
files in the `Saved Task Files` list that start with `/tmp`. The method to get
these files from the Worker depends on your instance configuration.  As an
example, here is how you might gather `4148a8915b5b4443bcbf3b6c2b9e7cf2.csv`
from a Worker running in a GKE/Kubernetes cluster.

```
$ gcloud container clusters get-credentials turbinia-cluster --zone us-central1-f --project turbinia-project-name

$ kubectl exec turbinia-worker-1 -- cat /tmp/63de1268000f97d85198949e3fdb8dk5/1713143967-5159b8915b5b4443bcbf3b6c2b9e7cf2-PsortTask/4148a8915b5b4443bcbf3b6c2b9e7cf2.csv
```

*Note:* In this example case we wouldn't actually need to do this because that
same file was available in the output we were able to get with
`turbinia-client result task <task id>`.
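
When several worker-local files need to be collected, the `kubectl exec ... -- cat` call shown above can be wrapped in a script. The following is a rough sketch, not part of Turbinia: the helper names are hypothetical, it assumes the pod name matches the Worker name reported by the client (as in the example), and it requires that cluster credentials have already been fetched with `gcloud`.

```python
import subprocess
from typing import List


def kubectl_cat_command(pod: str, remote_path: str) -> List[str]:
    """Build the `kubectl exec <pod> -- cat <path>` call as an argv list."""
    return ["kubectl", "exec", pod, "--", "cat", remote_path]


def fetch_worker_file(pod: str, remote_path: str, local_path: str) -> None:
    """Stream a worker-local file into local_path via kubectl."""
    with open(local_path, "wb") as handle:
        # check=True raises CalledProcessError if kubectl exits non-zero.
        subprocess.run(
            kubectl_cat_command(pod, remote_path), stdout=handle, check=True
        )
```

Passing the command as a list avoids shell quoting problems with unusual file names.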