Debugging

General Debugging

A few simple techniques can help you get started debugging code that uses the carb.tasking plugin. In general, debugging on Windows with Visual Studio 2019 or later is recommended, as GUI tools such as Parallel Stacks make problems easier to visualize. Debugging with VSCode or GDB is far more challenging.

debugTaskBacktrace

It can be difficult to identify a task or what queued it. The debugTaskBacktrace debug setting exists for this reason. When symbols are available, it captures the callstack that spawned the task. While a task is actively running on a worker thread, the carb::tasking::TaskBundle::tryExecute() frame of the callstack has a DebugTaskCreationCallstack local variable that holds the callstack that created the task. For tasks that are off-thread and waiting, the callstack leading up to the wait is also captured.

../../_images/tasking-debugTaskBacktrace.png

Showing All Tasks

You can see a list of all tasks known to the system by adding this to your watch window: carb.tasking.plugin.dll!carb::tasking::Scheduler::s_scheduler->m_taskHandleDb

When the carb.natvis file that ships with Carbonite is in use, this expression allows inspection of all tasks currently known to the tasking system. The TaskContext (handle) identifying the task is shown under the Name column. The Value column for a task shows a state followed by the function that the task will execute. Each task can be expanded in the debugger to see additional information about it. The possible task states are as follows:

  • [pending] - This task has unmet prerequisites, so it cannot be started. While the prerequisites themselves are not immediately visible, if debugTaskBacktrace is enabled, the creation callstack for the task ([value] -> m_backtrace -> [ptr] -> [creation]) may yield clues.

  • [new] - This task is ready to run and is waiting for an available worker thread. A fiber typically has not been assigned to this task yet.

  • [running] - This task is currently running on one of the worker threads.

  • [waiting] - This task has started but is waiting. If debugTaskBacktrace is enabled, the waiting callstack is available under [value] -> m_backtrace -> [ptr].

  • [finished/canceled] - This task is finished or canceled. This entry is uncommon because a task is typically released at this point; its presence indicates that something is retaining a reference to the task.

../../_images/tasking-debugTasks.png

Debugging Deadlocks

The carb.tasking plugin works with a thread pool that has a limited number of threads. If all of these threads become blocked, the system will deadlock.

Emergency Threads

By default, if a certain amount of time passes during which no tasking thread has yielded its current fiber while other fibers are ready to resume (note: new tasks are not counted), carb.tasking considers itself “stuck.” When this happens, it logs a warning (carb.tasking is likely stuck) and starts an emergency worker thread. The emergency thread selects the next available fiber to resume and exits once that task yields or finishes.

Note

This behavior is controlled by the setting key stuckCheckSeconds.
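
The stuck condition described above can be pictured with a small standalone sketch. This is not carb.tasking's implementation: the StuckDetector class, its method names, and the choice to compare with "at least the threshold" are all hypothetical, and the threshold parameter merely stands in for the role that stuckCheckSeconds plays.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical illustration only; not carb.tasking's implementation.
class StuckDetector
{
public:
    using Clock = std::chrono::steady_clock;

    // threshold plays the role of the stuckCheckSeconds setting (assumption).
    StuckDetector(Clock::time_point start, std::chrono::seconds threshold)
        : m_lastYield(start), m_threshold(threshold)
    {
    }

    // Call whenever a worker thread yields its current fiber.
    void noteYield(Clock::time_point now)
    {
        m_lastYield = now;
    }

    // True when fibers are ready to resume but no worker has yielded for at
    // least the threshold. The caller is expected to exclude new (not yet
    // started) tasks from readyFibers.
    bool isStuck(Clock::time_point now, std::size_t readyFibers) const
    {
        return readyFibers > 0 && (now - m_lastYield) >= m_threshold;
    }

private:
    Clock::time_point m_lastYield;
    std::chrono::seconds m_threshold;
};
```

In this sketch, a monitor that observes isStuck() returning true would be the point at which a warning is logged and an extra thread is started to resume one ready fiber.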

Running out of Fibers

If the number of fibers has been limited by a carb::tasking::ITasking::changeParameters call, or an excessive number of tasks have started but are waiting, the system will starve for lack of fibers. This can be identified by warning log messages that state “Out of fibers; too many tasks are waiting.” If this message appears in the logs, it can be helpful to show all tasks and see what the [waiting] tasks are waiting on.

In general, it is much more efficient to use carb::tasking::ITasking::addSubTask() to wait on a carb::tasking::RequiredObject than to call a wait function from within a lambda. The former does not require a fiber until the RequiredObject becomes signaled, whereas the latter occupies a fiber that stays suspended until the RequiredObject becomes signaled.
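
The difference between the two approaches can be illustrated with a generic, single-threaded analogy in standard C++. The Signal class below is hypothetical and is not a carb.tasking type; it only shows why registering a continuation is cheap: the callable is stored, and no thread, fiber, or stack is held while the condition is unmet.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a signalable object; not a carb.tasking type.
// Single-threaded sketch: no locking is performed.
class Signal
{
public:
    // Registers work to run once the signal fires. Only the callable is
    // stored; nothing is suspended while waiting.
    void then(std::function<void()> fn)
    {
        if (m_signaled)
            fn(); // already signaled: run immediately
        else
            m_continuations.push_back(std::move(fn));
    }

    // Fires the signal and runs every stored continuation.
    void signal()
    {
        m_signaled = true;
        std::vector<std::function<void()>> ready;
        ready.swap(m_continuations);
        for (auto& fn : ready)
            fn();
    }

    // Number of continuations still waiting for the signal.
    std::size_t pending() const
    {
        return m_continuations.size();
    }

private:
    bool m_signaled = false;
    std::vector<std::function<void()>> m_continuations;
};
```

By contrast, a task that starts running and then blocks on the same condition must keep its execution context alive for the entire wait, which is the cost the paragraph above attributes to waiting inside a lambda.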

Waiting in a non-Fiber-Safe Manner

Deadlocks can also occur when all worker threads are running tasks that are waiting in a non-fiber-safe manner. For instance, all worker threads could be running a task that awaits a std::condition_variable (which is not fiber-safe). However, the task that would signal the condition_variable and release a waiting task cannot run because all worker threads are busy. In this case, using a fiber-aware primitive such as carb::tasking::ConditionVariableWrapper allows the worker threads to run other tasks while a task is blocked. The carb.tasking plugin provides many synchronization primitives that are fiber-aware, and many are direct replacements for std or carb primitives.
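
The starvation pattern can be reproduced outside of carb.tasking with a small sketch in standard C++. Everything here is hypothetical (the TinyPool class, countStarvedWaiters, and the 200 ms timeout); the timeout exists only so that the demonstration terminates instead of deadlocking. Two workers both pick up tasks that block on a std::condition_variable, so the task that would signal them stays queued until a waiter gives up.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal fixed-size thread pool (hypothetical; not carb.tasking).
class TinyPool
{
public:
    explicit TinyPool(std::size_t workers)
    {
        for (std::size_t i = 0; i < workers; ++i)
            m_threads.emplace_back([this] { run(); });
    }

    // Drains the remaining queue, then joins all workers.
    ~TinyPool()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cv.notify_all();
        for (auto& t : m_threads)
            t.join();
    }

    void submit(std::function<void()> fn)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(fn));
        }
        m_cv.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> fn;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_stop || !m_queue.empty(); });
                if (m_queue.empty())
                    return; // stop requested and nothing left to run
                fn = std::move(m_queue.front());
                m_queue.pop();
            }
            fn();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_queue;
    std::vector<std::thread> m_threads;
    bool m_stop = false;
};

// Returns how many waiters timed out before the signaling task could run.
int countStarvedWaiters()
{
    std::mutex m;
    std::condition_variable ready;
    bool flag = false;
    int timedOut = 0;
    {
        TinyPool pool(2);
        for (int i = 0; i < 2; ++i)
        {
            pool.submit([&] {
                std::unique_lock<std::mutex> lock(m);
                // Blocks the worker thread itself, not just the task.
                if (!ready.wait_for(lock, std::chrono::milliseconds(200), [&] { return flag; }))
                    ++timedOut;
            });
        }
        // Queued behind the waiters; cannot start until a worker is free.
        pool.submit([&] {
            {
                std::lock_guard<std::mutex> lock(m);
                flag = true;
            }
            ready.notify_all();
        });
    }
    return timedOut;
}
```

At least one waiter always times out in this sketch, because the signaling task cannot be dequeued until a worker returns from its blocked wait. Without the timeout, both workers would wait forever, which is the deadlock described above.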

Another example that can lead to deadlocks (or general slowness) is multiple tasks waiting on I/O. While a task waits on I/O, the worker thread running it cannot execute other tasks. A possible solution is to have a dedicated I/O thread. Tasks can then wait in a fiber-safe way by calling carb::tasking::ITasking::suspendTask, allowing the worker threads to perform other work. Once the I/O operation is complete, the dedicated thread can wake the task with carb::tasking::ITasking::wakeTask.

debugWaitingTasks

Waiting tasks are typically not visible to the debugger, but the debugWaitingTasks debug setting makes them visible. This setting carries a heavy performance cost: it allocates a thread for each waiting task so that the tasks can be viewed in Visual Studio’s Parallel Stacks display. With this setting enabled, each waiting task is visible as a separate thread with Scheduler::__WAITING_TASK__ in its callstack.

../../_images/tasking-debugWaitingTasks.png