Carbonite Audio Capture Interface

Overview

The IAudioCapture interface provides a means of capturing audio data from capture sources such as a microphone, headset, or line in. The interface is set up similarly to IAudioPlayback: a context object must be created and a device selected, and the context is then operated on to receive audio data. Device enumeration is supported, and a device is selected either when the context is created or with a call to IAudioCapture::setSource(). A context may only capture data from a single source at any given time.

Once the context is created, it will be in an idle state. To begin capturing audio from the selected source (assuming it is valid), the host app must start the capture operation with IAudioCapture::captureStart(). A capture operation may be started in either looping or non-looping mode. A non-looping capture runs until the buffer has been filled, at which point it is stopped automatically. A looping capture fills the buffer repeatedly: once the end of the buffer has been reached, capture continues from the start of the buffer. A looping capture must be explicitly stopped, either by calling IAudioCapture::captureStop() or by destroying the context.
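The difference between the two modes comes down to what happens when the write cursor reaches the end of the buffer. The following is a minimal, self-contained model of that behaviour rather than Carbonite code: `CaptureBufferModel` and all of its members are hypothetical names invented for illustration, and a single `int` stands in for one captured frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a capture buffer; not part of the Carbonite API.
struct CaptureBufferModel
{
    std::vector<int> frames; // one int stands in for one captured frame
    std::size_t cursor = 0;  // next frame index the "device" will write
    bool looping = false;
    bool running = true;

    CaptureBufferModel(std::size_t lengthInFrames, bool loop)
        : frames(lengthInFrames, 0), looping(loop)
    {
        assert(lengthInFrames > 0);
    }

    // Simulates the device delivering one frame. Returns false if the
    // capture has already stopped and the frame was dropped.
    bool write(int value)
    {
        if (!running)
            return false;
        frames[cursor++] = value;
        if (cursor == frames.size())
        {
            cursor = 0; // wrap back to the start of the buffer
            if (!looping)
                running = false; // non-looping: stop once the buffer is full
        }
        return true;
    }
};
```

In this model a non-looping buffer of four frames keeps the first four values written and drops the rest, while a looping buffer of the same size keeps running and overwrites its oldest frames.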

The host app is free to create multiple capture contexts if needed. It is not guaranteed that multiple contexts will be able to operate on a given source simultaneously. This behaviour will be system and device dependent - some operating systems, drivers, or devices may allow for multiple simultaneous captures from a single device while others may not. In general it is best practice to only have a single capture context open on any given sound source device.

Capture buffers may be looping. A looping buffer continuously overwrites previous data each time it reaches the end of the buffer. This mode is useful for live streaming situations where audio data needs to be consumed and acted on in small chunks (e.g. played back live or streamed to other users or systems). With a looping buffer, the host app must read data from the buffer fast enough that unread data is not overwritten when the capture wraps around to the start of the buffer. A non-looping buffer could be used for offline recording sessions where the length of the recording is known in advance (e.g. record the next 10 seconds of data).

The captured data can be accessed in one of two ways:

by periodically calling IAudioCapture::lock() and IAudioCapture::unlock(). This will directly retrieve a segment of the capture device’s buffer that you can access. A downside of this is that the data needs to be consumed quickly so that a capture overrun does not occur.
by periodically calling IAudioCapture::read(). This will copy data from the capture buffer to a local buffer.

Both methods read the same data, but in some applications lock() and unlock() are more efficient or convenient because no data needs to be copied to a local buffer. The audio device’s buffer is divided into a set of 8 chunks, and each new chunk becomes available to read with IAudioCapture::read() or IAudioCapture::lock() after it is filled. If all 8 chunks become filled while a looping capture is occurring, this is a capture overrun, as the capture device is now overwriting data that has not yet been read. If the capture device’s cursor hits a region locked by IAudioCapture::lock(), this is also an overrun. If an overrun occurs, the contents of the buffer will be cleared but the capture cursor will not be reset; this behaviour is the result of a deficiency in DSound, which was used in an older version of IAudioCapture.
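The chunk accounting described above can be sketched as a small bookkeeping model. This is an illustration only: `ChunkedCaptureModel` and its member names are hypothetical, it tracks chunk counts rather than audio data, and it does not model locked regions. Moving the read position up to the write cursor after an overrun is an assumption made here to keep the model consistent, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping model of the chunked capture buffer; the names
// are illustrative and not part of the Carbonite API.
class ChunkedCaptureModel
{
public:
    static constexpr std::size_t kChunkCount = 8;

    // Device side: one more chunk has been filled. Returns true on overrun.
    bool onChunkFilled()
    {
        assert(m_filled < kChunkCount);
        m_writeChunk = (m_writeChunk + 1) % kChunkCount;
        ++m_filled;
        if (m_filled < kChunkCount)
            return false;
        // Overrun: discard unread data but leave the write cursor in place.
        m_filled = 0;
        m_readChunk = m_writeChunk; // assumption: reader resumes at the cursor
        return true;
    }

    // Reader side: consume one filled chunk if any is available.
    bool consumeChunk()
    {
        if (m_filled == 0)
            return false;
        m_readChunk = (m_readChunk + 1) % kChunkCount;
        --m_filled;
        return true;
    }

    std::size_t filledChunks() const { return m_filled; }
    std::size_t writeChunk() const { return m_writeChunk; }

private:
    std::size_t m_writeChunk = 0; // next chunk the device will fill
    std::size_t m_readChunk = 0;  // next chunk the reader will consume
    std::size_t m_filled = 0;     // filled chunks not yet consumed
};
```

As long as the reader calls `consumeChunk()` at least as often as the device fills chunks, the filled count never reaches 8 and no overrun is reported.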

The frame rate, sample size, sample type, and channel count can be set when the context is created or a new source is selected with setSource(). All captured streams will consist of interleaved channels of data. Each frame of the stream will consist of exactly one sample for each captured channel. The length of the capture buffer can be set when creating the context or selecting the device.
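Because every frame holds exactly one sample per channel, buffer sizes and sample positions follow directly from the format. The helpers below are a sketch of that arithmetic; `StreamFormat` and the function names are hypothetical and do not mirror any Carbonite structure.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a capture format; field names are
// illustrative only.
struct StreamFormat
{
    std::size_t frameRate;      // frames per second
    std::size_t channels;       // samples per frame
    std::size_t bytesPerSample; // e.g. 2 for 16-bit PCM
};

// Size of one interleaved frame in bytes.
inline std::size_t bytesPerFrame(const StreamFormat& f)
{
    return f.channels * f.bytesPerSample;
}

// Bytes needed to hold `milliseconds` of audio in this format.
inline std::size_t bufferBytes(const StreamFormat& f, std::size_t milliseconds)
{
    return f.frameRate * milliseconds / 1000 * bytesPerFrame(f);
}

// Byte offset of the sample for `channel` within frame number `frame`.
inline std::size_t sampleOffset(const StreamFormat& f, std::size_t frame, std::size_t channel)
{
    assert(channel < f.channels);
    return frame * bytesPerFrame(f) + channel * f.bytesPerSample;
}
```

For example, 100ms of 48000 Hz, 2-channel, 16-bit audio is 4800 frames of 4 bytes each, or 19200 bytes, and the second channel's sample in frame 3 starts at byte 14.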

Once a capture has been stopped, whether by ending naturally for a non-looping capture or by being explicitly stopped with captureStop(), the last contents of the capture buffer will remain valid. Data can still be read when the device is stopped, if any data was unread from the buffer before the device was stopped.

Typical Usage

The simplest way to use the IAudioCapture interface is to create a new context by passing nullptr to createContext(). This will create a new context that uses the system’s default capture device as its source and uses a small (around 100ms) buffer for capture. This type of context is intended to perform a looping capture with frequent read operations. The data can be read from the context by repeatedly calling read() every 25ms to 50ms. When the capture operation is complete, the context can be destroyed with destroyContext().
Device enumeration can be performed using getDeviceCount() and getDeviceCaps(). This can be used to search for a specific capture device to use as the source for a new context. Once an appropriate device has been found, a new context can be created with createContext() specifying the index of the chosen device. The capture can then be performed as above.
An existing capture context can be switched to a new source at any time when a capture is not running. This can be done with setSource(). The device can be chosen using the device enumeration method described above. The captured data format, buffer size, and channel layout can be chosen at this time as well.
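The polling pattern from the first usage case can be sketched without the real interface. In the code below, `FakeCapture` is a stand-in invented for this example. Its `read()` does not match the real IAudioCapture::read() signature, and elapsed time is simulated with `advance()` where a real app would sleep or run other work between reads. The sketch assumes mono 16-bit frames.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a capture context. Everything here is hypothetical; the real
// IAudioCapture functions take different parameters.
struct FakeCapture
{
    std::size_t pending = 0; // frames captured but not yet read

    // Pretend the device ran for `ms` milliseconds at `frameRate`.
    void advance(std::size_t ms, std::size_t frameRate)
    {
        pending += frameRate * ms / 1000;
    }

    // Copies up to `maxFrames` mono 16-bit frames into `dst` and returns
    // the number of frames copied.
    std::size_t read(std::int16_t* dst, std::size_t maxFrames)
    {
        assert(dst != nullptr || maxFrames == 0);
        const std::size_t n = std::min(pending, maxFrames);
        std::fill(dst, dst + n, std::int16_t{0}); // silence as placeholder data
        pending -= n;
        return n;
    }
};

// Polls the capture `polls` times, `intervalMs` apart, and returns the total
// number of frames consumed.
inline std::size_t pollCapture(FakeCapture& capture, std::size_t polls,
                               std::size_t intervalMs, std::size_t frameRate)
{
    std::vector<std::int16_t> local(frameRate / 10); // 100ms local buffer
    std::size_t total = 0;
    for (std::size_t i = 0; i < polls; ++i)
    {
        capture.advance(intervalMs, frameRate); // a real app would wait here
        total += capture.read(local.data(), local.size());
    }
    return total;
}
```

Polling every 25ms at 48000 Hz yields 1200 frames per read, well under the 4800 frames that a 100ms buffer holds, which is why a 25ms to 50ms read interval leaves headroom against overruns.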