Sounddevice callbacks without Python interpreter

Published

February 27, 2025

Abstract
Numba compiles Python functions to machine code. A C function pointer to that machine code can be passed directly to PortAudio as a callback. This enables real-time audio processing without the Python interpreter in the callback loop.

Anyone working with audio and Python should know about sounddevice. It provides Python bindings to the PortAudio library using cffi, together with a Pythonic API for using them. It’s quite straightforward: to process some audio input in real time, implement the processing of a single block as a callback and instantiate a sounddevice.Stream with it as a context manager.

from sounddevice import Stream, sleep

def callback(indata, outdata, frames, time, status):
    """Passthrough."""
    outdata[:] = indata
    return 0

with Stream(callback=callback) as stream:
    sleep(1000)

This starts a high-priority thread in the background which runs the callback on each new block of input audio. The thread is closed when the context manager exits.
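
Because the callback is an ordinary function, it can be exercised offline before it ever sees a sound card. The following is a minimal sketch, assuming the documented (indata, outdata, frames, time, status) callback signature and blocks shaped (frames, channels). The helpers make_gain_callback and run_offline are hypothetical names for illustration, not part of sounddevice.

```python
import numpy as np


def make_gain_callback(gain):
    """Build a callback that scales the input by a fixed gain (hypothetical helper)."""
    def callback(indata, outdata, frames, time, status):
        if status:
            print(status)  # a real stream reports underflow/overflow flags here
        outdata[:] = gain * indata
    return callback


def run_offline(callback, signal, blocksize):
    """Feed `signal` through `callback` block by block, roughly as a stream would."""
    out = np.zeros_like(signal)
    for start in range(0, len(signal), blocksize):
        block = slice(start, start + blocksize)
        frames = len(signal[block])
        # out[block] is a view, so the callback writes straight into `out`.
        # time and status are passed as None in this offline stand-in.
        callback(signal[block], out[block], frames, None, None)
    return out


signal = np.arange(10, dtype=np.float32).reshape(-1, 1)  # 10 frames, 1 channel
processed = run_offline(make_gain_callback(0.5), signal, blocksize=4)
```

The same callback object can afterwards be handed to Stream(callback=...) unchanged.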

Let’s dig a little deeper into how this works.

To open a stream in PortAudio, one needs to pass a streamCallback argument, which is a pointer to a PaStreamCallback function. Sounddevice uses cffi’s old-style callback mechanism to turn the Python callback into a <cdata> Python object that represents a C function-pointer.

from sounddevice import _ffi

@_ffi.callback('PaStreamCallback')
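def callback_ptr(c_indata, c_outdata, frames, time, status, _userdata):
    # convert_pointer_to_numpy_view stands in for sounddevice's internal conversion.
    indata = convert_pointer_to_numpy_view(c_indata, frames)
    outdata = convert_pointer_to_numpy_view(c_outdata, frames)
    return callback(indata, outdata, frames, time, status)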

This Python-wrapped-C-pointer is then passed to PortAudio’s Pa_OpenStream function via the cffi interface, which unwraps the Python object and passes the pointer at the C level.

from sounddevice import _lib

ptr = _ffi.new('PaStream**')
_lib.Pa_OpenStream(ptr, ..., callback_ptr, ...)
_lib.Pa_StartStream(ptr[0])
# Sleep and relax
_lib.Pa_StopStream(ptr[0])
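
The wrap-a-Python-function-as-a-C-pointer idea can be tried without any audio hardware. The sketch below uses ctypes from the standard library rather than cffi, and a simplified three-argument callback type instead of the real PaStreamCallback (which takes three more arguments). CALLBACK_TYPE, py_passthrough and simulate_block are hypothetical names for illustration only.

```python
import ctypes

# Simplified stand-in for PaStreamCallback (time info, status flags and
# user data are omitted on purpose).
CALLBACK_TYPE = ctypes.CFUNCTYPE(
    ctypes.c_int,                    # return code, 0 meaning "continue"
    ctypes.POINTER(ctypes.c_float),  # input samples
    ctypes.POINTER(ctypes.c_float),  # output samples
    ctypes.c_ulong,                  # number of frames
)


def py_passthrough(c_indata, c_outdata, frames):
    """Copy input to output, indexing the raw C pointers directly."""
    for i in range(frames):
        c_outdata[i] = c_indata[i]
    return 0


# A Python object that represents a C function pointer. Every call through
# it still re-enters the Python interpreter.
callback_ptr = CALLBACK_TYPE(py_passthrough)


def simulate_block(samples):
    """Play the role of the audio library: call the pointer on one block."""
    n = len(samples)
    inbuf = (ctypes.c_float * n)(*samples)
    outbuf = (ctypes.c_float * n)()
    status = callback_ptr(inbuf, outbuf, n)
    return status, list(outbuf)
```

Calling simulate_block([0.0, 0.5, -0.25]) goes through the C calling convention and back into Python, which is the round trip a wrapped callback makes for every block.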

Note that for each incoming block of audio, the callback switches to the Python level, wraps the C inputs in Python objects, and calls the Python callback. Along the way, the Python interpreter might allocate memory dynamically or run garbage collection. Depending on the block size and the amount of computation in the callback, the callback can then miss its deadline, which leads to buffer underruns or overflows and audible dropouts.
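
To make the deadline concrete: a callback has to finish within blocksize / samplerate seconds, so 512 frames at 48 kHz leave roughly 10.7 ms per block. The following sketch computes that budget and measures the worst-case run time of a function. The names block_budget_seconds and worst_case_seconds are hypothetical, and timings taken outside the audio thread are only a rough indication.

```python
import time


def block_budget_seconds(blocksize, samplerate):
    """Time available to process one block before the next one is due."""
    return blocksize / samplerate


def worst_case_seconds(func, repeats=100):
    """Run `func` repeatedly and return the slowest observed duration."""
    worst = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        worst = max(worst, time.perf_counter() - start)
    return worst


budget = block_budget_seconds(512, 48000)
worst = worst_case_seconds(lambda: sum(range(1000)))
print(f"budget {budget * 1e3:.2f} ms, worst case {worst * 1e3:.3f} ms")
```

The worst case matters more than the average here, since a single late block is enough to cause a glitch.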

Interestingly, we can avoid any execution of Python in the callback by compiling to machine code using Numba.

Numba is well known for its just-in-time compiler. Decorating a Python function with numba.jit compiles it to machine code on the first call, which often makes numerical Python code much faster.

from numba import jit

@jit
def mathy_fun(x, y):
    return x + y

mathy_fun(1, 2)  # first time compiles
mathy_fun(3, 4)  # second time calls machine code -> fast

This month I learned that Numba can also return a cffi-style pointer to the compiled function. And sounddevice._StreamBase accepts those!

import numpy as np
from numba import cfunc, carray
from numba.core.typing.cffi_utils import map_type
from sounddevice import _ffi

@cfunc(map_type(_ffi.typeof("PaStreamCallback*")))
def numba_callback(c_indata, c_outdata, frame_count, time_info, status_flags, user_data):
    # View the raw buffers as NumPy arrays (this shape assumes one channel of float32 samples).
    indata = carray(c_indata, frame_count, dtype=np.float32)
    outdata = carray(c_outdata, frame_count, dtype=np.float32)
    outdata[:] = indata
    return 0  # 0 is paContinue

Again, we must turn the C pointers into NumPy arrays. This time, however, the whole Python function is compiled and can be run inside the streaming thread without the Python interpreter.

from sounddevice import _StreamBase, sleep

with _StreamBase(
    "duplex",
    callback=numba_callback.cffi
) as stream:
    sleep(1000)  # sounddevice.sleep takes milliseconds

I am curious how far one can go with prototyping real-time DSP algorithms this way!