ABR SDK#

The ABR SDK is a Python library for on-device automatic speech recognition (ASR) and text-to-speech (TTS). You download an application package for your target platform, extract it, and start processing speech.

What the SDK does#

The SDK gives you two speech capabilities:

  • Automatic speech recognition (ASR): transcribe audio to text in real time, on your device.

  • Text-to-speech (TTS): synthesize speech from text, on your device.

Both run entirely on your device. No audio or transcript data is sent to an external service during processing.

The technology behind it#

ABR’s speech models are built on state-space models (SSMs). The Legendre Memory Unit (LMU), introduced by ABR in 2019, was one of the first instances of the SSM architecture.

SSMs process input one step at a time and maintain a fixed-size internal state. Unlike transformer models, whose self-attention cost grows quadratically with input length, an SSM's per-step compute and memory stay constant, so total cost grows linearly with input length. SSMs require no buffering, no look-ahead, and no batch accumulation before producing output.
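To make the fixed-size state concrete, here is a minimal sketch of a generic discrete-time linear state-space recurrence in plain Python. It is not ABR's model or the LMU itself: the matrices A, B, and C, their toy values, and the function names are illustrative assumptions.

```python
from typing import Iterable, Iterator, List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def ssm_stream(A: Matrix, B: Vector, C: Vector,
               inputs: Iterable[float]) -> Iterator[float]:
    """Yield one output per input: x[t] = A x[t-1] + B u[t], y[t] = C . x[t]."""
    state = [0.0] * len(B)  # fixed-size state, independent of input length
    for u in inputs:
        ax = matvec(A, state)
        state = [a + b * u for a, b in zip(ax, B)]
        # The output is available as soon as the current sample is consumed.
        yield sum(c * x for c, x in zip(C, state))


# Toy 2-dimensional system (illustrative values only).
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]

outputs = list(ssm_stream(A, B, C, [1.0, 0.0, 0.0]))
print(outputs)  # [2.0, 0.75, 0.3125]
```

However long the input stream is, the only memory carried between steps is the state vector, and each output is produced without waiting for later samples.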

These properties make SSM-based models practical on devices with limited compute and memory. You do not need to understand SSMs to use the SDK. The difference from transformer models matters mainly if you are evaluating whether ABR’s models fit your device’s hardware budget.

How the SDK works#

The SDK is a thin Python layer over a stable C application binary interface (ABI). Neural network execution and platform-specific optimizations happen inside a compiled library you download separately, called an application package.
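The layering pattern can be sketched with Python's standard ctypes module. This is only an illustration of a thin Python layer over a C ABI: it wraps strlen from the C standard library as a stand-in for a compiled speech library, and the class name CLengthCounter is hypothetical. It assumes a POSIX system. The SDK's actual bindings are not shown on this page and may be built differently.

```python
import ctypes
import ctypes.util


class CLengthCounter:
    """Hypothetical thin wrapper: Python handles types, C does the work."""

    def __init__(self) -> None:
        # find_library may return None; CDLL(None) then exposes the symbols
        # already loaded into the current process (POSIX only).
        self._lib = ctypes.CDLL(ctypes.util.find_library("c"))
        # Declaring the C signature keeps the call consistent with the ABI.
        self._lib.strlen.argtypes = [ctypes.c_char_p]
        self._lib.strlen.restype = ctypes.c_size_t

    def length(self, text: str) -> int:
        """Encode the Python string and delegate to the compiled function."""
        return int(self._lib.strlen(text.encode("utf-8")))


counter = CLengthCounter()
n = counter.length("speech")
print(n)  # 6
```

The Python side only converts arguments and results. Because the C function's signature is fixed, the compiled library behind it can be swapped per platform without changing the Python code.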

Application packages#

An application package is a .tar.gz archive you download from the ABR dev portal. It contains:

  • The compiled .so shared library for your target platform

  • The neural network weights

  • Supporting configuration files

You extract the package to a directory on your device. When you create an Asr or Tts object, you pass the path to that directory. The SDK loads the library and weights. You do not select a hardware backend or configure execution targets. The application package you downloaded already contains the right code for the platform you chose.
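The extraction step can be sketched with the standard library's tarfile module. The function name extract_package and the paths in the usage comment are hypothetical. The page does not document the package's internal layout, so the sketch only looks for .so files after extraction.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_package(archive: Path, dest: Path) -> List[Path]:
    """Extract a .tar.gz archive into dest and return the .so files found."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode="r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can reject absolute paths and traversal.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return sorted(dest.rglob("*.so"))


# Hypothetical usage; substitute the archive you downloaded:
# libs = extract_package(Path("asr-package.tar.gz"), Path("/opt/abr/asr"))
# The extracted directory is the path you pass when creating an Asr or Tts object.
```

Checking that at least one shared library was found is a cheap way to catch a truncated download or a wrong destination before the SDK tries to load the package.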

ABR provides application packages for Linux x86-64, Linux ARM64, and Android ARM64.
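If you are unsure which package matches a device, a small check like the following can help. It is a sketch: the function name package_target is hypothetical, the returned labels simply repeat the platform names listed above, and the machine strings are the values Python commonly reports for these architectures.

```python
import platform
import sys
from typing import Optional


def package_target(system: str, machine: str,
                   is_android: bool = False) -> Optional[str]:
    """Map platform identifiers to one of the listed package targets."""
    machine = machine.lower()
    if system != "Linux" and not is_android:
        return None  # no package listed for this operating system
    if machine in ("aarch64", "arm64"):
        return "Android ARM64" if is_android else "Linux ARM64"
    if machine in ("x86_64", "amd64") and not is_android:
        return "Linux x86-64"
    return None  # unsupported architecture


# sys.getandroidapilevel exists only on Android builds of Python.
current = package_target(
    platform.system(),
    platform.machine(),
    hasattr(sys, "getandroidapilevel"),
)
print(current or "No listed package matches this platform")
```

Run this on the target device, not on your development machine, since the package must match where the SDK will execute.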

On-device inference#

All inference runs locally:

  • No round-trips. Transcription latency depends on the model and your hardware, not network conditions.

  • No data leaves the device. Audio is processed locally, and transcripts are produced and kept locally.

  • No internet required at runtime. The only outbound connection is the initial download of the application package from the dev portal.

Who this documentation is for#

This documentation assumes you are a Python developer integrating speech AI into a device or application.

If you want to transcribe audio on your device, start with Installation and ASR quickstart. The quickstart walks from installation to your first transcript.

If you want to synthesize speech on your device, start with TTS quickstart. The quickstart walks from installation to your first synthesized audio.

If you are evaluating whether ABR fits your use case, read the ASR Overview or the TTS Overview, depending on which capability interests you. For ASR, Transcription stages covers the incremental chunk model, including FAST mode, ACCURATE mode, and the post-processing pass. It is worth reading before you choose ABR for latency-sensitive or display applications. For TTS, Input format covers text preprocessing requirements, including how numbers, abbreviations, and SSML markup are handled.