---
myst:
  html_meta:
    'description': 'Reference for the PCM audio format required by the ABR SDK ASR API, including sample rate, bit depth, channel count, and buffer sizing.'
    'keywords': 'asr, audio format, pcm, s16le, sample rate, mono, buffer'
---

# Input format

The ASR API accepts raw PCM (uncompressed audio) bytes. This page describes the required format
and how to size the audio buffers you pass to {py:meth}`~abr_sdk.asr.Asr.push`.

## Sizing audio buffers

{py:meth}`~abr_sdk.asr.Asr.push` accepts audio as chunks of raw PCM bytes. A simple approach is to
pick a fixed chunk duration, compute its byte length, use that length as the read size when pulling
from your audio source, and pass the resulting bytes to {py:meth}`~abr_sdk.asr.Asr.push`:

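```python
import sys

from abr_sdk.asr import Asr

SAMPLE_RATE = 16000  # Hz
BYTES_PER_SAMPLE = 2  # signed 16-bit samples
CHUNK_DURATION_S = 0.1  # 100 ms

CHUNK_BYTES = int(SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_DURATION_S)  # 3200

# LIBRARY_PATH and on_chunk are defined by your application.
with Asr(library_path=LIBRARY_PATH) as asr:
    # Read raw PCM from stdin until EOF; the final chunk may be shorter.
    while chunk := sys.stdin.buffer.read(CHUNK_BYTES):
        asr.push(chunk, on_chunk=on_chunk)
    asr.wait_for_completion()
```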

To compute the byte length for a given duration:

```text
bytes = sample_rate × bytes_per_sample × duration_in_seconds
      = 16000 × 2 × duration_in_seconds
```

For example, 100 milliseconds of audio at 16,000 Hz is 3,200 bytes. Smaller chunks reduce per-chunk
latency; larger chunks reduce the number of {py:meth}`~abr_sdk.asr.Asr.push` calls. The model does
not require a specific chunk size.
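
The formula can be wrapped in a small helper. The following is a minimal sketch: `chunk_bytes` is a
hypothetical name, not part of the SDK, and it rounds to a whole number of samples so that a chunk
never ends in the middle of a 2-byte sample.

```python
SAMPLE_RATE = 16000   # Hz, as required by the ASR API
BYTES_PER_SAMPLE = 2  # signed 16-bit samples


def chunk_bytes(duration_s: float) -> int:
    """Return the byte length of `duration_s` seconds of 16 kHz mono S16_LE audio.

    Hypothetical helper, not part of the SDK.
    """
    samples = round(SAMPLE_RATE * duration_s)  # whole samples only
    return samples * BYTES_PER_SAMPLE


print(chunk_bytes(0.1))   # 3200 (100 ms)
print(chunk_bytes(0.02))  # 640 (20 ms)
```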

## Required format

All PCM data passed to {py:meth}`~abr_sdk.asr.Asr.push` must conform to the following specification.

| Property    | Value                       |
| ----------- | --------------------------- |
| Encoding    | Signed 16-bit integer (S16) |
| Byte order  | Little-endian (LE)          |
| Channels    | Mono (1 channel)            |
| Sample rate | 16,000 Hz                   |
| Container   | None (raw bytes, no header) |

The format is commonly written as **S16_LE** or **s16le** in audio tool documentation. Each sample
is 2 bytes. One second of audio is 32,000 bytes (16,000 samples × 2 bytes per sample).
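
To make the byte layout concrete, the sketch below uses the standard-library `struct` module to
encode a few sample values as S16_LE. It is illustrative only and does not call the SDK.

```python
import struct

# Three sample values within the signed 16-bit range (-32768 to 32767).
samples = [0, 1, -2]

# "<" selects little-endian byte order; "h" is a signed 16-bit integer.
pcm = struct.pack(f"<{len(samples)}h", *samples)

print(len(pcm))   # 6 (3 samples, 2 bytes each)
print(pcm.hex())  # 00000100feff
```

Each sample occupies two bytes with the low-order byte first, which is why `1` is stored as
`01 00` and `-2` as `fe ff`.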

Audio must already be 16 kHz mono when it is passed to the SDK; the SDK currently does not resample.
If your audio source uses a different sample rate or channel layout, convert it before calling
{py:meth}`~abr_sdk.asr.Asr.push`. See {doc}`/asr/examples` for examples using `ffmpeg`, the `wave`
module, and `soundfile`.

:::{note}
The SDK does not validate the audio content or detect format mismatches. Passing audio in the wrong
format (wrong sample rate, wrong bit depth, stereo instead of mono) produces incorrect transcripts
without raising an error.
:::
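
Because mismatches are silent, it can help to check the format on your side before pushing. For WAV
input, the header can be inspected with the standard-library `wave` module. The following is a
minimal sketch: `read_pcm_checked` is a hypothetical helper, not part of the SDK, and it verifies
only the header fields, not the audio content.

```python
import wave


def read_pcm_checked(path: str) -> bytes:
    """Return the raw PCM frames of a WAV file, or raise ValueError if the
    header does not describe 16 kHz, mono, 16-bit audio.

    Hypothetical helper, not part of the SDK.
    """
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16000 Hz, got {wav.getframerate()} Hz")
        if wav.getnchannels() != 1:
            raise ValueError(f"expected 1 channel, got {wav.getnchannels()}")
        if wav.getsampwidth() != 2:
            raise ValueError(f"expected 2 bytes per sample, got {wav.getsampwidth()}")
        # readframes() returns the sample data without the file header.
        return wav.readframes(wav.getnframes())
```

The returned bytes can then be sliced into chunks and passed to `push()`. Raw PCM streams carry no
header, so this kind of check is only possible while the audio is still in a container format.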

## Specifying the sample rate

The {py:class}`~abr_sdk.asr.Asr` constructor accepts a `sample_rate` parameter. Setting it does not
currently make the SDK resample: audio passed to {py:meth}`~abr_sdk.asr.Asr.push` must already be at
16,000 Hz regardless of the `sample_rate` value. Audio at any other sample rate is not converted; it
produces incorrect transcripts without raising an error.

```python
from abr_sdk.asr import Asr

with Asr(library_path=LIBRARY_PATH, sample_rate=16000) as asr:
    ...
```

If `sample_rate` is omitted, the loaded library defaults to 16,000 Hz.

:::{admonition} Next steps
:class: hint

- {doc}`/asr/examples`: Full examples showing how to read audio from a microphone, a WAV file, and
  other sources.
- {doc}`/asr/overview`: Overview of the ASR API.
:::
