Benchmarking#

process() is a blocking call that takes a complete audio clip and returns the final transcript. It is suited for benchmarking throughput and accuracy against a fixed audio dataset.

For real-time or streaming use, see Overview.

Asr.process()#

Asr.process(data: bytes) -> AsrTranscript

Pushes all of data through the ASR pipeline, waits for the neural network to finish, and returns the transcript. The call blocks until the entire input has been processed; it does not return partial results.

process() can be called multiple times on the same Asr instance. Call reset() between clips: without it, audio state from one clip carries over into the next and can reduce accuracy.

Parameters#

| Parameter | Type | Description |
| --- | --- | --- |
| data | bytes | PCM audio encoded as a little-endian 16-bit byte array. The sample rate must match the value configured on the Asr instance at construction time. See Input format for format requirements. |
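If a benchmark dataset is stored as WAV files rather than raw .pcm files, the sample data can be extracted with Python's standard wave module, which returns 16-bit frames as little-endian bytes. The following is a minimal sketch; load_pcm16 is a hypothetical helper and not part of abr_sdk, and the expected sample rate and channel count are values you supply to match your own Asr configuration and the Input format requirements.

```python
import wave


def load_pcm16(path, expected_rate, expected_channels):
    """Return the raw PCM frames of a WAV file as bytes.

    Hypothetical helper (not part of abr_sdk). It rejects files whose
    sample width, sample rate, or channel count differ from what the
    caller says the Asr instance expects.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz, got {wav.getframerate()} Hz")
        if wav.getnchannels() != expected_channels:
            raise ValueError(f"expected {expected_channels} channel(s), got {wav.getnchannels()}")
        # 16-bit WAV frames are stored as signed little-endian integers.
        return wav.readframes(wav.getnframes())
```

The returned bytes object can then be passed as the data argument. The helper performs no resampling; a file in the wrong format raises ValueError instead of being converted.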

Return value#

Returns an AsrTranscript. Read its .text attribute to get the final transcript string.
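For accuracy benchmarking, the returned text is typically compared with a reference transcript. One common metric is word error rate (WER): the word-level edit distance divided by the number of reference words. The sketch below is not part of abr_sdk; word_error_rate is a hypothetical helper, and lowercasing and whitespace splitting are assumed normalization choices that you may need to adapt (for example, to strip punctuation).

```python
def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance divided by the reference length."""
    # Assumed normalization: case-insensitive, split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Given a transcript returned by process(), a call would look like word_error_rate(reference_text, transcript.text). Because insertions count as errors, the result can exceed 1.0 when the hypothesis is much longer than the reference.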

Example#

from abr_sdk.asr import Asr

LIBRARY_PATH = "/path/to/niagara-38m-live.en/libniagara_38m_live.so"

with open("clip1.pcm", "rb") as f:
    clip1 = f.read()

with open("clip2.pcm", "rb") as f:
    clip2 = f.read()

with Asr(library_path=LIBRARY_PATH) as asr:
    transcript1 = asr.process(clip1)
    print(transcript1.text)

    asr.reset()

    transcript2 = asr.process(clip2)
    print(transcript2.text)

Each .pcm file must contain raw PCM audio in the format the Asr instance expects. See Input format for details on sample rate, bit depth, and channel layout.
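To turn the pattern above into a throughput measurement, one option is to time each process() call and compare the total compute time with the total audio duration, giving a real-time factor (values below 1.0 mean faster than real time). The sketch below assumes only the process(), reset(), and .text members described on this page; benchmark is a hypothetical helper, and sample_rate and channels are values you supply to match your Asr configuration.

```python
import time


def benchmark(asr, clips, sample_rate, channels):
    """Transcribe each clip and return (transcripts, real_time_factor).

    `asr` is any object exposing process(data) and reset(), such as an
    open Asr instance. `clips` is a list of 16-bit PCM byte strings.
    """
    bytes_per_second = sample_rate * channels * 2  # 16-bit samples are 2 bytes each
    transcripts = []
    audio_seconds = 0.0
    compute_seconds = 0.0

    for index, clip in enumerate(clips):
        if index > 0:
            asr.reset()  # clear state carried over from the previous clip
        start = time.perf_counter()
        transcript = asr.process(clip)
        compute_seconds += time.perf_counter() - start
        audio_seconds += len(clip) / bytes_per_second
        transcripts.append(transcript.text)

    if audio_seconds == 0:
        raise ValueError("no audio was processed")
    return transcripts, compute_seconds / audio_seconds
```

Only the process() call is timed, so file loading and reset() do not affect the result. Timing reset() as well is a reasonable alternative if you want the figure to reflect the full per-clip cost.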

Errors#

| Exception | When raised |
| --- | --- |
| RuntimeError | Called on a closed Asr instance, for example after the with block has exited or after asr.close() was called explicitly. |
| abr_sdk.core.ApplicationError | The underlying C library returned a failure status. Check the log output for details. |
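In a long benchmark run it can be useful to record a failed clip and continue rather than abort. The sketch below shows one way to do that. transcribe_all is a hypothetical helper, and the exception type is passed in as sdk_error so the function itself does not depend on the SDK. Whether an Asr instance remains usable after a library failure is an assumption here and should be verified against the log output.

```python
def transcribe_all(asr, clips, sdk_error):
    """Process each clip and return a list of (index, text, error) tuples.

    Failures of type `sdk_error` are recorded per clip. A RuntimeError
    from a closed instance is deliberately not caught (unless it is an
    instance of `sdk_error`), because no later clip could succeed either.
    """
    results = []
    for index, clip in enumerate(clips):
        if index > 0:
            asr.reset()  # start every clip from a clean state
        try:
            results.append((index, asr.process(clip).text, None))
        except sdk_error as exc:
            # Assumption: the instance can still be used after this failure.
            results.append((index, None, str(exc)))
    return results
```

With the SDK, the exception documented above would be passed in, for example transcribe_all(asr, clips, sdk_error=ApplicationError) after importing ApplicationError from abr_sdk.core.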