Benchmarking#
process() is a blocking call that takes a complete audio clip and returns
the final transcript. It is suited for benchmarking throughput and accuracy against a fixed audio
dataset.
For real-time or streaming use, see Overview.
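One way to express throughput for a blocking call like this is the real-time factor: processing time divided by audio duration. The sketch below is illustrative and not part of the SDK. `real_time_factor` and its `transcribe` argument are hypothetical names, and the caller supplies the sample rate and channel count because this page does not state them.

```python
import time
from typing import Callable, Iterable

BYTES_PER_SAMPLE = 2  # 16-bit PCM, as described for the data parameter


def real_time_factor(
    transcribe: Callable[[bytes], object],  # hypothetical: any blocking clip -> transcript call
    clips: Iterable[bytes],
    sample_rate_hz: int,  # must be the rate your audio actually uses
    channels: int,        # must be the channel count your audio actually uses
) -> float:
    """Return total processing time divided by total audio duration."""
    audio_seconds = 0.0
    processing_seconds = 0.0
    for clip in clips:
        # Duration follows from the byte count of raw PCM.
        audio_seconds += len(clip) / (BYTES_PER_SAMPLE * channels * sample_rate_hz)
        start = time.perf_counter()
        transcribe(clip)
        processing_seconds += time.perf_counter() - start
    if audio_seconds == 0:
        raise ValueError("no audio to benchmark")
    return processing_seconds / audio_seconds
```

A value below 1.0 means the clips were processed faster than their playback duration. With a real instance, `transcribe` could be `asr.process` or a small wrapper around it.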
Asr.process()#
Asr.process(data: bytes) -> AsrTranscript
Pushes all of data through the ASR pipeline, waits for the neural network to finish, and returns
the transcript. The call blocks until the entire input has been processed; it does not return
partial results.
process() can be called multiple times on the same
Asr instance. Call reset() between
clips: without it, audio state from one clip carries over into the next and can reduce accuracy.
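When looping over a dataset, the reset can be folded into the loop so it is never forgotten. The helper below is a minimal sketch and not part of the SDK. `transcribe_clips` is a hypothetical name, and it relies only on the `process()`, `reset()`, and `.text` members described on this page.

```python
from typing import Iterable, List


def transcribe_clips(asr, clips: Iterable[bytes]) -> List[str]:
    """Transcribe clips one by one, calling reset() between consecutive clips."""
    texts = []
    for index, clip in enumerate(clips):
        if index > 0:
            # Drop audio state left over from the previous clip.
            asr.reset()
        texts.append(asr.process(clip).text)
    return texts
```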
Parameters#
| Parameter | Type | Description |
|---|---|---|
| data | bytes | PCM audio encoded as a little-endian 16-bit byte array. The sample rate must match the value configured on the Asr instance. |
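If the benchmark audio is stored as WAV rather than raw PCM, the frames can be extracted with Python's standard `wave` module, which returns 16-bit samples in little-endian order. The sketch below is an illustration and not part of the SDK. `read_pcm16` is a hypothetical helper, and `expected_rate_hz` must be the rate your setup is configured for. It does not check channel layout.

```python
import wave


def read_pcm16(path: str, expected_rate_hz: int) -> bytes:
    """Return the raw PCM frames of a 16-bit WAV file after validating its format."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(
                f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit"
            )
        if wav.getframerate() != expected_rate_hz:
            raise ValueError(
                f"expected {expected_rate_hz} Hz, got {wav.getframerate()} Hz"
            )
        # readframes() returns the sample data without the WAV header.
        return wav.readframes(wav.getnframes())
```

The returned bytes can then be passed as `data`.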
Return value#
Returns an AsrTranscript. Read its .text attribute to get the final string.
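Because the transcript text is a plain string, accuracy against a reference transcript can be scored outside the SDK. Word error rate is one common metric. The function below is a minimal sketch, not an SDK feature: `word_error_rate` is a hypothetical name, and the lowercasing and whitespace tokenization are simplifying assumptions that a real benchmark may want to replace with its own text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: case-insensitive comparison, whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # previous[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For a whole dataset, summing edit counts and reference lengths across clips before dividing usually gives a more representative figure than averaging per-clip rates.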
Example#
from abr_sdk.asr import Asr

LIBRARY_PATH = "/path/to/niagara-38m-live.en/libniagara_38m_live.so"

# Load two raw PCM clips into memory.
with open("clip1.pcm", "rb") as f:
    clip1 = f.read()
with open("clip2.pcm", "rb") as f:
    clip2 = f.read()

with Asr(library_path=LIBRARY_PATH) as asr:
    transcript1 = asr.process(clip1)
    print(transcript1.text)

    # Clear audio state so clip1 does not carry over into clip2.
    asr.reset()

    transcript2 = asr.process(clip2)
    print(transcript2.text)
Each .pcm file must contain raw PCM audio in the format the Asr instance expects. See
Input format for details on sample rate, bit depth, and channel layout.
Errors#
| Exception | When raised |
|---|---|
| | Called on a closed Asr instance. |
| | The underlying C library returned a failure status. Check the log output for details. |