abr_sdk.tts

API for running TTS (text-to-speech) applications.

Author: Pawel Jaworski

Module Contents

Classes

Tts

Synthesize speech audio from input text.

Processor

Event loop driving a streaming TTS session.

API

class abr_sdk.tts.Tts(lib_or_path: str | pathlib.Path | abr_sdk.core.Library, *, lib_search_paths: list[str | pathlib.Path] | None = None, use_default_lib_search_paths: bool = True, resources_dir: str | pathlib.Path | None = None, logger: logging.Logger | None = None)

Bases: abr_sdk.core.Application

Synthesize speech audio from input text.

Push raw UTF-8 text bytes and receive synthesized mono PCM audio: signed 16-bit little-endian integer samples at 16 kHz.
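Because the output is headerless PCM in a fixed format, the sample values and the playback duration follow directly from the byte string. A minimal sketch using only the standard library; the helper names `decode_samples` and `duration_seconds` are illustrative and not part of abr_sdk:

```python
import struct

SAMPLE_RATE_HZ = 16_000  # output rate stated above
BYTES_PER_SAMPLE = 2     # one signed 16-bit integer per mono sample


def decode_samples(pcm: bytes) -> list[int]:
    """Unpack little-endian signed 16-bit samples from raw PCM bytes."""
    count = len(pcm) // BYTES_PER_SAMPLE
    # "<" selects little-endian, "h" a signed 16-bit integer.
    return list(struct.unpack(f"<{count}h", pcm[: count * BYTES_PER_SAMPLE]))


def duration_seconds(pcm: bytes) -> float:
    """Playback duration of a mono, 16-bit, 16 kHz PCM buffer."""
    return len(pcm) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
```

One second of audio therefore occupies 32,000 bytes.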

Streaming usage::

from abr_sdk.tts import Tts

chunks: list[bytes] = []
with Tts("libabr-tts.so") as tts:
    tts.push(b"Hello world.", on_pcm=chunks.append)
    tts.wait_for_completion()
pcm = b"".join(chunks)
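The joined `pcm` bytes carry no header, so most players will not open them directly. One way to make them playable is to wrap them in a WAV container with the standard-library `wave` module. The helper name `pcm_to_wav_bytes` is an assumption for illustration, not an SDK function:

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate_hz: int = 16_000) -> bytes:
    """Wrap raw mono 16-bit little-endian PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)               # mono
        wav.setsampwidth(2)               # 16-bit samples
        wav.setframerate(sample_rate_hz)  # 16 kHz by default
        wav.writeframes(pcm)
    return buf.getvalue()
```

The result can be written to disk with `pathlib.Path("out.wav").write_bytes(pcm_to_wav_bytes(pcm))`.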

Initialization

Create a new object instance wrapping the given handle and ABI instance.

Arguments

cabi: CABI instance providing access to the low-level C functions. A C ABI object may be obtained by loading an ABR SDK shared library.

handle: Pointer to the ABR SDK object that should be wrapped by the new Handle instance. Must be non-None unless handle_may_be_none is True.

handle_may_be_none: If True, handle may be None. Used for access to the library metadata.

class_: String containing the expected object class. If not None, the string is compared against the "class" property of the handle.

text_input_buffer: abr_sdk.core.Buffer

None

FIFO byte queue that receives the raw UTF-8 text pushed into the model.

pcm_output_buffer: abr_sdk.core.Buffer

None

FIFO byte queue that emits the synthesized PCM audio as the model produces it.

__enter__() -> abr_sdk.tts.Tts

push(data: bytes, *, on_pcm: collections.abc.Callable[[bytes], None] | None = None, output_poll_timeout_ms: int = 0) -> None

Push UTF-8 text bytes into the TTS pipeline.

Advances the event loop until all input has been accepted by the pipeline. PCM bytes produced while advancing are delivered to on_pcm.
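Since each call returns once its input has been accepted, text can be fed incrementally, for example one sentence at a time. The sketch below assumes only the `push` and `wait_for_completion` signatures documented on this page; `TtsLike` and `synthesize` are hypothetical names introduced here, and the function accepts any object exposing those two methods:

```python
from __future__ import annotations

from collections.abc import Callable, Iterable
from typing import Protocol


class TtsLike(Protocol):
    """Structural type covering only the two methods used below."""

    def push(self, data: bytes, *,
             on_pcm: Callable[[bytes], None] | None = None) -> None: ...

    def wait_for_completion(self) -> None: ...


def synthesize(tts: TtsLike, sentences: Iterable[str]) -> bytes:
    """Push each sentence as UTF-8 and collect the PCM passed to on_pcm."""
    chunks: list[bytes] = []
    for sentence in sentences:
        tts.push(sentence.encode("utf-8"), on_pcm=chunks.append)
    # Flush the pipeline so trailing audio is delivered before joining.
    tts.wait_for_completion()
    return b"".join(chunks)
```

With a real instance this might be called as `synthesize(tts, ["Hello.", "Goodbye."])` inside a `with Tts(...) as tts:` block.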

_warn_if_not_preprocessed(data: bytes) -> None

Warn once if pushed text was not run through tts_preprocess.

Without preprocessing, raw digits and symbols reach the phoneme dictionary, where they are mangled or dropped. The check is best-effort: it emits at most one warning per instance, and writes it to stderr so that it never corrupts PCM written to stdout.
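The SDK's actual detection logic is not shown on this page. As an illustration of the warn-once pattern described above, a caller-side guard could look like the following. The class `PreprocessGuard`, the character set in `_SUSPICIOUS`, and the warning text are all assumptions made for this sketch:

```python
import re
import sys

# Assumption: digits and these symbols indicate text that was not normalized.
_SUSPICIOUS = re.compile(r"[0-9#$%&*+/<=>@\\^_|~]")


class PreprocessGuard:
    """Hypothetical helper: warn at most once when text looks unnormalized."""

    def __init__(self) -> None:
        self._warned = False

    def check(self, data: bytes) -> bool:
        """Return True if data contains raw digits or symbols."""
        text = data.decode("utf-8", errors="replace")
        suspicious = _SUSPICIOUS.search(text) is not None
        if suspicious and not self._warned:
            self._warned = True
            # stderr keeps the message out of any PCM stream on stdout.
            print("warning: text may need preprocessing", file=sys.stderr)
        return suspicious
```

Keeping the flag on the instance, rather than at module level, gives each guard its own single warning.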

wait_for_completion() -> None

Block until the TTS pipeline finishes synthesizing all queued text.

Marks the end of input so the pipeline flushes its networks, then drives the event loop until the application reports idle, meaning every neural network (NN) has consumed its input and produced no further output.

close() -> None

Close the TTS instance and release any active processor.

class abr_sdk.tts.Processor(tts: abr_sdk.tts.Tts, on_pcm: collections.abc.Callable[[bytes], None] | None = None)

Event loop driving a streaming TTS session.

Initialization

process_and_wait_for_output(data: bytes | None, timeout_ms: int, drain_to_idle: bool) -> None

Push data into TTS and advance the event loop until input is drained.

When drain_to_idle is true, polling continues past input exhaustion until the application-idle event fires, meaning every neural network has finished consuming its input and produces no further PCM.
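The two stopping conditions can be illustrated with a toy model. `ToyPipeline` and `process` below are stand-ins written for this sketch and do not reflect the SDK's internals: the toy accepts one input byte per step and emits the output for the previously accepted byte, so some output is always still in flight when the input runs out.

```python
from __future__ import annotations

from collections import deque
from collections.abc import Callable


class ToyPipeline:
    """Toy stand-in (not the SDK) with one step of output latency."""

    def __init__(self) -> None:
        self.inbox: deque[int] = deque()    # input not yet accepted
        self.working: deque[int] = deque()  # accepted, output pending

    def step(self) -> bytes:
        out = bytes([self.working.popleft()] * 2) if self.working else b""
        if self.inbox:
            self.working.append(self.inbox.popleft())
        return out

    @property
    def idle(self) -> bool:
        return not self.inbox and not self.working


def process(pipe: ToyPipeline, data: bytes | None, drain_to_idle: bool,
            on_pcm: Callable[[bytes], None]) -> None:
    """Advance until input is accepted; optionally continue until idle."""
    if data:
        pipe.inbox.extend(data)
    while pipe.inbox or (drain_to_idle and not pipe.idle):
        out = pipe.step()
        if out:
            on_pcm(out)
```

Calling `process` with `drain_to_idle=False` returns while output for the last byte is still pending; a second call with `data=None` and `drain_to_idle=True` delivers the remainder.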

close() -> None

Release the underlying event set; safe to call multiple times.

__enter__() -> abr_sdk.tts.Processor

__exit__(*exc: Any) -> None