abr_sdk.tts
API for running TTS (text-to-speech) applications.
Author: Pawel Jaworski
Module Contents
Classes
API
- class abr_sdk.tts.Tts(lib_or_path: str | pathlib.Path | abr_sdk.core.Library, *, lib_search_paths: list[str | pathlib.Path] | None = None, use_default_lib_search_paths: bool = True, resources_dir: str | pathlib.Path | None = None, logger: logging.Logger | None = None)
Bases: abr_sdk.core.Application
Synthesize speech audio from input text.
Push raw UTF-8 text bytes and receive synthesized mono PCM audio, one signed 16-bit integer per sample in little-endian order at 16 kHz.
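Because the output is headerless PCM, most players need it wrapped in a container first. The following is a minimal sketch that uses only Python's standard `wave` module and the format stated above (mono, signed 16-bit little-endian, 16 kHz). The helper names `pcm_to_wav` and `duration_seconds` are hypothetical and are not part of `abr_sdk`.

```python
import io
import wave

# Format constants taken from the description above.
SAMPLE_RATE_HZ = 16_000
SAMPLE_WIDTH_BYTES = 2  # signed 16-bit samples
CHANNELS = 1            # mono


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container (hypothetical helper, not part of abr_sdk)."""
    if len(pcm) % SAMPLE_WIDTH_BYTES:
        raise ValueError("PCM length must be a multiple of 2 bytes")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)  # WAV stores 16-bit samples little-endian, so no byte swap
    return buf.getvalue()


def duration_seconds(pcm: bytes) -> float:
    """Playback length implied by the byte count (hypothetical helper)."""
    return len(pcm) / (SAMPLE_WIDTH_BYTES * CHANNELS * SAMPLE_RATE_HZ)
```

For example, 32,000 bytes of PCM correspond to one second of audio at this format.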
Streaming usage::

    chunks: list[bytes] = []
    with Tts("libabr-tts.so") as tts:
        tts.push(b"Hello world.", on_pcm=chunks.append)
        tts.wait_for_completion()
    pcm = b"".join(chunks)

Initialization
Create a new object instance wrapping the given handle and ABI instance.
Arguments
cabi: CABI instance providing access to the low-level C functions. A C ABI object may be obtained by loading an ABR SDK shared library.
handle: Pointer to the ABR SDK object that should be wrapped by the new Handle instance. Must be non-None unless handle_may_be_none is True.
handle_may_be_none: If True, handle may be None. Used for access to the library metadata.
class_: String containing the expected object class. If not set to None, then the given string is compared to the “class” property of the handle.
- text_input_buffer: abr_sdk.core.Buffer
None
FIFO byte queue that receives the raw UTF-8 text pushed into the model.
- pcm_output_buffer: abr_sdk.core.Buffer
None
FIFO byte queue that emits the synthesized PCM audio as the model produces it.
- __enter__() → abr_sdk.tts.Tts
- push(data: bytes, *, on_pcm: collections.abc.Callable[[bytes], None] | None = None, output_poll_timeout_ms: int = 0) → None
Push UTF-8 text bytes into the TTS pipeline.
Advances the event loop until all input has been accepted by the pipeline. PCM bytes produced while advancing are delivered to on_pcm.
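Since `on_pcm` accepts any callable that takes a `bytes` chunk, a small stateful collector can stand in for `chunks.append`. The sketch below is illustrative only: `PcmCollector` and `synthesize` are hypothetical names, and `tts` is duck-typed so that any object exposing `push()` and `wait_for_completion()` as documented here will do. It assumes, as the streaming usage example in the class description suggests, that PCM produced during `wait_for_completion()` still reaches the callback given to `push()`.

```python
class PcmCollector:
    """Accumulates chunks handed to on_pcm (hypothetical helper, not part of abr_sdk)."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self.chunks = 0  # number of callback invocations seen so far

    def __call__(self, chunk: bytes) -> None:
        self._buf += chunk
        self.chunks += 1

    @property
    def pcm(self) -> bytes:
        return bytes(self._buf)


def synthesize(tts, sentences):
    """Push each sentence as UTF-8 bytes, then flush the pipeline.

    `tts` is assumed to behave like abr_sdk.tts.Tts: push(data, on_pcm=...)
    followed by wait_for_completion().
    """
    collector = PcmCollector()
    for sentence in sentences:
        tts.push(sentence.encode("utf-8"), on_pcm=collector)
    tts.wait_for_completion()
    return collector.pcm
```

With the real SDK this might be called inside `with Tts("libabr-tts.so") as tts:`; reusing one callback across several `push()` calls is an assumption, not documented behavior.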
- _warn_if_not_preprocessed(data: bytes) → None
Warn once if pushed text was not run through tts_preprocess.
Without preprocessing, raw digits or symbols reach the phoneme dictionary and get mangled or dropped. Best-effort: one warning per instance, on stderr so it never corrupts PCM written to stdout.
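The warn-once behavior described above can be pictured roughly as follows. This is a sketch, not the SDK's implementation: the `WarnOnce` class is hypothetical, and the detection rule (any ASCII digit or a handful of symbols) is an assumed heuristic, since the actual check is not documented here.

```python
import re
import sys

# Assumed heuristic: an ASCII digit or one of these symbols suggests the text
# skipped normalization. The SDK's real rule may differ.
_SUSPECT = re.compile(rb"[0-9$%&@#+=<>]")


class WarnOnce:
    """Emit at most one warning per instance, always on stderr (hypothetical)."""

    def __init__(self) -> None:
        self._warned = False

    def check(self, data: bytes) -> bool:
        """Return True if this call printed the warning."""
        if self._warned or not _SUSPECT.search(data):
            return False
        self._warned = True
        # stderr keeps the message out of any PCM stream written to stdout.
        print(
            "warning: text looks unpreprocessed; digits or symbols may be "
            "mangled or dropped",
            file=sys.stderr,
        )
        return True
```

Keeping the flag on the instance rather than at module level matches the "one warning per instance" wording.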
- wait_for_completion() → None
Block until the TTS pipeline finishes synthesizing all queued text.
Marks the end of input so the pipeline flushes its networks, then drives the event loop until the application reports idle (every NN has consumed its input and produced nothing more).
- close() → None
Close the TTS instance and release any active processor.
- class abr_sdk.tts.Processor(tts: abr_sdk.tts.Tts, on_pcm: collections.abc.Callable[[bytes], None] | None = None)
Event loop driving a streaming TTS session.
Initialization
- process_and_wait_for_output(data: bytes | None, timeout_ms: int, drain_to_idle: bool) → None
Push data into TTS and advance the event loop until input is drained.
When drain_to_idle is true, keeps polling past input exhaustion until the application-idle event fires (all NNs finished consuming and produced no further PCM).
- close() → None
Release the underlying event set; safe to call multiple times.
- __enter__() → abr_sdk.tts.Processor
- __exit__(*exc: Any) → None