Usage
=====

Installation
############

.. code-block:: bash

    pip install amazon-polly-streaming

Requirements
############

* Python 3.13+
* AWS credentials in the default chain (env vars, profile, or IAM role) with
  ``polly:StartSpeechSynthesisStream`` permission
* A region supporting Polly bidirectional streaming (``us-east-1``,
  ``us-west-2``, ``eu-central-1``, ``eu-west-2``, ``ap-southeast-1``,
  ``ca-central-1`` as of 2026-05)

The ``polly:StartSpeechSynthesisStream`` IAM action is distinct from the
classic ``polly:SynthesizeSpeech``: a role granting only the classic action
cannot call the bidirectional endpoint and will fail with
``ValidationException``.

Basic usage
###########

Instantiate the client once for a region and call
``start_speech_synthesis_stream`` per utterance. The method is an async
generator yielding audio bytes as Polly emits them, with no need to wait for
the full audio to be generated server-side.

.. code-block:: python

    import asyncio

    from amazon_polly_streaming import PollyStreamingClient


    async def main() -> None:
        client = PollyStreamingClient(region="eu-central-1")
        audio = b""
        async for chunk in client.start_speech_synthesis_stream(
            text="hello world, how are you today",
            voice_id="Matthew",
        ):
            audio += chunk
        with open("hello.mp3", "wb") as fh:
            fh.write(audio)


    asyncio.run(main())

Configuration
#############

``start_speech_synthesis_stream`` accepts the following keyword arguments:

* ``text`` (required): the text to synthesize
* ``voice_id`` (required): a Polly voice id that supports generative
  bidirectional streaming (e.g.
  ``"Matthew"``, ``"Joanna"``, ``"Bianca"``); see the
  `Amazon Polly generative voices `_ page for the current list
* ``engine``: defaults to ``"generative"`` (the only value supported by the
  bidirectional streaming API at the time of writing)
* ``language_code``: BCP-47 code, defaults to ``"en-US"``; required only for
  bilingual voices
* ``output_format``: ``"mp3"`` (default), ``"pcm"``, or ``"ogg_vorbis"``
* ``sample_rate``: a string with the sample rate in Hz, defaults to
  ``"24000"``
* ``use_pool``: defaults to ``True``; see `Connection pool`_

Region is bound at client construction time, not per call. To switch regions,
instantiate a new client.

.. code-block:: python

    eu = PollyStreamingClient(region="eu-central-1")
    us = PollyStreamingClient(region="us-west-2")

Error handling
##############

Service-side errors are surfaced as typed exceptions in
``amazon_polly_streaming.exceptions``, mirroring the Polly API exception types
documented for ``StartSpeechSynthesisStream``:

* ``ServiceException``: base class for all service errors
* ``ServiceFailureException``: an unexpected Polly service failure
* ``ValidationException``: invalid input (e.g. unsupported voice id for
  bidirectional streaming)
* ``ServiceQuotaExceededException``: account quota hit
* ``ThrottlingException``: request rate exceeds the service throttle

All concrete exceptions inherit from ``ServiceException``, so a single
``except`` clause covers them all:

.. code-block:: python

    import asyncio

    from amazon_polly_streaming import PollyStreamingClient, ServiceException

    client = PollyStreamingClient(region="eu-central-1")


    async def main() -> None:
        try:
            async for chunk in client.start_speech_synthesis_stream(
                text="hello", voice_id="Matthew"
            ):
                ...
        except ServiceException as exc:
            # `type(exc).__name__` carries the specific Polly exception type
            ...


    asyncio.run(main())

Catch a specific type for targeted handling, e.g. backoff on throttling:

.. code-block:: python

    import asyncio

    from amazon_polly_streaming import PollyStreamingClient, ThrottlingException

    client = PollyStreamingClient(region="eu-central-1")


    async def main(max_attempts: int = 3) -> None:
        for attempt in range(max_attempts):
            try:
                async for chunk in client.start_speech_synthesis_stream(
                    text="hello", voice_id="Matthew"
                ):
                    ...
                break
            except ThrottlingException:
                if attempt == max_attempts - 1:
                    raise  # out of retries: surface the throttle to the caller
                await asyncio.sleep(2**attempt)  # back off 1 s, then 2 s


    asyncio.run(main())

Transport failures (HTTP non-2xx, TLS or HTTP/2 negotiation, missing
credentials) are raised as ``RuntimeError`` and are distinct from
``ServiceException``.

Connection pool
###############

Each ``PollyStreamingClient`` instance owns an HTTP/2 connection pool that
reuses connections across calls, amortizing the TLS handshake and HTTP/2
SETTINGS exchange. In the common pattern of caching one client per process
(boto3 convention), there is one pool per process too.

Each lease holds the connection for the duration of one Polly stream;
concurrent calls (e.g. broadcast fan-out to multiple target languages) get
distinct connections from the pool up to the configured size. The pool size is
set via the ``pool_size`` constructor parameter, default ``8``:

.. code-block:: python

    # default: up to 8 concurrent Polly streams without queueing
    client = PollyStreamingClient(region="eu-central-1")

    # custom: up to 16 concurrent streams (e.g. fan-out to 16 target languages)
    client = PollyStreamingClient(region="eu-central-1", pool_size=16)

Beyond the configured size, additional concurrent calls wait for an active
lease to be released. Queueing raises no errors, but a queued call waits for
the remaining duration of whichever in-flight stream finishes first.

Disable the pool with ``use_pool=False`` for one-shot scripts, where the
pool's bookkeeping is overhead without benefit:

.. code-block:: python

    # inside a coroutine, with `client` constructed as above
    async for chunk in client.start_speech_synthesis_stream(
        text="hello", voice_id="Matthew", use_pool=False
    ):
        ...

For latency-sensitive workloads doing many short utterances (e.g.
streaming captions), keep ``use_pool=True``: the first call opens a
connection; subsequent calls lease idle connections from the pool and return
them on completion.

AWS credentials
###############

Credentials are resolved via the `AWS SDK default credential chain `_:

#. ``AWS_ACCESS_KEY_ID`` / ``AWS_SECRET_ACCESS_KEY`` (with optional
   ``AWS_SESSION_TOKEN``) environment variables
#. AWS profile via ``AWS_PROFILE`` env var or ``~/.aws/credentials``
#. IAM instance role (EC2) or container role (ECS, Fargate, EKS)
#. SSO via ``~/.aws/sso``

The resolved identity must have an IAM policy granting
``polly:StartSpeechSynthesisStream`` on ``*`` (Polly does not support
resource-level permissions for this action). The classic
``polly:SynthesizeSpeech`` action does **not** authorize the bidirectional
streaming endpoint.

Caching the client
##################

Each ``PollyStreamingClient`` instance is light: construction does not open
any connection or perform any I/O. Still, instantiating it once and reusing it
mirrors the boto3 client pattern and keeps region configuration in one place.
A ``functools.cache`` decorator on a factory function is a common idiom:

.. code-block:: python

    import os
    from functools import cache

    from amazon_polly_streaming import PollyStreamingClient


    @cache
    def get_polly_client() -> PollyStreamingClient:
        return PollyStreamingClient(region=os.environ["AWS_REGION"])


    async def synthesize(text: str, voice_id: str) -> bytes:
        client = get_polly_client()
        audio = b""
        async for chunk in client.start_speech_synthesis_stream(
            text=text, voice_id=voice_id
        ):
            audio += chunk
        return audio

Because each client instance owns its HTTP/2 connection pool (see
`Connection pool`_), the cached instance is also what makes the pool
process-wide: every caller of ``get_polly_client()`` leases connections from
the same pool.
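
When one process serves several regions, the same idiom can plausibly be
extended by keying the cached factory on the region, since a client is bound
to one region at construction. The sketch below is illustrative only:
``StubClient`` and ``get_client`` are hypothetical names, and the stub stands
in for ``PollyStreamingClient`` so that the example runs without the library
or AWS access. With the real library, the factory body would presumably be
``PollyStreamingClient(region=region)``.

.. code-block:: python

    from functools import cache


    class StubClient:
        """Hypothetical stand-in for PollyStreamingClient (no I/O, no AWS)."""

        def __init__(self, region: str) -> None:
            self.region = region


    @cache
    def get_client(region: str) -> StubClient:
        # functools.cache keys on the arguments: one instance per region.
        return StubClient(region=region)


    eu_first = get_client("eu-central-1")
    eu_again = get_client("eu-central-1")
    us = get_client("us-west-2")

    assert eu_first is eu_again  # same region: the cached instance is reused
    assert us is not eu_first    # different region: a separate client

Keying on the region keeps the one-client-per-region rule in a single place,
at the cost of holding every requested client for the life of the process.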