Amazon Polly
- Amazon Polly is the opposite of Amazon Transcribe: Transcribe converts speech to text, while Polly converts text to speech.
- Definition:
- Polly turns text into lifelike speech using deep learning, so you can create applications that talk.
- For example, if you write "Hi, my name is Stephane, and this is a demo of Amazon Polly," Polly generates the spoken audio for you.
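A minimal sketch of that demo with the AWS SDK for Python (boto3). It assumes boto3 is installed and AWS credentials are configured. The helper names `build_request` and `synthesize_to_file`, the voice "Joanna", the neural engine, and the output path are illustrative choices, not something these notes prescribe.

```python
def build_request(text, voice_id="Joanna", engine="neural"):
    """Return keyword arguments for Polly's synthesize_speech call."""
    return {
        "Text": text,
        "OutputFormat": "mp3",
        "VoiceId": voice_id,  # assumption: any voice available in your region works
        "Engine": engine,
    }


def synthesize_to_file(text, path="speech.mp3"):
    """Call Polly and save the returned MP3 audio to a local file."""
    import boto3  # imported lazily so build_request works without the SDK

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(text))
    # AudioStream is a streaming body; read() returns the audio bytes.
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
    return path


request = build_request(
    "Hi, my name is Stephane, and this is a demo of Amazon Polly."
)
```

Calling `synthesize_to_file(request["Text"])` would then write the spoken sentence to `speech.mp3`.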
Advanced Features
Polly has several advanced features that may appear in the exam:
Lexicons
- You define how to read certain pieces of text
- Example: you may write "AWS" but want Polly to pronounce it as "Amazon Web Services"
- Example: you may write "W3C" but want Polly to say "World Wide Web Consortium"
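The two examples above can be sketched as a lexicon document. Polly lexicons use the W3C Pronunciation Lexicon Specification (PLS) format, where a `<grapheme>` is the written form and an `<alias>` is what should be spoken instead. The helper names and the lexicon name "acronyms" are hypothetical, and the upload step assumes boto3 with configured credentials.

```python
from xml.sax.saxutils import escape

PLS_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
{lexemes}
</lexicon>"""


def build_lexicon(aliases):
    """Build a PLS lexicon mapping written forms to their spoken aliases."""
    lexemes = "\n".join(
        "  <lexeme><grapheme>{}</grapheme><alias>{}</alias></lexeme>".format(
            escape(written), escape(spoken)
        )
        for written, spoken in aliases.items()
    )
    return PLS_TEMPLATE.format(lexemes=lexemes)


def upload_lexicon(name, content):
    """Store the lexicon in Polly so later requests can refer to it by name."""
    import boto3  # lazy import: building the XML does not need the SDK

    boto3.client("polly").put_lexicon(Name=name, Content=content)


lexicon_xml = build_lexicon({
    "AWS": "Amazon Web Services",
    "W3C": "World Wide Web Consortium",
})
```

After `upload_lexicon("acronyms", lexicon_xml)`, a synthesis request would opt in by passing `LexiconNames=["acronyms"]` to `synthesize_speech`.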
SSML (Speech Synthesis Markup Language)
- Markup tags that indicate how your text should be pronounced
- Example: "Hello" + break + "how are you?" will say "Hello," then have a long break, then "how are you?"
- It won't say "Hello, break, how are you?" – it understands the markup
- Capabilities include:
- Whispering
- Pronunciation control
- Abbreviation handling
- Word emphasis
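A small sketch of how these capabilities look as SSML tags. The tag names (`<break>`, `<amazon:effect name="whispered">`, `<emphasis>`, `<sub>`) are standard SSML or Polly extensions, but the Python helper names are hypothetical. Tag support varies by voice engine, so treat the standard-engine assumption in the comments as something to verify against the Polly documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def pause(seconds):
    """Insert a pause of the given number of seconds."""
    return '<break time="{}s"/>'.format(seconds)


def whisper(text):
    """Whispering (a Polly-specific effect tag)."""
    return '<amazon:effect name="whispered">{}</amazon:effect>'.format(escape(text))


def emphasize(text, level="strong"):
    """Word emphasis."""
    return '<emphasis level="{}">{}</emphasis>'.format(level, escape(text))


def abbreviation(written, spoken):
    """Abbreviation handling: say `spoken` wherever `written` appears."""
    return "<sub alias={}>{}</sub>".format(quoteattr(spoken), escape(written))


def speak(*parts):
    """Wrap already-built SSML fragments in the required root element."""
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    "Hello",
    pause(1),
    "how are you? ",
    whisper("This part is whispered."),
)

# To synthesize it (assumes boto3 and credentials; standard engine assumed
# here because not every tag is available on every engine):
#   boto3.client("polly").synthesize_speech(
#       Text=ssml, TextType="ssml", OutputFormat="mp3",
#       VoiceId="Joanna", Engine="standard")
```

The key detail is `TextType="ssml"`: without it, Polly would treat the markup as plain text instead of interpreting it.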

Voice Engines
Multiple voice engines are available, from oldest to newest:
- Standard
- Neural
- Long-form
- Generative
The newest engines produce the most natural, human-like voices.
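In the API, the engine is selected with the `Engine` parameter, and not every voice supports every engine. The sketch below filters a voice list by engine. The sample data uses made-up voice IDs and only imitates the shape of the `Voices` list returned by `describe_voices`. The helper name `voices_for_engine` is also hypothetical.

```python
# Values accepted by the Engine parameter of synthesize_speech / describe_voices.
ENGINES = ("standard", "neural", "long-form", "generative")


def voices_for_engine(voices, engine):
    """Return the IDs of voices whose SupportedEngines include `engine`."""
    if engine not in ENGINES:
        raise ValueError("unknown engine: {!r}".format(engine))
    return [v["Id"] for v in voices if engine in v.get("SupportedEngines", [])]


# Hypothetical data, shaped like response["Voices"] from
# boto3.client("polly").describe_voices(); the IDs are placeholders.
sample_voices = [
    {"Id": "VoiceA", "SupportedEngines": ["standard", "neural"]},
    {"Id": "VoiceB", "SupportedEngines": ["neural", "long-form", "generative"]},
]

neural_voices = voices_for_engine(sample_voices, "neural")
generative_voices = voices_for_engine(sample_voices, "generative")
```

With real data, the same check helps avoid requesting an engine that the chosen voice does not support.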
Speech Marks
- Provides information about where audio elements occur
- Shows where a word or sentence starts or ends in the audio
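- Polly can give you both the audio and the speech marks for the same text (the speech marks are returned as a JSON stream, requested separately from the audio)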
- Very helpful for:
- Lip-syncing
- Highlighting words as they are spoken
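A sketch of how speech marks could drive word highlighting. Polly returns speech marks as one JSON object per line when `OutputFormat="json"` and `SpeechMarkTypes` are set. The sample text and timings below are invented for illustration, and the helper names are hypothetical.

```python
import json


def parse_speech_marks(raw):
    """Parse Polly's line-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_timeline(marks):
    """Return (milliseconds from audio start, word) pairs for highlighting."""
    return [(m["time"], m["value"]) for m in marks if m["type"] == "word"]


def fetch_speech_marks(text, voice_id="Joanna"):
    """Request speech marks instead of audio (assumes boto3 and credentials)."""
    import boto3  # lazy import so the parsing helpers work without the SDK

    response = boto3.client("polly").synthesize_speech(
        Text=text,
        OutputFormat="json",
        SpeechMarkTypes=["sentence", "word"],
        VoiceId=voice_id,
    )
    return response["AudioStream"].read().decode("utf-8")


# Hypothetical output for the text "Hello world." (timings are made up).
sample = "\n".join([
    '{"time":0,"type":"sentence","start":0,"end":12,"value":"Hello world."}',
    '{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}',
    '{"time":370,"type":"word","start":6,"end":11,"value":"world"}',
])

timeline = word_timeline(parse_speech_marks(sample))
```

A player could then highlight each word once the audio position passes its `time` value. The `start` and `end` fields locate the word within the input text.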