Amazon Web Services (AWS): Polly: Points to remember

Let's learn about Amazon Polly:

  1. Polly is a service that turns text into lifelike speech.

  2. Polly supports Speech Synthesis Markup Language (SSML) tags like prosody so users can adjust the speech rate, pitch or volume.

  3. Polly is a secure service that operates at high scale and with low latency.

  4. Users can cache and replay Amazon Polly’s generated speech at no additional cost.

  5. Users can use Polly to power their application with high-quality spoken output.

  6. Users can synthesize speech for certain Neural voices using the Newscaster style, which makes them sound like a TV or radio newscaster.

  7. Users can detect when specific words or sentences in the text are being spoken to the user based on the metadata included in the audio stream.

  8. Polly can generate four types of Speech Marks: Sentence, Word, Viseme and SSML.

  9. Polly can be used in announcement systems in public transportation and industrial control systems for notifications and emergency announcements.

  10. Applications such as quiz games, animations, avatars or narration generation are common use cases for a cloud-based text-to-speech solution like Polly.
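
The SSML and Newscaster points above can be sketched in Python. This is a minimal sketch under stated assumptions, not official sample code: the helper names prosody_ssml, newscaster_ssml and synthesize_ssml are hypothetical, the voice Joanna is only an example, and synthesize_ssml assumes boto3 is installed and AWS credentials are configured.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody_ssml(text, rate=None, pitch=None, volume=None):
    """Wrap plain text in an SSML <prosody> tag (hypothetical helper).

    Only the attributes that are passed in are emitted, e.g. rate="slow".
    """
    given = {"rate": rate, "pitch": pitch, "volume": volume}
    attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in given.items() if v is not None)
    body = escape(text)  # escape &, < and > so the SSML stays well-formed
    if not attrs:
        return f"<speak>{body}</speak>"
    return f"<speak><prosody {attrs}>{body}</prosody></speak>"


def newscaster_ssml(text):
    """Wrap plain text in the SSML tag that selects the Newscaster style."""
    return f'<speak><amazon:domain name="news">{escape(text)}</amazon:domain></speak>'


def synthesize_ssml(ssml, voice_id="Joanna", engine="standard"):
    """Send SSML to Polly and return MP3 bytes.

    Assumes boto3 and AWS credentials. Newscaster SSML needs engine="neural"
    and a voice that supports that style.
    """
    import boto3  # imported lazily so the helpers above work without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,
    )
    return response["AudioStream"].read()


if __name__ == "__main__":
    print(prosody_ssml("Doors closing", rate="slow", volume="loud"))
    print(newscaster_ssml("Markets rose today"))
```

Building the SSML in small pure functions keeps the markup testable without calling AWS; only synthesize_ssml touches the service.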

  1. Cloud-based text-to-speech (Polly) is platform independent, so it minimizes development time and effort.

  2. Polly supports all the programming languages included in the AWS SDK (Java, Node.js, .NET, PHP, Python, Ruby, Go and C++) and AWS Mobile SDK (iOS/Android).

  3. Polly supports an HTTP API so users can implement their own access layer.

  4. Polly supports MP3, Ogg Vorbis and raw PCM audio stream formats.

  5. Polly is a HIPAA Eligible Service covered under the AWS Business Associate Addendum (AWS BAA).

  6. Polly makes it easy to request an additional stream of metadata with information about when particular sentences, words and sounds are being pronounced.

  7. Polly's pay-per-use model means there are no setup costs. Users can start small and scale up as their application grows.

  8. Polly provides simple API operations that users can easily integrate with their existing applications.

  9. Polly has a Neural TTS (NTTS) system that can produce higher-quality voices than its standard voices. The NTTS system aims to produce the most natural and human-like text-to-speech voices possible.

  10. Neural voices aren't available in all AWS Regions, nor do they support all Polly features.
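
The points above about output formats, the Neural engine and the metadata stream can be illustrated with a short Python sketch. It is a sketch under assumptions rather than a definitive implementation: build_request, parse_speech_marks and synthesize are hypothetical helper names, Joanna is an example voice, the sample speech marks are hand-written in the line-delimited JSON shape Polly returns, and synthesize assumes boto3 plus configured AWS credentials.

```python
import json

# Audio formats accepted by SynthesizeSpeech; "json" is reserved for speech marks.
AUDIO_FORMATS = {"mp3", "ogg_vorbis", "pcm"}


def build_request(text, voice_id="Joanna", output_format="mp3",
                  engine="standard", speech_mark_types=None):
    """Build keyword arguments for the SynthesizeSpeech operation."""
    params = {"Text": text, "VoiceId": voice_id, "Engine": engine}
    if speech_mark_types:
        # Speech marks are returned as a JSON stream instead of audio.
        params["OutputFormat"] = "json"
        params["SpeechMarkTypes"] = list(speech_mark_types)
    else:
        if output_format not in AUDIO_FORMATS:
            raise ValueError(f"unsupported audio format: {output_format}")
        params["OutputFormat"] = output_format
    return params


def parse_speech_marks(raw):
    """Parse line-delimited JSON speech marks (bytes) into a list of dicts."""
    lines = raw.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def synthesize(params):
    """Call Polly and return the raw stream bytes (audio or speech marks)."""
    import boto3  # imported lazily; needs AWS credentials at call time

    polly = boto3.client("polly")
    return polly.synthesize_speech(**params)["AudioStream"].read()


if __name__ == "__main__":
    # Hand-written sample data, not real Polly output.
    sample = (
        b'{"time":0,"type":"sentence","start":0,"end":11,"value":"Hello world"}\n'
        b'{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}\n'
    )
    for mark in parse_speech_marks(sample):
        print(mark["time"], mark["type"], mark["value"])
    print(build_request("Hello world", output_format="ogg_vorbis", engine="neural"))
```

Because Neural voices are not available in every Region, a caller would typically catch the service error from synthesize and fall back to engine="standard".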

  1. Polly provides API operations that users can use to store lexicons in an AWS Region.

  2. Lexicons give additional control over how Polly pronounces words uncommon to the selected language.

  3. The SynthesizeSpeech operation produces audio in near-real time, with relatively little latency in most cases.

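  4. Polly's Asynchronous Synthesis feature overcomes the challenge of processing a larger text document by changing the way the document is both synthesized and returned: synthesis runs as a background task, and the resulting audio is written to an Amazon S3 bucket instead of being streamed back.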

  5. With the Polly plugin for WordPress, users can provide visitors to their WordPress website audio recordings of their content.
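
The lexicon and Asynchronous Synthesis points above can be sketched as follows. This is an illustrative sketch, not a definitive implementation: build_lexicon and store_lexicon_and_start_task are hypothetical helper names, the W3C alias and the Joanna voice are just examples, and the second function assumes boto3, configured AWS credentials and an existing S3 bucket that the caller names.

```python
from xml.sax.saxutils import escape


def build_lexicon(aliases, lang="en-US"):
    """Build a minimal PLS lexicon mapping written forms to spoken aliases."""
    lexemes = "".join(
        f"<lexeme><grapheme>{escape(written)}</grapheme>"
        f"<alias>{escape(spoken)}</alias></lexeme>"
        for written, spoken in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<lexicon version="1.0" '
        'xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" '
        f'alphabet="ipa" xml:lang="{lang}">{lexemes}</lexicon>'
    )


def store_lexicon_and_start_task(name, lexicon_xml, text, bucket, voice_id="Joanna"):
    """Store a lexicon, then start an asynchronous synthesis task that uses it.

    Returns the task ID; the audio is written to the given S3 bucket.
    """
    import boto3  # imported lazily; needs AWS credentials at call time

    polly = boto3.client("polly")
    # Lexicons are stored per Region under a short alphanumeric name.
    polly.put_lexicon(Name=name, Content=lexicon_xml)
    task = polly.start_speech_synthesis_task(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        LexiconNames=[name],
        OutputS3BucketName=bucket,
    )
    return task["SynthesisTask"]["TaskId"]


if __name__ == "__main__":
    print(build_lexicon({"W3C": "World Wide Web Consortium"}))
```

The returned task ID can later be passed to get_speech_synthesis_task to check whether the audio file is ready.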

A Points to remember series by Piyush Jalan.