Readium Speech is a TypeScript library for implementing a read aloud feature with Web technologies. It follows best practices gathered through interviews with members of the digital publishing industry.
While this project is still in a very early stage, it is meant to power the read aloud feature for two different Readium projects: Readium Web and Thorium.
Readium Speech was spun out as a separate project in order to facilitate its integration as a shared component, but also because of its potential outside of the realm of ebook reading apps.
- Extracting Guided Navigation objects from a document (or a fragment of a document)
- Generating utterances from these Guided Navigation objects
- Processing utterances (prepending/appending text to utterances based on context, pronunciation through SSML/PLS…)
- Voice selection
- TTS playback
- Highlighting
For our initial work on this project, we're focusing on voice selection based on recommended voices.
The outline of this work has been explored in a GitHub discussion and through a best practices document.
It's currently under review in a draft PR.
A live demo of the voice selection API is available.
It demonstrates the following features:
- fetching a list of all available languages, translating them to the user's locale and sorting them based on these translations
- returning a list of voices for a given language, grouped by region and sorted based on quality
- filtering languages and voices based on gender and offline availability
- using embedded test utterances to demo voices