Hi everyone, and welcome to another exciting edition of Boring JavaScript! Today, we cover the SpeechSynthesisUtterance object – specifically, some of the important properties that you can set to change volume, pitch, speed – even voices!
Don’t like to read? Then watch our video.
What Did You Say?
Creating an Utterance is as simple as creating a new instance of SpeechSynthesisUtterance. Once that is created, all you need to do is set some properties and pass it to ‘speechSynthesis.speak()’. Let’s take a look:
const getUtterance = () => {
  const utterance = new SpeechSynthesisUtterance(textToSpeech.value);
  utterance.lang = language.options[language.selectedIndex].value;
  utterance.pitch = parseFloat(pitch.value);
  utterance.rate = parseFloat(rate.value);
  const [ languageSelected, index ] = voice.options[voice.selectedIndex].value.split(',');
  utterance.voice = languages.get(languageSelected)[index];
  utterance.volume = parseFloat(volume.value);
  return utterance;
}
const utterance = getUtterance();
speechSynthesis.speak(utterance);
Line 2 shows how you can create a new instance. From that point, you can set any number of properties. The ones mentioned above are:
- Language (Line 3). The language of the speaker, given as a language tag such as en-US or en-GB, so it also covers the voice inflection or accent. The choices are limited by the software installed on your computer. See the next section for more details.
- Pitch (Line 4). You can make the voice higher (or lower) than normal. Values run from 0 to 2, and 1 is the default.
- Rate (Line 5). The speed at which the text is spoken. Values run from 0.1 to 10, and 1 is normal speed, so you can go really fast here.
- Voice (Line 7). The voice to use. This depends on the language (see Language above), and also on which voices your system provides.
- Volume (Line 8). How loud (or soft) to make the voice. Values run from 0 (silent) to 1 (loudest), and 1 is the default.
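Since the pitch, rate, and volume values come straight from form fields, it can help to tidy them up before assigning them. Here is a minimal sketch, assuming the inputs might be empty or out of range. The limits in ‘RANGES’ are the ones the Web Speech API specification defines for these properties; the names ‘RANGES’, ‘clamp’, and ‘normalizeSettings’ are made up for this example and are not part of the API or of the code above.

```javascript
// Valid ranges and defaults for the numeric utterance properties,
// as defined by the Web Speech API specification.
const RANGES = {
  pitch: { min: 0, max: 2, fallback: 1 },
  rate: { min: 0.1, max: 10, fallback: 1 },
  volume: { min: 0, max: 1, fallback: 1 },
};

// Hypothetical helper: parse a raw input value and keep it within [min, max].
// Anything that does not parse to a finite number falls back to the default.
const clamp = (raw, { min, max, fallback }) => {
  const value = parseFloat(raw);
  if (!Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, value));
};

// Hypothetical helper: turn raw form values into safe utterance settings.
const normalizeSettings = ({ pitch, rate, volume }) => ({
  pitch: clamp(pitch, RANGES.pitch),
  rate: clamp(rate, RANGES.rate),
  volume: clamp(volume, RANGES.volume),
});
```

Inside ‘getUtterance’, one way to use this would be ‘Object.assign(utterance, normalizeSettings({ pitch: pitch.value, rate: rate.value, volume: volume.value }))’ in place of the three ‘parseFloat’ lines.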
Again, You Said What?
One thing you need to know is what languages and voices are supported by your (or by your users’) system. You can call the ‘getVoices()’ method of the built-in ‘speechSynthesis’ object to get these values. That is easy enough, except that if you call the method as soon as the page loads, you will often get an empty list back. Why?
Getting the languages and voices is an asynchronous call to the operating system. To help, you have an event that is fired when those languages and voices have been received. Here’s how to use it:
const languages = new Map();

const populateVoicesAndLanguage = () => {
  languages.clear();
  speechSynthesis.getVoices().forEach((voice) => {
    const { lang, name } = voice;
    if (!languages.has(lang)) {
      languages.set(lang, []);
    }
    const languagesFromMap = languages.get(lang);
    languages.set(lang, languagesFromMap.concat([voice]));
  });
}
speechSynthesis.addEventListener('voiceschanged', populateVoicesAndLanguage);
The event we’re looking for is the ‘voiceschanged’ event (Line 14). Once that event fires, we know that the operating system has delivered all the available voices to JavaScript. We can then use the ‘getVoices()’ method (Line 5) to look at all the voices and do whatever we need with them (in our case, we’re loading up a mapping).
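Browsers can differ in when the voice list becomes available: some may return it from the very first ‘getVoices()’ call, while others return an empty array until ‘voiceschanged’ fires. A minimal sketch that covers both cases is to populate once right away and again whenever the event fires. The names ‘groupVoicesByLanguage’ and ‘voicesByLanguage’ are invented for this example, and the grouping function only assumes that each voice has a ‘lang’ property.

```javascript
// Hypothetical helper: group a list of voices by their `lang` tag.
// It takes a plain array, so it can be exercised without a browser.
const groupVoicesByLanguage = (voices) => {
  const byLanguage = new Map();
  for (const voice of voices) {
    if (!byLanguage.has(voice.lang)) {
      byLanguage.set(voice.lang, []);
    }
    byLanguage.get(voice.lang).push(voice);
  }
  return byLanguage;
};

// Browser-only wiring, skipped where speechSynthesis does not exist.
if (typeof speechSynthesis !== 'undefined') {
  // First attempt: may already contain voices, or may be empty.
  let voicesByLanguage = groupVoicesByLanguage(speechSynthesis.getVoices());

  // Rebuild the map whenever the browser reports a new voice list.
  speechSynthesis.addEventListener('voiceschanged', () => {
    voicesByLanguage = groupVoicesByLanguage(speechSynthesis.getVoices());
    // A real page would refresh its language and voice dropdowns here.
  });
}
```

Keeping the grouping in its own function means the same code runs for the immediate call and for the event handler, and it can be tried out with a hand-written array of objects such as ‘{ lang: 'en-US', name: 'Example' }’.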
Easier Said Than Done
This is one of those blog posts that works better with the video, so I encourage you to look at the video to see (and hear) an actual example.
The Video
As always, we include a video with the blog post.
Shameless Plug
You can find us everywhere!
Github Repository: https://github.com/TheVirtuoid/boringjavascript
Check out everything at: https://www.thevirtuoid.com
Facebook: https://www.facebook.com/TheVirtuoid
Twitter: https://twitter.com/TheVirtuoid
YouTube: https://www.youtube.com/channel/UCKZ7CV6fI7xlh7zIE9TWqgw
Discord: https://discord.gg/M2Hb6r628r
thevirtuoid
Web Tinkerer. No, not like Tinkerbell.
Creator of the game Virtuoid. Boring JavaScript. Visit us at thevirtuoid.com