Hi everyone, and welcome to another exciting edition of Boring JavaScript! Today, we cover the SpeechSynthesisUtterance object – specifically, some of the important properties that you can set to change volume, pitch, speed – even voices!

Don’t like to read? Then watch our video.

What Did You Say?

Creating an Utterance is as simple as creating a new instance of a SpeechSynthesisUtterance. Once that is created, all you need to do is set some properties, and you’re done. Let’s take a look:

const getUtterance = () => {
	const utterance = new SpeechSynthesisUtterance(textToSpeech.value);
	utterance.lang = language.options[language.selectedIndex].value;
	utterance.pitch = parseFloat(pitch.value);
	utterance.rate = parseFloat(rate.value);
	const [ languageSelected, index ] = voice.options[voice.selectedIndex].value.split(',');
	utterance.voice = languages.get(languageSelected)[index];
	utterance.volume = parseFloat(volume.value);
	return utterance;
};

const utterance = getUtterance();

Line 2 shows how you can create a new instance. From that point, you can set any number of properties. The ones mentioned above are:

  1. Language (Line 3). A language tag such as en-US or en-GB, which in practice selects the voice inflection or accent of the speaker. The choices are limited by the software installed on your computer. See the next section for more details.
  2. Pitch (Line 4). You can make the voice higher (or lower) than normal. Values range from 0 to 2, and 1 is the default.
  3. Rate (Line 5). The speed at which the text is spoken, from 0.1 to 10, where 1 is normal speed. You can go really fast here.
  4. Voice (Line 7). The voice to use. This depends on the language (see #1 above) and on which voices your system provides.
  5. Volume (Line 8). How loud (or soft) to make the voice, from 0 (silent) to 1 (full volume, the default).
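
Since pitch, rate and volume each have a documented range, it can help to sanitize whatever comes out of the form fields before handing it to the utterance. What follows is only a minimal sketch, not part of the original demo: the names LIMITS, clampSetting and applySettings are made up for illustration, and the idea of falling back to the default value when the input is not a number is an assumption on our part.

```javascript
// Documented ranges and defaults for the three numeric utterance properties.
// LIMITS, clampSetting and applySettings are hypothetical helper names.
const LIMITS = {
	pitch: { min: 0, max: 2, fallback: 1 },
	rate: { min: 0.1, max: 10, fallback: 1 },
	volume: { min: 0, max: 1, fallback: 1 },
};

// Parse a form value and force it into [min, max].
// If the value is not a number, use the default instead (our assumption).
const clampSetting = (name, rawValue) => {
	const { min, max, fallback } = LIMITS[name];
	const parsed = parseFloat(rawValue);
	if (Number.isNaN(parsed)) {
		return fallback;
	}
	return Math.min(max, Math.max(min, parsed));
};

// Copy the sanitized settings onto an utterance (or any plain object).
const applySettings = (utterance, { pitch, rate, volume }) => {
	utterance.pitch = clampSetting('pitch', pitch);
	utterance.rate = clampSetting('rate', rate);
	utterance.volume = clampSetting('volume', volume);
	return utterance;
};
```

In the browser you would pass a real SpeechSynthesisUtterance as the first argument, then hand the result to speechSynthesis.speak() to actually hear it.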

Again, You Said What?

One thing you need to know is which languages and voices are supported by your (or by your users’) system. You can get these values from the built-in ‘speechSynthesis’ object by calling its ‘getVoices()’ method. That is easy enough, except that if you call the method immediately, you may get an empty list back. Why?

Getting the languages and voices is an asynchronous call to the operating system. To help, you have an event that is fired when those languages and voices have been received. Here’s how to use it:

const languages = new Map();

const populateVoicesAndLanguage = () => {
	speechSynthesis.getVoices().forEach((voice) => {
		const { lang, name } = voice;
		if (!languages.has(lang)) {
			languages.set(lang, []);
		}
		const languagesFromMap = languages.get(lang);
		languages.set(lang, languagesFromMap.concat([voice]));
	});
};

speechSynthesis.addEventListener('voiceschanged', populateVoicesAndLanguage);

The event we’re looking for is the ‘voiceschanged’ event (Line 14). Once that event fires, we know that the operating system has delivered the available voices to JavaScript. We can then use the ‘getVoices()’ method (Line 4) to look at all the voices and do whatever we need with them (in our case, we’re loading them into a Map keyed by language).
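
Two things are worth guarding against here. Some browsers already have the voices when the page loads, and the ‘voiceschanged’ event can fire more than once, which would append the same voices to the map again. Here is a hedged sketch of one way to deal with both. The helpers loadVoices and groupVoicesByLanguage are hypothetical names, not part of the Web Speech API, and passing the synthesizer in as a parameter is just a choice that makes the code easy to try with a stand-in object.

```javascript
// Hypothetical helper: resolve with the voice list whether it is ready now
// or only arrives later with a 'voiceschanged' event.
// `synth` is normally window.speechSynthesis.
const loadVoices = (synth) => new Promise((resolve) => {
	const ready = synth.getVoices();
	if (ready.length > 0) {
		resolve(ready);
		return;
	}
	const onChange = () => {
		// Listen only once, so later events do not trigger this handler again.
		synth.removeEventListener('voiceschanged', onChange);
		resolve(synth.getVoices());
	};
	synth.addEventListener('voiceschanged', onChange);
});

// Hypothetical helper: build a fresh Map of language tag -> voices on every
// call, so running it twice never duplicates entries.
const groupVoicesByLanguage = (voices) => {
	const grouped = new Map();
	voices.forEach((voice) => {
		const existing = grouped.get(voice.lang) || [];
		grouped.set(voice.lang, existing.concat([voice]));
	});
	return grouped;
};
```

In a page, this could be used as loadVoices(speechSynthesis).then((voices) => groupVoicesByLanguage(voices)). Note that on a system with no voices at all, the promise in this sketch would never resolve, so a real page may want a timeout as well.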

Easier Said Than Done

This is one of those blog posts that works better with the video, so I encourage you to watch the video to see (and hear) an actual example.

The Video

As always, we include a video with the blog post.
