A recent report on smart speaker adoption stated that over 66 million people in the United States now own at least one smart speaker. Google and Amazon are making smart speakers smarter as they race to put them in every device we own and every room we enter.
One of the things that makes smart speakers appear smarter is the ability to use different voices. With Google, you can change the default voice to an alternate one, and with Amazon smart speakers, you can use multiple voices. Using an alternate voice (or voices) can enhance the user's experience and expand your audience.
In addition to changing voices, there are several other features to help you make your content sound just right. These features are provided by SSML (Speech Synthesis Markup Language). SSML is a markup language, like HTML, that provides a way to control how your content sounds.
Let's look at a few examples of how SSML is used.
Starting with the Content
Our content is composed of Diction, Grammar, and Style. Diction is our choice of words. Grammar is how we structure our words into sentences and paragraphs. And Style is how we choose to communicate an idea or thought. We tailor our Diction, Grammar, and Style to ensure that our brand engages our audience. Because the same set of words can be read aloud in multiple ways, SSML helps devices turn our content into the audio we intend.
Changing Cadence
Text-to-speech tools use punctuation to know when and how long to pause. In general, a comma causes a short pause and a period causes a longer pause, while the end of a paragraph causes a slightly longer pause still. But what if we want to
In addition to pauses, SSML also provides tags to
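To make the idea of an explicit pause concrete, here is a minimal sketch in Python that builds a small SSML document using the standard `break` element. The helper name `with_pause` and the sample sentence are made up for illustration, and each platform sets its own limits on how long a pause may be.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def with_pause(before: str, after: str, milliseconds: int) -> str:
    """Join two pieces of text with an explicit SSML pause between them."""
    # escape() protects characters such as & and < that are special in markup.
    return (
        f'<speak>{escape(before)} '
        f'<break time="{milliseconds}ms"/> '
        f'{escape(after)}</speak>'
    )


# Hypothetical sample text: hold for a second and a half before the reveal.
ssml = with_pause("And the winner is", "you!", 1500)

# Parsing raises ParseError if the markup is not well-formed.
ET.fromstring(ssml)
print(ssml)
```

The pause length is passed as a number so that the same helper can produce anything from a brief beat to a dramatic silence.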
Changing Form
Heteronyms can cause challenges when turning words into speech. An example is the word spelled R E A D. Should this be pronounced like "reed" or like "red"? When the text-to-speech tool chooses the wrong pronunciation for a heteronym, SSML can provide the necessary instructions to fix it.
As with heteronyms, a sequence of digits can be ambiguous when turned into speech, such as the digits 2 5 1 9 4 9 4:
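One way to spell out a pronunciation is the standard SSML `phoneme` element, sketched below in Python. The IPA transcriptions are my own broad approximations, the helper name `pronounce` is hypothetical, and support for `phoneme` varies by platform, so treat this as an illustration and check it against your device.

```python
import xml.etree.ElementTree as ET

# Assumed broad IPA transcriptions for the two readings of "read".
READ_PRESENT = "riːd"  # sounds like "reed"
READ_PAST = "rɛd"      # sounds like "red"


def pronounce(word: str, ipa: str) -> str:
    """Wrap a word in a phoneme element that states how to say it."""
    return f'<phoneme alphabet="ipa" ph="{ipa}">{word}</phoneme>'


# The past-tense reading is forced, whatever the engine would have guessed.
ssml = f"<speak>Yesterday I {pronounce('read', READ_PAST)} the whole book.</speak>"

# Parsing raises ParseError if the markup is not well-formed.
ET.fromstring(ssml)
print(ssml)
```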
Should these be
or just the digits:
or should these be understood and
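The `say-as` element is the usual way to settle this kind of ambiguity. The Python sketch below wraps the same digits three ways. The `interpret-as` values shown (`cardinal`, `digits`, `telephone`) are the ones I know from Amazon's documentation; other platforms may accept a different set, and the telephone reading is only my guess at one plausible third interpretation. The helper name `say_as` is hypothetical.

```python
import xml.etree.ElementTree as ET


def say_as(text: str, interpret_as: str) -> str:
    """Wrap text in a say-as element with the given interpretation."""
    return f'<say-as interpret-as="{interpret_as}">{text}</say-as>'


number = "2519494"

# The same characters, three different intended readings.
readings = {
    "cardinal": say_as(number, "cardinal"),    # one large number
    "digits": say_as(number, "digits"),        # one digit at a time
    "telephone": say_as(number, "telephone"),  # grouped like a phone number
}

for label, fragment in readings.items():
    ssml = f"<speak>{fragment}</speak>"
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    print(label, ssml)
```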
Changing Voice
While the SSML specification provides for changing voices, as of the writing of this post, only Amazon supports the Voice tag. Google only allows the creator of a Google Voice App to select one voice (from a list of four) to be used for all of the Voice App's interactions.
Amazon implemented the SSML Voice directive and currently has many English-speaking voices to choose from. In addition to both male and female voices, Amazon includes English voices from Australia, Great Britain, and India, as well as the United States. The ability to change the voice being used opens many possibilities to engage your audience. Not only can you change the default voice for a particular post, but you can also change the voice within a post.
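Switching voices within a post can be sketched with the SSML `voice` element, as in the Python below. The voice names `Joanna` and `Brian` are ones I believe Amazon offers for US and British English, but treat them as assumptions and check the current list. The helper name `in_voice` and the sample sentences are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def in_voice(name: str, text: str) -> str:
    """Wrap text so that it is spoken by the named voice."""
    return f'<voice name="{name}">{escape(text)}</voice>'


# Two voices in one response: a narrator, then a second speaker.
ssml = (
    "<speak>"
    + in_voice("Joanna", "Welcome to the post.")
    + " "
    + in_voice("Brian", "And now, a different voice.")
    + "</speak>"
)

# Parsing raises ParseError if the markup is not well-formed.
ET.fromstring(ssml)
print(ssml)
```

Anything left outside a `voice` element would be read in the default voice, which is how a single post can mix the two.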
Changing Language
In addition to the four English dialects, Amazon currently supports five additional languages: French, German, Italian, Japanese, and Spanish. There are a total of 27 different voices available. With the increasingly popular translation tools, you can even have your content
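For a passage in another language, a voice can be paired with a language tag. The Python sketch below nests a `lang` element inside a `voice` element, which is the pattern I recall from Amazon's documentation. The voice name `Celine`, the locale `fr-FR`, the helper name `in_language`, and the sample sentence are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def in_language(voice: str, locale: str, text: str) -> str:
    """Wrap text so it is spoken by a given voice in a given language."""
    return (
        f'<voice name="{voice}">'
        f'<lang xml:lang="{locale}">{escape(text)}</lang>'
        f'</voice>'
    )


# An English introduction followed by a French passage.
ssml = (
    "<speak>Here is the greeting in French. "
    + in_language("Celine", "fr-FR", "Bonjour et bienvenue.")
    + "</speak>"
)

# Parsing raises ParseError if the markup is not well-formed.
ET.fromstring(ssml)
print(ssml)
```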
Conclusion
As you can see (or hear), there are many options to customize the delivery of your content. As the voices continue to mature and the tools to customize the voice experience expand, smart speakers will provide a whole new way to see (I mean hear) your brand. Your audience has been shifting attention from computers to smartphones. Both Google and Amazon are now pushing voice technology as the new, frictionless way of getting information. Preparing your brand to engage your audience using voice will ensure your brand has both a visual and a verbal presence. Create My Voice can help you easily get on this new platform.