Posted: 06 Apr 2018 Contributor: Matthew J Fritschle
From Typing to Talking: How Voice Is Becoming the New Keyboard
Whether it’s HAL in 2001: A Space Odyssey, the ship’s computer in Star Trek, J.A.R.V.I.S. in the Marvel Cinematic Universe, or R2-D2, C-3PO and now BB-8 in the Star Wars universe, voice technology has been a sci-fi staple for decades. Ever since we got our first taste of the possibilities, we’ve been envisioning a future where keyboards are obsolete and our voices reign supreme. Well, that future is here, and we’re in the middle of a revolution championing voice as the new keyboard.
From Typing to Talking: The Basics of Voice Technology
Voice technology is essentially made up of two components, speech recognition and voice recognition, and together they offer an alternative to typing. The former converts spoken words to digital text; the latter identifies speakers based on features of their speech, such as intonation, pitch and style. Both use branches of artificial intelligence (AI) such as natural-language processing (NLP) and deep learning to ‘understand’ what is being said, and voice recognition goes a step further by working out who is saying what.
While both are important pillars of voice technology, our focus today is speech recognition because it’s what will actually be replacing keyboards. Voice recognition will come in as well, but that’s further down the line, when it becomes more mainstream and we have more need to distinguish between multiple speakers who are dictating at the same time.
From Typing to Talking: How Voice Is Becoming the New Keyboard
We’ve had speech recognition for a while now, but it wasn’t until recent years that its popularity increased to the point where we’re choosing it over typing. It makes sense too; speech comes naturally to us and, for most, speaking is one of the first things we learn in life. The question is, why now? Why are we only now finding it preferable to speak instead of type? Is it because technology has finally advanced enough that it can support a voice-only ‘keyboard’? Is it because we have a mobile affinity and are becoming accustomed to asking Siri and Alexa for directions when we’re looking for a coffee shop? Is it because modern devices are getting smaller and smaller, and just aren’t big enough for a keyboard? Or is it a little bit of all three?
Catalyst 1: Advancements in Voice Technology
We can’t opt for our voices over keyboards if the technology isn’t there to support it. After all, no matter how much we want something, it can’t happen without a foundation to build on. For voice technology, that foundation comes in the form of AI and NLP.
AI has had its good and bad years since its inception, but recent years have been good ones. Surges in investment into the field have spurred rapid growth in its capabilities, and the same can be said for natural-language processing. We touched on NLP earlier, and now we’re going to expand on that by saying that it’s a branch of AI that seeks to bridge the gap between what is said and what is meant. Take predictive typing as an example (yes, we’re talking about replacing typing with our voices, but bear with me). Predictive typing uses context to ‘guess’ what you’re going to type next; it doesn’t just choose a word willy-nilly, it looks at the whole sentence and determines the next logical word choice.
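To make the idea of context-based guessing concrete, here is a minimal sketch in Python. It is a toy, not how any real keyboard works: it only looks at the single previous word (a bigram model) rather than the whole sentence, and the training sentences and the names `train_bigrams` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training text; a real system would learn from far more data.
SENTENCES = [
    "can you take the trash out",
    "can you take the dog out",
    "take the trash out tonight",
    "please take the trash out",
]

def train_bigrams(sentences):
    """Count, for each word, which words have followed it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following

def predict_next(following, text, k=3):
    """Suggest up to k likely next words, based only on the last word typed."""
    words = text.lower().split()
    if not words:
        return []
    counts = following.get(words[-1], Counter())
    return [word for word, _ in counts.most_common(k)]

model = train_bigrams(SENTENCES)
suggestions = predict_next(model, "can you take the")
print(suggestions)  # ['trash', 'dog']
```

Even this tiny model prefers “trash” after “the” because it has seen that pairing most often. Real predictive keyboards use much richer context, but the principle of ranking candidates by how likely they are in context is the same.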
Along the same lines, speech recognition is improving with context. A couple of years ago, using speech recognition to send a text saying “can you take the trash out?” may have produced this text: “amputate the trash out.” Today, advances in AI have drastically reduced the possibility of that happening. Instead, AI looks at the rest of the message and considers the likelihood that you were really saying “amputate the trash out.” Because it’s not very likely, it considers the possibility that it misheard you and, with the help of countless algorithms running in the background, ends up with the correct text.
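The “how likely is it that you really said that?” step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method any particular assistant uses: the acoustic scores below are invented numbers standing in for how well each candidate matched the audio, the sample corpus is made up, and the function names are hypothetical. The language score is an add-one smoothed bigram log-probability, averaged per word so that longer sentences are not penalised just for being longer.

```python
import math
from collections import Counter

# Hypothetical sample of everyday messages used to judge plausibility.
CORPUS = [
    "can you take the trash out",
    "can you take the dog out",
    "can you call me later",
    "please take the trash out",
]

def build_counts(sentences):
    """Count single words and word pairs, with <s> marking sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def language_score(text, unigrams, bigrams):
    """Average per-word log-probability under an add-one smoothed bigram model."""
    words = ["<s>"] + text.lower().split()
    vocab = len(unigrams) + 1  # +1 leaves room for words never seen before
    pairs = list(zip(words, words[1:]))
    total = sum(
        math.log((bigrams[pair] + 1) / (unigrams[pair[0]] + vocab))
        for pair in pairs
    )
    return total / len(pairs)

def pick_transcript(candidates, unigrams, bigrams, lm_weight=1.0):
    """Choose the candidate with the best combined acoustic + language score."""
    def combined(candidate):
        text, acoustic = candidate
        return acoustic + lm_weight * language_score(text, unigrams, bigrams)
    return max(candidates, key=combined)[0]

unigrams, bigrams = build_counts(CORPUS)
# Invented acoustic log-scores: the mishearing matches the audio slightly better.
candidates = [
    ("amputate the trash out", -1.0),
    ("can you take the trash out", -1.2),
]
best = pick_transcript(candidates, unigrams, bigrams)
print(best)  # can you take the trash out
```

Here the mishearing sounds marginally closer to the audio, but it is so implausible as a sentence that the combined score favours the sensible transcript. Setting `lm_weight` to 0 removes the context check, and the sketch falls back to “amputate the trash out.”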
Catalyst 2: Our Mobile Affinity
Ever since smartphones came into our lives, we’ve been smitten. We fell in love with the first swipe, fell harder with the first tap, and now we can’t fathom a future without them by our side. In other words, we have a mobile affinity.
Our love for mobile devices has, to some degree, translated to a love (or like) of digital assistants such as Siri and Alexa, which are available on pretty much every smartphone platform out there. This love or like for digital assistants has, in turn, become a preference for voice over keyboard. Consider that when we were first introduced to Siri, tons of videos were uploaded online of people asking her random things to get funny responses. Nowadays, those videos still pop up (especially for Alexa), but we’re now using the assistants as actual digital assistants instead of fun quirks to pass the time.
So when we want to call someone, a simple, “Siri, call Matthew” does the job. When we want to text someone, “Siri, text mom that I’ll call her later” works just fine. When we want to search, “Siri, what is artificial intelligence” will pull up the best response. Our voices are perfect for direct commands; we’re realizing that we can do a lot more with our voices and, because we skip unlocking our phones and opening an app, we can do those things much faster as well.
Similarly, using our voices is far safer than typing when we include driving in the equation. Staring at a screen and typing between glances at the road is simply not safe, and speaking is a better way to send a text when our hands should be on the wheel.
Catalyst 3: The Diminishing Size of Modern Technology
Finally, many modern devices simply aren’t large enough to house easy-to-use keyboards, hence a preference for our voices. Sure, you can fit a keyboard in a smartphone, but for people whose hands aren’t exactly petite, this becomes a nuisance. The same can be said for other devices like smartwatches, which are being built with features that make use of text but either don’t have a built-in keyboard or have one that’s cumbersome to use.
Putting It All Together
Putting it all together, there’s no one factor that’s prodding the adoption of voice over keyboards. Rather, it’s a combination of many things. Advancements in AI and voice technology are making it possible for speech recognition to accurately capture what we’re saying, which is encouraging us to use digital assistants for more and more tasks. Along a similar line, improvements in technology are allowing us to include text-based features in devices that previously did not have them, and because those devices are too small to house easy-to-use keyboards, voice is coming out on top as the go-to solution.