The Future of Voice in the Enterprise

by Eric Tobias - Partner

Part of our annual holiday tradition at High Alpha is the selection of a technology-related gift for our staff. In the winter of 2015, my partner Mike suggested the Amazon Echo. We wanted our staff to experience the forthcoming voice revolution first-hand.

As we fast-forward 13 months, it’s incredible to see what has happened. As Azeem Azhar wrote last week, there is a double exponential going on with the Echo (and Alexa, the voice assistant embedded in it) right now.

It is a clear milestone in the shift to voice as the primary interface.

First of all, deployment and usage of Alexa have spiked. Echo users need to download the Alexa app (available for iOS and Android) to configure the device, so downloads and usage of the app serve as a proxy for Echo installations.

The number of monthly users of the official Alexa app in the US alone increased five-fold in the 90 days leading up to Christmas. In Q1 last year, Alexa downloads were running at about 80–90k per month; by Q3 this had risen to 150k per month. November saw 500k downloads and December 2.5m downloads. The total US installed base is now estimated at 5m (about 4% of households).

The second area of growth is the number of ‘skills’ that Alexa has. A ‘skill’ is essentially a service (like a newspaper, food delivery service or game) that you can access via Alexa. This time last year, Alexa had fewer than 100 skills. That had grown 10x by June 2016, and Alexa ended the year with 7,000 skills.

Alexa is riding the wave of dramatic improvements in automated speech recognition driven by deep learning. These improvements are being deployed rapidly to the Google, Baidu and Amazon clouds.

What makes ‘voice first’ interesting:

It is a very natural way for us to get things done. We can dispense with the metaphors of forms, drop-downs, menu bars, icons and form filling and replace them with a more natural modality: speech.

It kills the advertising and pay-per-click model that has made Google over the past two decades. This hasn’t escaped Google, says Sridhar Ramaswamy, Google’s SVP of Ads: “one thing that we are all clear about is the days of three top text ads followed by ten organic results is a thing of the past in the voice first world.”

It creates new choke points which businesses will need to navigate in order to reach their customers. Read Ben Thompson on how Alexa creates an operating system-like positioning for Amazon.

It might reduce smartphone distraction. Browsing by voice is not (yet) as easy as browsing on a phone, but a voice interface does provide a simple way to handle phone-like tasks (weather, reminders, time, ordering) without picking up a device.

Gartner reckons that 20% of all smartphone interactions will happen by voice by 2020. Early data suggests that homes with an Echo spend 10% more at Amazon during the first six months of ownership. Alexa is also coming to VW and Ford cars.

At High Alpha, we spend lots of time thinking about the future of enterprise software, and we pay extra attention to design, UI, and UX as they relate to this topic. Over the past 5 years, we’ve helped lead the trend of business applications looking and working more and more like the consumer applications we all use on a daily basis. As we ponder where the consumerization of the enterprise is headed, we start by asking ourselves what the future of work will look like.

Today, employees primarily access information through well-designed forms made accessible via their work laptop or mobile phone. But this simplistic approach is rapidly changing. You don’t need to look further than my 11-year-old daughter to understand the needs of future employees. Abbie Grace uses Siri to make lists, look up answers to her homework, solve math problems, text her friends, and skip songs on her playlists. In other words, she uses voice as the primary interface to her phone.

It’s very possible, as voice technology continues to improve and as consumers become more and more comfortable using it, that future workers will demand that a majority of their interactions with software happen by voice. Given the shift from desktops to laptops and then from laptops to mobile phones that we’ve seen in the past 10 years, it’s fairly easy to see voice as the next step.

I’m going to continue to explore how the consumer experiences being embraced today will impact the enterprise experiences of tomorrow.

It’s not a question of if but when.

Credits: Azeem Azhar: The Exponential View