Voice-controlled computers and electronics have always been a staple of science fiction, flirting with the idea that we could simply issue commands to our silicon-based underlings and have them do our bidding. Even though technology has come an incredibly long way in the past couple of decades, understanding natural language remains an unconquered challenge. Modern speech recognition systems often rely on keywords in order to perform the required commands, usually forcing the user into unnatural language to get what they want. Apple's latest innovation, Siri, seems to be a step forward in this regard and could potentially signal a shift in the way people use their smartphones and other devices.
On the surface Siri appears to handle quite a bit of natural language, recognising that a single task can be phrased in several different ways. Siri also appears to have a basic conversational engine, so it can interpret commands in the context of what you've said to it before. The scope of what Siri can do, however, is quite limited, but that's not necessarily a bad thing: nailing a handful of actions from natural language is still leaps and bounds above what other voice recognition systems are currently capable of.
Siri also has a sense of humour, often replying to out-of-left-field questions with little quips or amusing shutdowns. I was, however, disappointed when the classic nerd line of "Tea. Earl Grey. Hot" received the following response:
This screenshot also shows that Siri's speech recognition isn't always 100% accurate either, especially when it's trying to guess what you were saying.
Many are quick to draw comparisons between Siri and Android's voice command system, along with apps available on that platform like Vlingo. The big difference, though, is that those services are much more like search engines than Siri is, performing the required actions only if you utter the commands and keywords in the right order. That's the way nearly all voice-operated systems have worked in the past (like those automated call centres that everyone hates), and it's usually the reason most people are disappointed in them. Siri has the upper hand here, as people are being encouraged to speak to it in a natural way rather than changing the way they speak in order to use it.
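To make that difference concrete, here's a toy sketch (entirely my own illustration, not how Siri or Vlingo actually work under the hood): a rigid matcher in the old style fires only when the keywords arrive in the expected order, while an intent-style matcher scores an utterance against several known phrasings of the same task and picks the closest one.

```python
import difflib

# Rigid, order-sensitive matching: the style of older voice command
# systems. The utterance must begin with the exact command phrase,
# otherwise nothing happens.
COMMANDS = {
    "set alarm for": "SET_ALARM",
    "call": "PLACE_CALL",
}

def rigid_match(utterance: str) -> str | None:
    for phrase, action in COMMANDS.items():
        if utterance.lower().startswith(phrase):
            return action
    return None

# Intent-style matching: several natural phrasings map to one action,
# and the closest fuzzy match wins rather than demanding exact order.
INTENTS = {
    "SET_ALARM": [
        "wake me up at seven",
        "set an alarm for seven",
        "i need to be up by seven",
    ],
    "PLACE_CALL": [
        "call mum",
        "ring mum for me",
        "get mum on the phone",
    ],
}

def intent_match(utterance: str, cutoff: float = 0.6) -> str | None:
    best_action, best_score = None, cutoff
    for action, phrasings in INTENTS.items():
        for phrasing in phrasings:
            score = difflib.SequenceMatcher(
                None, utterance.lower(), phrasing
            ).ratio()
            if score > best_score:
                best_action, best_score = action, score
    return best_action

print(rigid_match("I need to be up by seven"))   # None - wrong keywords
print(intent_match("I need to be up by seven"))  # SET_ALARM
```

The fuzzy version happily accepts "I need to be up by seven" even though it shares almost nothing with the "set alarm for" command phrase, and that flexibility is exactly what makes Siri feel natural compared to the older systems.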
For all the good that Siri is capable of accomplishing, it's still, at its heart, a voice recognition system, and with that come some severe limitations. Ambient noise, including other people talking around you, will confuse Siri completely, making it unusable unless you're in a relatively quiet area. I'm not just saying this as a general thing, either; friends who have Siri have mentioned it as one of its shortcomings. Of course this isn't unique to Siri and is unlikely to be a problem that can be overcome by technology alone (unless you could speak to Siri via a brain implant, say).
Like many other voice recognition systems, Siri is geared toward the accent of the country it was developed in, i.e. American. This isn't just limited to the different spellings between, say, the Queen's English and American English; it extends to the inflections and nuances that different accents introduce. Siri will also fall in a crying heap when pronunciation and spelling differ, again limiting its usefulness. This is a problem that can be, and has been, overcome by other speech recognition systems in the past, and with additional languages for Siri already on the way I'd expect these kinds of problems to eventually be solved.
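The spelling half of that problem is the easy part, and a trivial sketch shows why (this is purely my own hypothetical normaliser, not anything Apple actually ships): American output can be remapped to a user's locale with a simple lookup table, whereas accent and pronunciation differences live in the acoustic model itself and need per-locale training data.

```python
# A tiny, hypothetical locale normaliser. It only papers over spelling
# differences ("color" vs "colour"); accent and pronunciation handling
# can't be patched this way and needs a retrained acoustic model.
US_TO_UK = {
    "color": "colour",
    "center": "centre",
    "favorite": "favourite",
    "recognize": "recognise",
}

def normalise(text: str, locale: str = "en_AU") -> str:
    if locale.startswith("en_US"):
        return text  # American English: leave the recogniser's output alone
    return " ".join(US_TO_UK.get(word.lower(), word) for word in text.split())

print(normalise("set my favorite color"))  # "set my favourite colour"
```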
A fun little fact I came across while researching this post is that Apple still considers Siri to be a beta product (right at the bottom of the page, in small text that's easy to miss). That's unusual for Apple, as they're not one to release an unfinished product, even if that comes at the cost of features not making it in. In a global sense Siri really is still beta, with some of her services, like Yelp and the location-based stuff, not being available to people outside of the USA (as the above screenshot shows). Apple is of course working to make them all available, but shipping something in this fashion is quite out of character for them.
So is Siri the next step in user interfaces? I don't think so. It's a great step forward for sure, and there will be people who make heavy use of it in their daily activities. However, once the novelty wears off and the witty responses run out, I don't see a compelling reason for people to continue using Siri. The lack of a developer API (and no mention of whether one will be available) means the services that can hook into Siri are limited to those Apple develops itself, so some really useful services might never be integrated, forcing users back to native apps. Depending on how many services are excluded, people may just find it easier not to use Siri at all, opting for the (usually quite good) native app experience instead. I could be proven wrong on this, especially with technology like Watson on the horizon, but for now Siri's more of a curiosity than anything else.