“Ok Google”: What Are the Problems with Speech Recognition Technology?

Let’s face it: only people in senior positions can afford a personal assistant. The rest of us make do with speech recognition software. Speech recognition technology has been on the rise in recent years, and more and more people use it to dictate text and control their computers.

Although this seemed unrealistic several years ago, it is now standard in nearly every new gadget. However, speech recognition technology is still being developed and has a few issues to work through. Several technological hurdles remain, including safety concerns and difficulty understanding different accents and dialects, especially when there is background noise.

Here is what you need to know about this software, from its history and the way it works to the issues it faces every day.

A Brief History of Voice Recognition Technology

Voice recognition technology has been developing over the past five decades. Back in 1976, computers could understand only about 1,000 words; by the 1980s, the number had jumped to 20,000. It was in that period that IBM started developing voice recognition technology. DragonDictate, the first consumer speech recognition product, was released in 1990 by Dragon Systems. The first product that could recognize continuous speech was launched in 1996 by IBM.

In the second half of the 2000s, when smartphones first appeared, Google launched its Voice Search app for the iPhone. Siri, a prominent voice assistant, was launched by Apple in 2011. Several other products have been introduced over the years, including Microsoft Cortana, Google Voice, and Amazon Alexa.


What you probably didn’t know is that, by some accounts, the first recorded attempt at speech recognition dates back to around 1000 A.D. Many centuries later, in 1952, Bell Laboratories developed a system called “Audrey” that could recognize digits spoken by a single voice. As you can see, the history of speech recognition technology has been a long and ongoing one.

How Does Speech Recognition Technology Work?

You might think that speech recognition technology works in a simple way, and we all tend to take it for granted. In fact, the process is quite complicated. To understand it better, compare speech recognition technology to a child who has just started learning a language. The technology works in a similar way and goes through the same struggle as a child trying to pick up new verbal patterns. It requires plenty of manpower, research, and innovation, because there are thousands of languages and dialects to which speech recognition technology has to adapt. Perfecting these systems will take a lot more work and effort.

Speech recognition software uses four main approaches to turn spoken language into written words:

  • Pattern and feature analysis (each word you pronounce is broken into smaller pieces);
  • Simple pattern matching (each word you pronounce is recognized as a whole);
  • Artificial neural networks (models which can recognize patterns after being trained);
  • Language modelling and statistical analysis (guessing the most likely words based on known vocabulary and grammar).
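
To make the last approach a little more concrete, here is a minimal sketch of how statistical language modelling can help choose between two transcripts that sound alike. It assumes a tiny, made-up training corpus and a simple bigram (word pair) model with add-one smoothing; the names CORPUS, train_bigrams, log_score, and pick_transcript are hypothetical and exist only for this illustration. Real systems use far larger models and combine them with acoustic scores.

```python
from collections import Counter
import math

# Tiny, made-up training corpus (an assumption for illustration only).
CORPUS = [
    "it is hard to recognize speech",
    "computers recognize speech with models",
    "we walked on a nice beach",
    "speech is hard to recognize",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_score(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Add-one smoothing keeps unseen word pairs from scoring zero.
        probability = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(probability)
    return total


def pick_transcript(candidates, unigrams, bigrams):
    """Return the candidate transcript the model considers most likely."""
    return max(candidates, key=lambda s: log_score(s, unigrams, bigrams))


UNIGRAMS, BIGRAMS = train_bigrams(CORPUS)
candidates = ["hard to recognize speech", "hard to wreck a nice beach"]
best = pick_transcript(candidates, UNIGRAMS, BIGRAMS)
print(best)
```

With this toy corpus the model prefers “hard to recognize speech”, because those word pairs appear in the training sentences while “to wreck” and “wreck a” never do. Note that this simple scoring also favors shorter sentences, which is one of many details a real system has to correct for.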

Problems with Speech Recognition Technology

Over the years, speech recognition technology and voice user interfaces have improved significantly, and they now make errors only about 5.5 percent of the time, which is close to the human error rate. Still, because the technology has become commonplace in everyday life, its remaining shortcomings are easy to notice. These are some of the most common problems associated with speech recognition technology.
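
Error rates like the one above are commonly reported as a word error rate: the number of substituted, deleted, and inserted words divided by the number of words actually spoken. The article’s figure does not say exactly how it was measured, so treat the following as a sketch of the usual calculation rather than the method behind that number. The function name word_error_rate and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,                      # word missing from transcript
                cur[j - 1] + 1,                   # extra word in transcript
                prev[j - 1] + substitution_cost,  # match or wrong word
            ))
        prev = cur
    return prev[-1] / len(ref)


spoken = "turn on the kitchen lights"
transcribed = "turn on the chicken lights"
print(word_error_rate(spoken, transcribed))
```

Here one of the five spoken words was transcribed incorrectly, so the word error rate is one in five, or 20 percent.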


Voice Complexities

One of the most common issues with speech recognition technology is the complexity of the human voice. The software doesn’t place a comma unless the speaker says “comma,” so punctuation is often missing. Misspelt and misrecognized words are also common, especially because people’s voices change when they speak complex sentences.

Accents, Speed, and Speech

We are all different: some of us speak quite slowly, while others speak incredibly fast. Understanding people’s accents and speech can be an arduous task for both humans and technology, and this is another issue that speech recognition technology faces. It can only transcribe words that it clearly understands.

Background Noise Interference

Background noise is another problem for voice recognition software, as it can affect the quality of your dictation. Because the software lets you record your ideas hands-free, you must be careful about where you use it. Noise in the background reduces the software’s ability to recognize speech patterns.
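
One common way to put a number on background noise is the signal-to-noise ratio (SNR), which compares the loudness of the speech to the loudness of the noise in decibels. The sketch below is only an illustration: the helper names rms and snr_db are hypothetical, and the sample values are invented rather than taken from a real recording.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels; higher means cleaner speech."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Invented sample values: speech at amplitude 0.5, noise at 0.05 or 0.25.
speech = [0.5, -0.5, 0.5, -0.5]
quiet_room = [0.05, -0.05, 0.05, -0.05]
busy_cafe = [0.25, -0.25, 0.25, -0.25]

print(round(snr_db(speech, quiet_room), 1))  # roughly 20 dB
print(round(snr_db(speech, busy_cafe), 1))   # roughly 6 dB
```

The lower the SNR, the more the speech is buried in noise, which is why dictating in a quiet room tends to produce better results than dictating in a busy cafe.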

Effectiveness and Time Costs

You might not realize it at first, but dictating to speech recognition software can be really time-consuming because you need to review the text and correct mistakes afterwards. Moreover, you need to speak at the right speed and in the right tone to reduce errors, and learning the proper pace takes plenty of time. What’s more, using speech recognition technology frequently over a long period can affect your natural speech and flow.


Safety and Privacy

Many people assume that this software is constantly listening to them. However, this isn’t true: these devices and apps stay in a passive listening mode and are activated only when their “wake-up words” are said. One safety concern is that Amazon reportedly allowed third-party app developers to access transcripts of audio recordings. The company claims that this could only be done with users’ consent.

Google Home, on the other hand, has already given such access. A similar concern is the reach of law enforcement, although recordings can only be obtained in response to a subpoena. Moreover, using speech recognition technology might make you a target for hackers, who can attack anything connected to the internet.
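
The passive listening mode described above can be pictured as a gate: nothing is kept or acted on until a wake-up word is heard. The following is a deliberately simplified sketch that works on already transcribed text, whereas real devices detect the wake-up word in the audio itself. The function name commands_after_wake_word and the list of wake-up words are assumptions made for this example.

```python
# Example wake-up words, lowercase (an assumption for this sketch).
WAKE_WORDS = ("ok google", "alexa")


def commands_after_wake_word(utterances, wake_words=WAKE_WORDS):
    """Keep only the text that follows a wake-up word; drop everything else."""
    captured = []
    for utterance in utterances:
        text = utterance.lower().strip()
        for wake in wake_words:
            if text.startswith(wake):
                # Remove the wake-up word and any comma or space after it.
                command = text[len(wake):].strip(" ,")
                if command:
                    captured.append(command)
                break
        # Utterances without a wake-up word are ignored entirely.
    return captured


heard = [
    "what a nice day",
    "Ok Google, set a timer",
    "alexa play jazz",
    "my password is 1234",
]
print(commands_after_wake_word(heard))
```

In this sketch only “set a timer” and “play jazz” are passed on; the other two sentences never leave the gate.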


Conclusion

Despite its rapid development, voice recognition technology still experiences various problems. Some of the most common issues with this software are the difficulty of handling different accents, speaking speeds, and voice complexities. Moreover, the technology faces information security and privacy challenges. Each of these technological hurdles needs to be solved as soon as possible, since the success of voice recognition technology depends on it. One thing is for sure, however: none of the companies with voice recognition software will stop pushing their products with voice user interfaces (VUIs). After all, money makes the world go round.

Ann is a passionate writer who enjoys creating texts, articles, blogs and anything that has to do with writing. She is a lover of new things, especially new technology and challenges. Her writing has appeared in Paperwritten and WritersDepartment, among others.