Microsoft's Speech Recognition Technology More Human-Like Than Ever

Microsoft has announced that its speech recognition technology has reached human parity. In other words, the system now transcribes speech about as accurately as professional human transcribers do.

The word error rate of Microsoft's speech recognition tech is now down to 5.9 percent, according to the company's researchers. That figure puts the tech roughly on par with the professional transcribers who participated in the tests. The transcribers and the system were asked to transcribe the same recordings, and their error rates were close to each other.
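
For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only; it is not Microsoft's evaluation code, and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), not Microsoft's scoring tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one wrong word out of four reference words.
print(word_error_rate("turn the lights off", "turn the light off"))
```

By this definition, a rate of 5.9 percent works out to roughly 59 word errors for every 1,000 words in the reference transcript.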

Their findings led Xuedong Huang, the company's chief speech scientist, to declare that the speech recognition system has "reached human parity". He further described the feat as "a historic achievement".

According to The Verge, the tech uses neural language models. These models group similar words together, which allows the system to generalize more efficiently.

Microsoft's Speech and Dialog research group did admit that the technology is still far from being able to understand semantics or to be aware of context. They also indicated that the real test for the speech recognition tech is understanding conversations in real-life situations. This will likely include the ability to recognize facial expressions and to understand different languages. It also needs to work well when faced with a wider selection of voices.

Microsoft is planning to use this technology in its personal voice assistant, Cortana. The company's first foray into speech recognition involved Skype. Since the release of Skype Translator in 2015, users have been able to talk with people who speak a different language: Skype Translator recognizes what is said and translates it into the other person's language.

In describing the work, Microsoft AI Research head Harry Shum said, "We are moving away from a world where people must understand computers to a world in which computers must understand us." Shum also said, as MacRumors reported, that "true artificial intelligence is still on the distant horizon."
