Just Speak Naturally
What's That You Said?
This exchange of information sounds simple, but it's actually the complex product of more than 30 years of research in statistics, physics, linguistics, and computer science. What's so complicated about asking for a phone number, and what does it mean for your future? The idea of talking to machines isn't new. Characters in science fiction stories have conversed with robots and computers for a long time. You may have shared a few words yourself with a computer, car, or cellular phone, especially when they were not working as you expected.
But nowadays, these machines may understand what you are saying and can respond, all thanks to recent breakthroughs in the field of speech recognition. As a result, you can do the following and more:
You Make the Call
Recognizing Speech
Before the system can recognize what you are saying, it converts the individual sounds into digitized sound waves, which it matches against a built-in dictionary. With almost 40 phonemes in the English language, there are millions of possible ways these sounds could be combined. The title of this story looks different in the form of a sound wave. The wave takes its shape from the spoken words.
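To get a feel for where "millions of possibilities" comes from, here is a rough back-of-the-envelope sketch in Python. It assumes an inventory of exactly 40 phonemes and pretends any phoneme can follow any other, which real English does not allow, so treat the numbers as an upper bound for illustration only. The name `sequence_count` is made up for this example.

```python
# Count how many phoneme strings of a given length are possible.
# Assumptions (not from the article): exactly 40 phonemes, and no
# restrictions on which phoneme may follow which.
PHONEME_COUNT = 40


def sequence_count(length: int, inventory: int = PHONEME_COUNT) -> int:
    """Number of distinct phoneme sequences of the given length."""
    return inventory ** length


# Print the count for sequences of one to five phonemes.
for n in range(1, 6):
    print(f"{n} phonemes: {sequence_count(n):,} possible sequences")
```

Under these assumptions, sequences just four phonemes long already number 2,560,000, which is why the system needs a way to rule out unlikely combinations quickly.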
The speech recognition system figures out the correct choice through a series of algorithms, or mathematical models, that help narrow down the possibilities to ones that make the most sense. These algorithms also take grammar into account: For instance, if you say "I am going to the beach," the system will know that the subject "I" will take the verb "am" rather than "are" and that the preposition "to" will likely be used if you are "going" somewhere.
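One simple way to picture how grammar narrows the choices is a word-pair (bigram) model, which scores each candidate sentence by how often its neighboring words appear together. The Python sketch below is only an illustration of that idea, not the method any particular product uses. The counts in `BIGRAM_COUNTS` are invented, and the names `log_prob` and `best_candidate` are hypothetical.

```python
import math
from collections import Counter

# Invented word-pair counts, standing in for statistics a real system
# would gather from a large body of text. "<s>" marks a sentence start.
BIGRAM_COUNTS = Counter({
    ("<s>", "i"): 50,
    ("i", "am"): 40,
    ("am", "going"): 20,
    ("are", "going"): 20,
    ("going", "to"): 30,
    ("to", "the"): 25,
    ("two", "the"): 1,
    ("the", "beach"): 5,
})

# Every word seen in the table, and how often each word starts a pair.
VOCAB = {word for pair in BIGRAM_COUNTS for word in pair}
CONTEXT_COUNTS = Counter()
for (prev, _), count in BIGRAM_COUNTS.items():
    CONTEXT_COUNTS[prev] += count


def log_prob(sentence: str, k: float = 0.1) -> float:
    """Log-probability of a sentence under an add-k smoothed bigram model."""
    words = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        # Smoothing keeps unseen pairs unlikely rather than impossible.
        numerator = BIGRAM_COUNTS[(prev, word)] + k
        denominator = CONTEXT_COUNTS[prev] + k * len(VOCAB)
        total += math.log(numerator / denominator)
    return total


def best_candidate(candidates: list[str]) -> str:
    """Pick the candidate sentence with the highest score."""
    return max(candidates, key=log_prob)


candidates = [
    "I are going to the beach",
    "I am going two the beach",
    "I am going to the beach",
]
print(best_candidate(candidates))
```

With these made-up counts, "I am" and "going to" are common pairs while "I are" and "going two" never occur, so the grammatical sentence scores highest even though the three candidates sound nearly alike.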
A Slow Start
The same comparison holds for the development of the speech recognition field. Only three years ago, anyone dictating to a computer had to speak s-l-o-w-l-y and in short phrases punctuated by long pauses. The results that appeared on screen were often more comical than accurate. Were you saying, "I scream" or "ice cream," for example? Other speech recognition programs, such as the ones a telephone user might encounter, were famous for their limited options: "Press or say '1'" or "Say 'yes' if you wish to continue."
Quantum Leaps
"If you have an algorithm that takes one minute to recognize a sentence that is five seconds long, it's not that useful. You need to recognize words in real time," he says. And he notes that new algorithms are able to better recognize words, grammar, and the beginnings and endings of sentences.
He adds that speech recognition systems can now handle thousands of words from speakers of different accents, genders, and ages. When these people say the same word, the way it sounds will differ. The system uses millions of additional instructions to recognize these differences. "In the past, you had simple applications that understood just a few words. Now you can say, 'I'm interested in the weather tomorrow,' " Bohrer explains. "To make a speech recognition system speaker-independent for thousands of words, the system must learn huge amounts of data. There are thousands of variations repeated for each phoneme, so the system will work whether you have a New Orleans or a Boston accent."
One speech recognition company has developed a program that recognizes the voice and speech patterns of computer users ages 11 and up. Way cool!
Brave New World
Software that automatically translates one spoken language into another will help bridge different cultures. Other programs understand the complexities of medical and legal terminology. Bohrer says that over the next five years, we'll start giving commands to handheld personal organizers or cellular phones, especially since these devices will connect to the Internet. But he also warns about the present limits to this emerging technology. So far, he says, machines have needed to know the context of what the speaker is talking about. If you are calling an automated weather or stock quote service, the software on the other end is prepared to talk weather and stocks and not much else. "Whenever you have automatic systems, you can always fool them," Bohrer observes. "Don't expect them to behave like human beings or to use real intelligence. A system set up for travel information expects you to say 'I would like to travel on Monday from Zurich to Vienna.' If you say 'Please order me a pizza,' there will be funny things coming out."