Speech Recognition Gets Intuitive, Finds New Life In Mobile Applications


Mark David

February 16, 2006


Speech recognition has been one of those technologies that's perpetually "just around the corner." But after a recent IBM press event showcasing advances in speech technology and a spate of impressive applications, I'm convinced the age of speech recognition has dawned at last.

For years, IBM has pursued dictation applications. I tried IBM's ViaVoice for the PC some 10 years ago to dictate a magazine column. While it was fun and fairly efficient, there were too many MPMs (manglings per minute) to make it practical.

A decade later, IBM is still chasing the dictation holy grail. In 2001, the company says, speech input lagged human keyboarding by a factor of 10. Now, the company predicts we'll have machines that transcribe better than people by the end of the decade.

But all the while IBM has chased dictation's receding goal posts, the company also has been effectively re-channeling its speech expertise into "command recognition." These interactive phone-based applications have successfully moved speech recognition from the realm of sci-fi to an automation technique giving offshore call centers a run for their rupees. Tens of thousands of telephony and call-center applications now use speech input/output.

The proliferation of cell phones, PDAs, and other portable data devices has created mass-market opportunities for speech recognition and response systems. Bluetooth wireless headsets invite mobile users to multitask, naturally driving demand for hands-free application control or query and response. Automotive telematics create another opportunity for speech, enabling drivers to keep their eyes on the road while commanding an ever-expanding array of in-car electronics systems.

I was impressed by the demonstrations of IBM's Embedded ViaVoice 4.4 and its new freeform command recognition capabilities. The technology eliminates the need for users to memorize predefined control terms. Instead, it uses statistical language modeling and semantic interpretation to accept intuitive command phrases.

In contrast, my cell phone requires me to say, "call someone" to voice-activate the dialing function. If I forget the command and say "make a call" or "place call," it typically screws up and starts voice message playback. Then I have to take my eyes off the road, pick up the phone, perhaps swerve dangerously, and utter some choice freeform commands of my own.
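To make the contrast concrete, here is a minimal sketch of how a statistical approach can accept phrasings it was never explicitly given. It is a toy unigram model with add-one smoothing, not IBM's method, which the article does not detail. The intent names, training phrases, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training phrases per intent, invented for illustration only.
TRAINING = {
    "dial": ["call someone", "make a call", "place call", "dial a number"],
    "voicemail": ["play messages", "check voice mail", "listen to messages"],
}

def train(examples):
    """Count word frequencies per intent (a simple unigram model)."""
    counts = {
        intent: Counter(word for phrase in phrases for word in phrase.split())
        for intent, phrases in examples.items()
    }
    vocab = {word for counter in counts.values() for word in counter}
    return counts, vocab

def classify(phrase, counts, vocab):
    """Return the intent whose unigram model scores the phrase highest."""
    best_intent, best_score = None, -math.inf
    for intent, counter in counts.items():
        total = sum(counter.values())
        # Add-one smoothing keeps unseen words from zeroing out the score.
        score = sum(
            math.log((counter[word] + 1) / (total + len(vocab) + 1))
            for word in phrase.lower().split()
        )
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

counts, vocab = train(TRAINING)
print(classify("please place a call for me", counts, vocab))
```

Because the score is built from word statistics rather than an exact phrase match, an utterance padded with extra words can still land on the dialing intent. A production recognizer would use far richer models and real training data.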

At least in the confines of the demo room, the freeform commands worked as advertised. Demos included integrated car audio, phone dialing, and navigation system control. Context recognition was announced for an XM Satellite Radio hands-free interface, based on Embedded ViaVoice integrated into VoiceBox Navigator from VoiceBox Technologies.

VoiceBox offers intelligent searches by determining the context of a user's requests, whether searching for music or asking for driving instructions. If I tried changing a radio station by saying "change station," the system might ask me to specify an FM frequency, a type of music, or whether I wanted to scan the available stations.
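The clarifying behavior described above can be sketched as a simple slot check: act when the request contains enough detail, otherwise ask. This is an illustrative assumption about the general technique, not VoiceBox's implementation. The genre list, action names, and function name are hypothetical.

```python
import re

# Hypothetical genre list; a real system would draw on a much larger catalog.
GENRES = {"jazz", "rock", "classical", "country"}

def handle_station_request(utterance):
    """Return an (action, detail) pair, or ask for clarification."""
    text = utterance.lower()
    # An explicit frequency such as "101.5" resolves the request directly.
    freq = re.search(r"\b(\d{2,3}\.\d)\b", text)
    if freq:
        return ("tune", freq.group(1))
    words = set(re.findall(r"[a-z]+", text))
    genre = words & GENRES
    if genre:
        return ("play_genre", sorted(genre)[0])
    if "scan" in words:
        return ("scan", None)
    # Nothing specific was said, so ask instead of guessing.
    return ("ask", "Which FM frequency or type of music, or should I scan?")

print(handle_station_request("change station"))
print(handle_station_request("change station to 101.5"))
```

The design choice is to prefer a follow-up question over a wrong guess, which matters when the driver cannot easily look at a screen to correct the system.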

Beyond the vehicle, IBM sees the natural opportunity to make speech a "multimodal" option for grabbing "info on demand" on mobile devices. Several excellent demonstrations showed how speech could be used to efficiently fill data fields in PDA-enabled applications, like insurance claims inspection and mobile stock trading.

WIRELESS AT WAKE FOREST
With today's college students coming of age with the cell phone, it's not surprising that the coolest applications were presented by Anne Bishop, director of Information Systems R&D at Wake Forest University. Named the "most wired" liberal arts university by Yahoo, Wake Forest has worked with IBM since 1995 as a "ThinkPad" school.

With 95% of its students carrying cell phones, Bishop says it makes sense for the university to integrate mobile technology into college life. So, Wake Forest offers PocketPC-powered smart phones and many services to enhance both academics and campus living.

The first applications that incorporate speech recognition involve the campus shuttle system and dorm laundry. For more about the unwired college lifestyle, go to www.electronicdesign.com and see Drill Deeper 12024.

See Associated Figure
