
Voice Portals and VoiceXML, Part 1

Voice Portals and Voice Extensible Markup Language (VoiceXML) are valuable technologies that will soon have a big impact on the wireless industry. Voice Portals are advanced, voice-activated, natural-language interfaces that permit voice access to Internet-hosted content in much the same way that Web browsers and Wireless Application Protocol (WAP) microbrowsers do.

Voice Portals work on any Public Switched Telephone Network (PSTN) phone, without special services or software added to the device. To get an idea of the possibilities, call (800) 555-TELL, a free consumer Voice Portal, and ask for your area's traffic information. I use this feature often to avoid traffic jams on my way home from work.

Voice Portals might loosen the stranglehold of Computer Telephony Interfaces (CTI), the annoying chains of recorded prompts that many companies use to direct incoming calls: press 1 for "sales," 2 for "customer support," and so on. Most callers hate these interfaces because they must listen through every step to reach the information they want.

Voice Portals differ because they operate with nearly natural language. The caller simply names the desired division, such as "support" or "employee directory," and the application forwards the call appropriately. If the particular Voice Portal application handles Mixed Initiative Dialogs, you can interrupt the process to ask a question. The application recognizes your words and sends your call directly to the information you want.
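To make the routing idea concrete, here is a minimal Python sketch of what happens after the speech recognizer has turned the caller's words into text. The routing table, the extension names, and the `route_call` function are all hypothetical, and real portals rely on a recognition grammar rather than simple substring matching.

```python
from typing import Optional

# Hypothetical routing table: spoken keyword -> internal destination.
ROUTES = {
    "sales": "ext-100",
    "support": "ext-200",
    "employee directory": "ext-300",
}


def route_call(utterance: str) -> Optional[str]:
    """Return the destination for the first known keyword in the utterance."""
    text = utterance.lower()
    for keyword, destination in ROUTES.items():
        if keyword in text:
            return destination
    return None  # no match: a portal would typically re-prompt the caller


if __name__ == "__main__":
    print(route_call("I'd like customer support, please"))
```

Unlike a touch-tone menu, the caller never has to hear the full list of options first; any phrase containing a known keyword is enough to route the call.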

Many businesses and consumers could use Voice Portals for interactive transactional applications or information services. Voice Portals could be stand-alone applications or integrated with WAP and other types of wireless applications (e.g., call a Voice Portal from a WAP application).

Voice Portals can be developed with a number of technologies, some proprietary. VoiceXML, a language for specifying voice dialogs, lets you create Voice Portals using simple XML-based tags. Expect standardized VoiceXML specifications to become the foundation of Voice Portal technology. VoiceXML's primary functional areas include

  • Audio Prompts
  • Text to Speech (TTS)
  • Touch Tone Keys using dual-tone multifrequency (DTMF)
  • Automatic Speech Recognition (ASR)
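As a rough illustration of how these four areas can appear together, the Python sketch below embeds a small VoiceXML 1.0-style menu and parses it with the standard library to confirm it is well-formed. The audio file name and the example.com URLs are placeholders, and the markup is a sketch of the `<menu>`, `<prompt>`, `<audio>`, and `<choice>` elements as I understand them, not a tested production dialog.

```python
import xml.etree.ElementTree as ET

# Illustrative VoiceXML menu. File names and URLs are placeholders.
#   <audio>            -> recorded audio prompt
#   text in <prompt>   -> spoken by the text-to-speech engine
#   dtmf attribute     -> touch-tone key that selects the choice
#   text in <choice>   -> phrase the speech recognizer listens for
MENU_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <menu>
    <prompt>
      <audio src="welcome.wav"/>
      Say sales or support, or press 1 or 2.
    </prompt>
    <choice dtmf="1" next="http://example.com/sales.vxml">sales</choice>
    <choice dtmf="2" next="http://example.com/support.vxml">support</choice>
  </menu>
</vxml>
"""


def list_choices(vxml_text):
    """Parse the document and return (key, phrase, target) for each choice."""
    root = ET.fromstring(vxml_text)
    return [
        (choice.get("dtmf"), choice.text.strip(), choice.get("next"))
        for choice in root.iter("choice")
    ]


if __name__ == "__main__":
    for key, phrase, target in list_choices(MENU_VXML):
        print(f"press {key} or say '{phrase}' -> {target}")
```

Each `<choice>` accepts either input mode, so the same menu serves callers who prefer the keypad and callers who prefer to speak.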

VoiceXML content can be developed with technology and techniques similar to those used for regular Web sites. For example, a number of the VoiceXML applications we have developed are hosted on IIS and were developed with Visual Studio and Active Server Pages. The only difference is the markup language: the pages emit VoiceXML tags instead of HTML. The VoiceXML Forum, the governing body for VoiceXML specifications, released VoiceXML 1.0 in March 2000. The Forum has nearly 500 members, including founding members Motorola, AT&T, Lucent, and IBM.
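The same server-side pattern can be sketched in Python instead of Active Server Pages: a small WSGI application that builds a VoiceXML page on each request, just as a dynamic Web page would build HTML. The traffic data, the `area` query parameter, and the function names are invented for illustration, and the content type shown is an assumption about what a given voice gateway expects.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape

# Hypothetical data; a real service would query a traffic feed or database.
TRAFFIC = {
    "downtown": "Heavy delays on Main Street.",
    "airport": "Traffic is moving normally.",
}


def render_vxml(area):
    """Build a one-prompt VoiceXML page for the requested area."""
    report = TRAFFIC.get(area, "No report is available for that area.")
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="1.0">\n'
        "  <form>\n"
        "    <block><prompt>" + escape(report) + "</prompt></block>\n"
        "  </form>\n"
        "</vxml>\n"
    )


def app(environ, start_response):
    """WSGI entry point: read ?area=... and return VoiceXML instead of HTML."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    area = query.get("area", [""])[0]
    body = render_vxml(area).encode("utf-8")
    headers = [
        # Assumed content type; check what your voice gateway requires.
        ("Content-Type", "application/voicexml+xml"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any WSGI server can host `app`; for local experiments, `wsgiref.simple_server.make_server` from the standard library is enough. The voice browser, not the caller's phone, fetches the page over HTTP and reads the prompt aloud.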

The W3C, an organization dedicated to standardizing markup and voice-browser functionality, is cooperating with the VoiceXML Forum to develop VoiceXML 2.0. The W3C seeks to expand Web access so that people can interact with Web sites via spoken commands from any telephone. The group believes this accessibility will be a boon to people with visual impairments and to those who need hands-free Web access.

http://www.voicexml.org/
http://www.w3.org/Voice/

The next Wireless & Mobile UPDATE will include some code examples and more detail on the features of the Voice Browser, TTS, and ASR.
