Voice browser dialog enabler for a communication system

US 20040138890A1
Filed: 01/09/2003
Published: 07/15/2004
Est. Priority Date: 01/09/2003
Status: Active Grant

Abstract

A voice browser dialog enabler for multimodal dialog uses a multimodal markup document with fields having markup-based forms associated with each field and defining fragments. A voice browser driver resides on a communication device and provides the fragments and identifiers that identify the fragments. A voice browser implementation resides on a remote voice server, receives the fragments from the driver, and downloads a plurality of speech grammars. Input speech is matched against those speech grammars associated with the corresponding identifiers received in a recognition request from the voice browser driver.

20 Claims

1. A voice browser dialog enabler for a communication system, the browser enabler comprising:
- a speech recognition application comprising a plurality of units of application interaction, wherein each unit has associated voice dialog forms defining fragments;
  
  a voice browser driver, the voice browser driver resident on a communication device;
  
  the voice browser driver providing the fragments from the application and generating identifiers that identify the fragments; and
  
  a voice browser implementation resident on a remote voice server, the voice browser implementation receiving the fragments from the voice browser driver and downloading a plurality of speech grammars, wherein subsequent input speech is matched against those speech grammars associated with the corresponding identifiers received in a speech recognition request from the voice browser driver.
  - 2. The voice browser enabler of claim 1, wherein the speech recognition request and subsequent speech recognition results are markup-based.
  - 3. The voice browser enabler of claim 1, wherein the fragments consist of a VoiceXML page of identified forms.
  - 4. The voice browser enabler of claim 3, wherein the fragments are cached by the voice browser implementation.
  - 5. The voice browser enabler of claim 1, wherein the speech recognition application is a multimodal browser that processes multimodal markup documents, and the voice browser driver is resident in a voice browser stub that operates on a multimodal markup document to split the multimodal markup document into a displayable markup portion and a voice markup portion, and wherein the voice browser driver and voice browser implementation are operable on the voice markup portion.
  - 6. The voice browser enabler of claim 5, further comprising an Internet application server with web server containing the multimodal markup document and the speech grammars.
  - 7. The voice browser enabler of claim 5, further comprising a visual browser in the communication device that is operable on both the displayable markup portion and a voice markup portion of the multimodal markup document.
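
The division of labor in claims 1 to 7 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the class names, the `frag-N` identifier scheme, and the `GRAMMAR_STORE` dictionary that stands in for grammar download. Real speech recognition is replaced by plain text comparison, so the sketch shows only how identifiers tie input to grammars.

```python
from dataclasses import dataclass

# Assumption: a dictionary stands in for grammars fetched from a server.
GRAMMAR_STORE = {
    "grammars/city": {"chicago", "boston"},
    "grammars/date": {"today", "tomorrow"},
}


@dataclass
class Fragment:
    """One voice dialog form, tagged with a driver-generated identifier."""
    fragment_id: str
    form_markup: str
    grammar_url: str


class VoiceBrowserDriver:
    """Device-side part: provides fragments and generates their identifiers."""

    def __init__(self):
        self._counter = 0

    def make_fragment(self, form_markup, grammar_url):
        self._counter += 1
        return Fragment(f"frag-{self._counter}", form_markup, grammar_url)


class VoiceBrowserImplementation:
    """Server-side part: keeps grammars per identifier and matches input."""

    def __init__(self, grammar_store):
        self._grammar_store = grammar_store
        self._grammars = {}

    def receive_fragments(self, fragments):
        # "Download" each fragment's grammar once and keep it by identifier.
        for frag in fragments:
            self._grammars[frag.fragment_id] = self._grammar_store[frag.grammar_url]

    def recognize(self, fragment_id, utterance):
        """Return the matched phrase, or None when it is out of grammar."""
        phrase = utterance.strip().lower()
        return phrase if phrase in self._grammars.get(fragment_id, set()) else None


driver = VoiceBrowserDriver()
fragments = [
    driver.make_fragment("<form id='city'/>", "grammars/city"),
    driver.make_fragment("<form id='date'/>", "grammars/date"),
]
server = VoiceBrowserImplementation(GRAMMAR_STORE)
server.receive_fragments(fragments)
```

In this sketch a later recognition request carries only the identifier, so the server consults just the grammar tied to that fragment rather than every grammar it holds.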

8. A voice browser for multimodal dialog in a communication system, the browser comprising:
- a multimodal markup document split into a displayable markup portion and a voice markup portion comprising fields, wherein the fields have associated forms defining fragments of the document page;
  
  a voice browser stub including a voice browser driver portion of a voice browser, the voice browser driver resident on a communication device;
  
  the voice browser stub generating the fragments and the voice browser driver generating identifiers that identify the fragments; and
  
  a voice browser implementation portion of the voice browser resident on a remote voice server, the voice browser implementation downloading the fragments from the voice browser stub and downloading a plurality of speech grammars, wherein subsequent input speech is matched against those speech grammars associated with the corresponding identifiers received in a speech recognition request from the voice browser driver.
  - 9. The voice browser of claim 8, wherein the fragments are VoiceXML forms.
  - 10. The voice browser of claim 8, wherein the speech recognition request and subsequent speech recognition results are markup-based.
  - 11. The voice browser of claim 8, further comprising an Internet application server with web server containing the multimodal markup document and the speech grammars.
  - 12. The voice browser of claim 8, further comprising a visual browser in the communication device that is operable on both the displayable markup portion and a voice markup portion of the multimodal markup document.
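
The split recited in claim 8 can be sketched as follows. The document format is an assumption made for illustration only: a hypothetical `<multimodal>` root whose `<field>` elements each carry a `<label>` (displayable) and a `<form>` (voice). The claims do not prescribe this layout, and using the field name as the fragment identifier is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical multimodal markup; element and attribute names are assumed.
DOC = """
<multimodal>
  <field name="city">
    <label>City</label>
    <form grammar="grammars/city"><prompt>Which city?</prompt></form>
  </field>
  <field name="date">
    <label>Date</label>
    <form grammar="grammars/date"><prompt>Which date?</prompt></form>
  </field>
</multimodal>
"""


def split_document(source):
    """Split the document into a displayable portion and a voice portion."""
    root = ET.fromstring(source)
    display = ET.Element("display")
    voice = ET.Element("voice")
    for fld in root.findall("field"):
        name = fld.get("name")
        label = fld.find("label")
        form = fld.find("form")
        if label is not None:
            item = ET.SubElement(display, "input", {"name": name})
            item.text = label.text
        if form is not None:
            # Each form becomes one fragment, identified by its field name.
            frag = ET.SubElement(
                voice, "form", {"id": name, "grammar": form.get("grammar", "")}
            )
            frag.extend(list(form))
    return display, voice


display, voice = split_document(DOC)
```

Here `display` would go to the visual browser and `voice` to the voice browser driver and implementation, with the shared field name linking the two portions.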

13. A method for enabling dialog with a voice browser for a communication system, the method comprising the steps of:
- providing a voice browser driver resident on a communication device and a voice browser implementation containing a plurality of speech grammars resident on a remote voice server;
  
  running a speech recognition application comprising a plurality of units of application interaction, wherein each unit has associated voice dialog forms defining fragments;
  
  defining identifiers associated with each fragment;
  
  supplying the fragments to the voice browser implementation;
  
  focusing on a field in one of the units of application interaction;
  
  sending a speech recognition request including the identifier of the form associated with the focused field from the voice browser driver to the voice browser implementation;
  
  inputting and recognizing speech;
  
  matching the speech to the acceptable speech grammar associated with the identifier; and
  
  obtaining speech recognition results.
  - 14. The method of claim 13, wherein the speech recognition request of the sending step and the speech recognition results of the obtaining steps are markup-based.
  - 15. The method of claim 13, wherein the supplying step includes supplying the voice browser implementation with a VoiceXML page of identified forms.
  - 16. The method of claim 13, wherein the providing step includes the voice browser driver incorporated with a synchronizer into a voice browser stub that interfaces with the voice browser implementation and a visual browser on the communication device.
  - 17. The method of claim 13, wherein the synchronizer is operable to enable an input from an Internet server when speech is detected.
  - 18. The method of claim 13, wherein the running step includes downloading a multimodal markup document as the speech recognition application document.
  - 19. The method of claim 18, wherein after the running step further comprising the step of splitting the multimodal markup document into a displayable markup portion and a voice markup portion containing the units of interaction, and wherein the subsequent steps are operable for only the voice markup portion of the document.
  - 20. The method of claim 19, wherein the providing step includes providing a visual browser in the communication device that is operable on both the displayable markup portion and a voice markup portion of the multimodal markup document.
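
The sending, matching, and obtaining steps of claim 13, with the markup-based messages of claim 14, can be sketched as a round trip. The `<recognize>` and `<result>` elements, their attributes, and the `GRAMMARS` table are hypothetical names chosen for this example; the claims do not define a message format. Speech input is again simulated with a text string.

```python
import xml.etree.ElementTree as ET

# Assumption: server-side table mapping form identifiers to accepted phrases.
GRAMMARS = {
    "city": {"chicago", "boston"},
    "date": {"today", "tomorrow"},
}


def build_request(form_id):
    """Driver side: request recognition for the form of the focused field."""
    req = ET.Element("recognize", {"form": form_id})
    return ET.tostring(req, encoding="unicode")


def handle_request(request_markup, utterance):
    """Server side: match the utterance against the identified form's grammar."""
    form_id = ET.fromstring(request_markup).get("form") or ""
    phrase = utterance.strip().lower()
    matched = phrase in GRAMMARS.get(form_id, set())
    result = ET.Element(
        "result", {"form": form_id, "status": "match" if matched else "nomatch"}
    )
    if matched:
        result.text = phrase
    return ET.tostring(result, encoding="unicode")


def parse_result(result_markup):
    """Driver side: read the form identifier, status, and recognized text."""
    node = ET.fromstring(result_markup)
    return node.get("form"), node.get("status"), node.text


# Focus moves to the "city" field, then the user speaks.
reply = handle_request(build_request("city"), "Boston")
form_id, status, text = parse_result(reply)
```

Because the request names a single form, the same utterance can match in one field and fail in another, which is the behavior the identifier in the sending step is meant to provide.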

Current Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Randolph, Mark; Vogedes, Jerome; Pearce, Michael; Ferrans, James; Engelsma, Jonathan

Granted Patent: US 7,003,464 B2
US Class Current: 704/270.1
CPC Class Codes

G10L 15/26   Speech to text systems G10L...

H04M 1/72445   for supporting Internet bro...

H04M 2207/40   terminals with audio html b...

H04M 2250/74   with voice recognition means

H04M 3/4938   comprising a voice browser ...
