System and method for generating and presenting multi-modal applications from intent-based markup scripts
Abstract
Systems and methods are provided for rendering modality-independent scripts (e.g., intent-based markup scripts) in a multi-modal environment and, in particular, for providing a multi-modal user interface to an application, whereby the user can interact with the application through a plurality of modalities (e.g., speech and GUI). The multi-modal interface automatically synchronizes I/O events across the presented modalities. In one aspect, the system provides immediate, synchronized rendering of the modality-independent document in each of the supported modalities. In another aspect, the system provides deferred rendering and presentation of intent-based scripts to an end user: the system comprises a transcoder that generates a speech markup language script (such as a VoiceXML document) from the modality-independent script, which is then rendered (via, e.g., a VoiceXML browser) at a later time.
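The deferred-rendering aspect of the abstract can be illustrated with a minimal transcoder sketch. This is not the patent's implementation: the input format, element names, and mapping below are assumptions chosen only to show the idea of deriving a VoiceXML-style field list from a modality-independent form description.

```python
# Illustrative sketch, not the patented transcoder: map each <input>
# of a hypothetical modality-independent form to a VoiceXML-like
# <field> with a spoken prompt, for rendering at a later time.
import xml.etree.ElementTree as ET

INTENT_DOC = """<form>
  <input name="city" label="Destination city"/>
  <input name="date" label="Travel date"/>
</form>"""

def transcode_to_voicexml(intent_xml: str) -> str:
    """Derive a speech markup script from the modality-independent doc."""
    intent = ET.fromstring(intent_xml)
    vxml = ET.Element("vxml", version="2.1")
    form = ET.SubElement(vxml, "form")
    for node in intent.iter("input"):
        field = ET.SubElement(form, "field", name=node.get("name"))
        prompt = ET.SubElement(field, "prompt")
        prompt.text = node.get("label")
    return ET.tostring(vxml, encoding="unicode")

vxml_doc = transcode_to_voicexml(INTENT_DOC)
```

A real transcoder would also emit grammars and event handlers; the point here is only the one-to-many mapping from a single intent-based source to a modality-specific script.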
112 Citations
33 Claims
1. A method for presenting an application in a plurality of modalities, comprising the steps of:
retrieving a modality-independent document from one of local and remote storage;
parsing the modality-independent document using parsing rules obtained from one of local or remote storage;
converting the modality-independent document to a first intermediate representation that can be rendered by a speech user interface modality;
converting the modality-independent document to a second intermediate representation that can be rendered by a GUI (graphical user interface) modality;
building a cross-reference table by which the speech user interface can access components comprising the second intermediate representation;
rendering the first and second intermediate representations in their respective modality; and
receiving a user input in one of the GUI and speech user interface modalities to enable multi-modal interaction and control the document presentation. (Dependent claims 2-13.)
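The steps of claim 1 can be sketched in a few lines. All names and the document format below are illustrative assumptions, not the patent's code: one modality-independent document is parsed once, two intermediate representations are derived, and a cross-reference table lets the speech side locate the matching GUI component.

```python
# Hedged sketch of the claim-1 flow: parse a modality-independent
# document, build a speech IR and a GUI IR, and a cross-reference
# table keyed by component id into the GUI representation.
import xml.etree.ElementTree as ET

DOC = """<dialog>
  <field id="name" prompt="What is your name?" label="Name"/>
  <field id="age"  prompt="How old are you?"  label="Age"/>
</dialog>"""

def build_representations(doc_xml: str):
    root = ET.fromstring(doc_xml)
    speech_ir, gui_ir, xref = [], [], {}
    for i, node in enumerate(root.iter("field")):
        speech_ir.append({"id": node.get("id"), "prompt": node.get("prompt")})
        gui_ir.append({"id": node.get("id"), "label": node.get("label")})
        # The speech UI reaches the matching GUI component through xref.
        xref[node.get("id")] = i  # index into gui_ir
    return speech_ir, gui_ir, xref

speech_ir, gui_ir, xref = build_representations(DOC)
```

Rendering each IR in its own modality and routing user input back through the shared component ids is what makes the interaction multi-modal.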
14. A method for providing global help information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules; and
presenting an aural description of the modality-independent document in response to the spoken request. (Dependent claims 15-16.)
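Claim 14's global-help flow can be sketched as follows. The grammar rules, component list, and phrasing are my own illustrative assumptions: a rule set resolves a spoken request, and the internal representation supplies an aural overview of the document.

```python
# Sketch of global help (claim 14), with an invented grammar: resolve
# a spoken request against simple rules, then describe the document's
# structure aurally.
import re

COMPONENTS = [("name", "text field"), ("submit", "button")]

GRAMMAR = [
    (re.compile(r"\b(help|what can i say)\b", re.IGNORECASE), "overview"),
]

def resolve(utterance: str):
    for pattern, intent in GRAMMAR:
        if pattern.search(utterance):
            return intent
    return None

def aural_overview() -> str:
    parts = [f"a {kind} named {name}" for name, kind in COMPONENTS]
    return "This form contains " + " and ".join(parts) + "."

intent = resolve("Help, please")
description = aural_overview() if intent == "overview" else ""
```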
17. A method for providing contextual help information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules; and
presenting an aural description of the components, attributes, and methods of interaction of the modality-independent document in response to the spoken request. (Dependent claims 18-19.)
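Contextual help (claim 17) differs from global help in that the request names a specific component, and the answer covers that component's attributes and methods of interaction. The grammar, data model, and wording below are illustrative assumptions, not the patent's.

```python
# Sketch of contextual help (claim 17): a grammar rule captures which
# component was asked about; the internal representation supplies its
# attributes and how to interact with it.
import re

INTERNAL_REP = {
    "age": {"type": "spin box", "range": "0 to 120",
            "how": "say a number, or say 'up' or 'down'"},
}

REQUEST = re.compile(r"how do i use (the )?(?P<comp>\w+)", re.IGNORECASE)

def contextual_help(utterance: str) -> str:
    m = REQUEST.search(utterance)
    if not m or m.group("comp") not in INTERNAL_REP:
        return "Sorry, I don't know that component."
    c = INTERNAL_REP[m.group("comp")]
    return f"{m.group('comp')} is a {c['type']}, {c['range']}; {c['how']}."

reply = contextual_help("How do I use the age")
```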
20. A method for providing feedback information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of the structure and component attributes of the modality-independent document;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request and resolving the spoken request utilizing the grammar rules;
obtaining state and value information regarding specified components of the document from the internal representation of the document; and
presenting an aural description of the content values associated with document components in response to the spoken request. (Dependent claims 21-22.)
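Claim 20's feedback steps add a state-and-value lookup to the help pattern. The query grammar and state model below are invented for illustration: the spoken request is resolved, the component's state and value are fetched from the internal representation, and the answer is phrased aurally.

```python
# Sketch of feedback (claim 20), with an assumed state model: answer
# "what did I enter for X" from the internal representation's state
# and value information.
import re

STATE = {"city": {"filled": True, "value": "Oslo"},
         "date": {"filled": False, "value": None}}

QUERY = re.compile(r"what (did i enter|is) (for |in )?(?P<comp>\w+)",
                   re.IGNORECASE)

def feedback(utterance: str) -> str:
    m = QUERY.search(utterance)
    if not m or m.group("comp") not in STATE:
        return "I could not find that field."
    s = STATE[m.group("comp")]
    if not s["filled"]:
        return f"{m.group('comp')} is still empty."
    return f"{m.group('comp')} contains {s['value']}."

answer = feedback("What did I enter for city")
```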
23. A method for aurally spelling out content values associated with components of a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules;
obtaining state and content value information regarding specified components of the document from the internal representation of the document; and
presenting each character of the content value information requested in response to the spoken request. (Dependent claims 24-26.)
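The spell-out step of claim 23 reduces to fetching the stored value and emitting it one character at a time, as a speech renderer would speak it. The value store below is an illustrative assumption.

```python
# Sketch of aural spell-out (claim 23): return the requested value as
# a sequence of single characters for the speech renderer to speak.
VALUES = {"confirmation": "XK42"}

def spell_out(component: str):
    value = VALUES.get(component, "")
    # One utterance per character; letters and digits are spoken alone.
    return [ch for ch in value]

letters = spell_out("confirmation")
```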
27. A system for presenting an application in a plurality of modalities, comprising:
a multi-modal manager for parsing a modality-independent document to generate a traversal model that maps components of the modality-independent document to at least a first and second modality-specific representation;
a speech user interface manager for rendering and presenting the first modality-specific representation in a speech modality;
a GUI (graphical user interface) manager for rendering and presenting the second modality-specific representation in a GUI modality;
an event queue monitor for detecting GUI events;
an event queue for storing captured GUI events; and
a plurality of methods, that are called by the speech user interface manager, for synchronizing I/O (input/output) events across the speech and GUI modalities. (Dependent claims 28-33.)
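The system of claim 27 can be sketched architecturally. The class names and event schema are mine, not the patent's: an event queue stores GUI events captured by the monitor, and the speech UI manager drains it to keep both modalities synchronized on the same component.

```python
# Architectural sketch of claim 27 (names invented): the event queue
# monitor pushes captured GUI events; the speech UI manager drains the
# queue and mirrors each event in the speech modality.
from collections import deque

class EventQueue:
    def __init__(self):
        self._q = deque()

    def push(self, event):   # called by the event queue monitor
        self._q.append(event)

    def drain(self):         # called by the speech UI manager
        while self._q:
            yield self._q.popleft()

class SpeechUIManager:
    def __init__(self):
        self.focus = None

    def synchronize(self, queue):
        # Mirror each captured GUI event so speech focus tracks GUI focus.
        for event in queue.drain():
            if event["type"] == "focus":
                self.focus = event["component"]

queue = EventQueue()
queue.push({"type": "focus", "component": "city"})
speech = SpeechUIManager()
speech.synchronize(queue)
```

Decoupling capture (monitor) from consumption (speech manager) through the queue is what lets I/O events arrive from either modality without blocking the other.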
Specification