System and method for generating and presenting multi-modal applications from intent-based markup scripts
First Claim
1. A method for presenting an application in a plurality of modalities, comprising the steps of:
retrieving a modality-independent document from one of local and remote storage, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
parsing the modality-independent document using parsing rules obtained from one of local or remote storage;
converting the modality-independent document to a first intermediate representation that can be rendered by a speech user interface modality;
converting the modality-independent document to a second intermediate representation that can be rendered by a GUI (graphical user interface) modality;
building a cross-reference table by which the speech user interface can access components comprising the second intermediate representation;
rendering the first and second intermediate representations in their respective modality; and
receiving a user input in one of the GUI and speech user interface modalities to enable multi-modal interaction and control the document presentation.
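The claimed pipeline — parse an intent-based document, derive one intermediate representation per modality, and link the two through a cross-reference table so the speech side can reach GUI components — can be sketched as follows. The element names (`form`, `field`, `label`), the two intermediate-representation shapes, and all function names are illustrative assumptions, not the patent's actual schema or API:

```python
import xml.etree.ElementTree as ET

# Hypothetical intent-based markup: it describes *what* the user supplies,
# not how it is rendered in any particular modality.
INTENT_DOC = """
<form id="login">
  <field id="user" label="User name"/>
  <field id="pin"  label="PIN code"/>
</form>
"""

def parse_intent(source):
    """Parse the modality-independent document (the claimed parsing step)."""
    return ET.fromstring(source)

def to_speech_ir(root):
    """First intermediate representation: prompts for a speech renderer."""
    return [{"id": f.get("id"), "prompt": f"Please say your {f.get('label')}"}
            for f in root.iter("field")]

def to_gui_ir(root):
    """Second intermediate representation: widgets for a GUI renderer."""
    return [{"id": f.get("id"), "widget": "text_input", "label": f.get("label")}
            for f in root.iter("field")]

def build_cross_reference(speech_ir, gui_ir):
    """Cross-reference table by which the speech side accesses GUI components."""
    gui_by_id = {c["id"]: c for c in gui_ir}
    return {s["id"]: gui_by_id[s["id"]] for s in speech_ir}

root = parse_intent(INTENT_DOC)
speech_ir, gui_ir = to_speech_ir(root), to_gui_ir(root)
xref = build_cross_reference(speech_ir, gui_ir)

# A spoken answer for "user" can now update the matching GUI widget directly,
# which is what enables the claimed multi-modal synchronization.
xref["user"]["value"] = "alice"
```

Because the cross-reference table holds references to the GUI components themselves (not copies), an update made through the speech modality is immediately visible to the GUI renderer.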
Abstract
Systems and methods are provided for rendering modality-independent scripts (e.g., intent-based markup scripts) in a multi-modal environment, whereby a user can interact with an application using a plurality of modalities (e.g., speech and GUI) with I/O events being automatically synchronized over the plurality of modalities presented. In one aspect, immediate synchronized rendering of the modality-independent document in each of the supported modalities is provided. In another aspect, deferred rendering and presentation of intent-based scripts to an end user is provided, wherein a speech markup language script (such as a VoiceXML document) is generated from the modality-independent script and rendered (via, e.g., VoiceXML browser) at a later time.
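In the deferred path the abstract describes, the intent-based script is compiled into a speech markup document (e.g., VoiceXML) for later playback by a VoiceXML browser. A minimal sketch of such a translation, assuming the same hypothetical intent schema as above (the function name and field-to-prompt mapping are illustrative, not the patent's method):

```python
import xml.etree.ElementTree as ET

def intent_to_voicexml(intent_source):
    """Compile an intent-based script into a minimal VoiceXML 2.0 document
    that a VoiceXML browser can render at a later time (deferred presentation)."""
    intent = ET.fromstring(intent_source)
    vxml = ET.Element("vxml", version="2.0")
    form = ET.SubElement(vxml, "form", id=intent.get("id", "main"))
    for f in intent.iter("field"):
        # Each modality-independent field becomes a VoiceXML <field> with a
        # synthesized prompt derived from its label.
        field = ET.SubElement(form, "field", name=f.get("id"))
        prompt = ET.SubElement(field, "prompt")
        prompt.text = f"Please say your {f.get('label')}."
    return ET.tostring(vxml, encoding="unicode")

vxml_doc = intent_to_voicexml(
    '<form id="login"><field id="user" label="user name"/></form>')
```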
120 Citations
31 Claims
1. (Independent claim; set forth in full as the First Claim above.) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
13. A method for providing global help information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules; and
presenting an aural description of the modality-independent document in response to the spoken request, wherein presenting an aural description of the modality-independent document comprises providing global help information by presenting document components, attributes, and methods of interaction. - View Dependent Claims (14)
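The grammar-based resolution that claims 13 through 21 share — rules that match a spoken request and produce an aural description from the document's internal representation — might look like the following sketch. The rule patterns, component store, and response wording are assumptions for illustration only:

```python
import re

def make_help_grammar(doc_components):
    """Build a grammar: each rule pairs a spoken-request pattern with a
    handler that produces the aural description to be synthesized."""
    def describe_document(_match):
        names = ", ".join(c["label"] for c in doc_components)
        return (f"This form contains {len(doc_components)} fields: {names}. "
                "You can say a field name to fill it in.")
    return [
        (re.compile(r"\b(help|what can i say)\b", re.I), describe_document),
    ]

def resolve_spoken_request(utterance, grammar):
    """Match the utterance against the grammar rules (the claimed
    processing step) and return the aural description, if any."""
    for pattern, handler in grammar:
        match = pattern.search(utterance)
        if match:
            return handler(match)
    return "Sorry, I did not understand the request."

components = [{"id": "user", "label": "user name"}, {"id": "pin", "label": "PIN"}]
grammar = make_help_grammar(components)
reply = resolve_spoken_request("help", grammar)
```

In a real system the returned string would be handed to a speech synthesizer; contextual help (claim 15) and feedback (claim 18) differ mainly in which part of the internal representation the handler reads.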
15. A method for providing contextual help information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules; and
presenting an aural description of the components, attributes, and methods of interaction of the modality-independent document in response to the spoken request to provide contextual help information. - View Dependent Claims (16, 17)
18. A method for providing feedback information when presenting a modality-independent document, the method comprising the steps of:
preparing an internal representation of the structure and component attributes of the modality-independent document, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request and resolving the spoken request utilizing the grammar rules;
obtaining state and value information regarding specified components of the document from the internal representation of the document; and
presenting an aural description of the content values associated with document components in response to the spoken request to provide feedback information. - View Dependent Claims (19, 20)
21. A method for aurally spelling out content values associated with components of a modality-independent document, the method comprising the steps of:
preparing an internal representation of a structure and component attributes of the modality-independent document, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
building a grammar comprising rules for resolving specific spoken requests;
processing a spoken request utilizing the grammar rules;
obtaining state and content value information regarding specified components of the document from the internal representation of the document; and
presenting each character of the content value information requested in response to the spoken request. - View Dependent Claims (22, 23, 24)
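The final presenting step of claim 21 emits one utterance per character of the requested content value. A minimal sketch, assuming a NATO-style phonetic table (which keeps aurally confusable characters distinct) and a simple dictionary as the component store — both assumptions, not the patent's data structures:

```python
# Illustrative phonetic table; characters without an entry fall back to
# the character itself.
PHONETIC = {"a": "alpha", "b": "bravo", "c": "charlie", "1": "one", "2": "two"}

def spell_out(component_values, component_id):
    """Present each character of a component's content value, one item per
    character, as claim 21's final step requires for 'spell' requests."""
    value = component_values[component_id]
    return [PHONETIC.get(ch.lower(), ch) for ch in value]

values = {"user": "abc1"}
spoken = spell_out(values, "user")   # one utterance per character
```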
25. A system for presenting an application in a plurality of modalities, comprising:
a multi-modal manager for parsing a modality-independent document to generate a traversal model that maps components of the modality-independent document to at least a first and second modality-specific representation, wherein the modality-independent document is an intent-based document that describes user interaction with the application separate from application content and presentation;
a speech user interface manager for rendering and presenting the first modality-specific representation in a speech modality;
a GUI (graphical user interface) manager for rendering and presenting the second modality-specific representation in a GUI modality;
an event queue monitor for detecting GUI events;
an event queue for storing captured GUI events; and
a plurality of methods, that are called by the speech user interface manager, for synchronizing I/O (input/output) events across the speech and GUI modalities. - View Dependent Claims (26, 27, 28, 29, 30, 31)
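Claim 25's synchronization machinery — an event queue monitor that captures GUI events, and methods called by the speech user interface manager to drain them — can be sketched with a standard queue. The class and method names are illustrative assumptions, not the patent's API:

```python
import queue

class EventQueueMonitor:
    """Detects GUI events and stores them in the event queue (claim 25)."""
    def __init__(self):
        self.events = queue.Queue()

    def on_gui_event(self, component_id, value):
        # Called by the GUI manager whenever the user edits a widget.
        self.events.put((component_id, value))

class SpeechUIManager:
    """Speech side: calls synchronization methods to mirror GUI state."""
    def __init__(self, monitor):
        self.monitor = monitor
        self.state = {}

    def synchronize(self):
        # Drain every captured GUI event so the speech dialog's view of the
        # document matches what is on screen.
        while True:
            try:
                component_id, value = self.monitor.events.get_nowait()
            except queue.Empty:
                break
            self.state[component_id] = value

monitor = EventQueueMonitor()
speech = SpeechUIManager(monitor)
monitor.on_gui_event("user", "alice")   # user types into the GUI
speech.synchronize()                     # speech side picks up the change
```

A thread-safe queue decouples the two modalities: the GUI can emit events at any rate, and the speech manager synchronizes at dialog-turn boundaries without blocking either side.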
Specification