Dynamically extending the speech prompts of a multimodal application

US 8,290,780 B2
Filed: 06/24/2009
Issued: 10/16/2012
Est. Priority Date: 06/24/2009
Status: Active Grant

Abstract

Dynamically extending the speech prompts of a multimodal application, including: receiving, by a prompt generation engine, a media file having a metadata container; retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and modifying, by the prompt generation engine, the multimodal application to include the speech prompt.
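
The three steps named in the abstract (receive, retrieve, modify) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `MediaFile` and `MultimodalApplication` classes, the `speech_prompt` metadata key, and the choice to model the metadata container as a plain dictionary do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PROMPT_KEY = "speech_prompt"  # hypothetical key name inside the metadata container


@dataclass
class MediaFile:
    """Hypothetical media file: content bytes plus a metadata container."""
    name: str
    content: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class MultimodalApplication:
    """Hypothetical application that stores one speech prompt per media file."""
    prompts: Dict[str, str] = field(default_factory=dict)


def retrieve_prompt(media: MediaFile) -> Optional[str]:
    """Step 2: look up a speech prompt in the file's metadata container."""
    return media.metadata.get(PROMPT_KEY)


def extend_prompts(app: MultimodalApplication, media: MediaFile) -> bool:
    """Steps 1-3: receive the file, retrieve its prompt, modify the application.

    Returns True when a prompt was found and added, False otherwise.
    """
    prompt = retrieve_prompt(media)
    if prompt is None:
        return False
    app.prompts[media.name] = prompt
    return True
```

In this sketch a file without a prompt in its metadata leaves the application unchanged, which keeps the extension step safe to run on every received file.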

18 Claims

1. A method of dynamically extending the speech prompts of a multimodal application, the method implemented with a prompt generation engine, a module of automated computing machinery operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, wherein the voice mode includes accepting speech input from a user, digitizing the speech, and providing digitized speech to a speech engine, and wherein the non-voice mode includes accepting input from a user through physical user interaction with a user input device for the multimodal device;
- wherein the multimodal device comprises a module of automated computing machinery for executing the multimodal application and supports execution of a media file player, a module of automated computing machinery for playing media files;
  
  the method comprising:
  
  receiving, by the prompt generation engine, a media file having a metadata container;
  
  retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and
  
  modifying, by the prompt generation engine, the multimodal application to include the speech prompt.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1 wherein retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprises retrieving a text string prompt for execution by a text to speech engine.
  - 3. The method of claim 1 wherein retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprises retrieving an audio prompt to be played by the multimodal device.
  - 4. The method of claim 1 wherein retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprises identifying a tag for prompts in the metadata container.
  - 5. The method of claim 4 wherein identifying a tag for prompts in the metadata container further comprises identifying a frame for prompts in an ID3 container of an MPEG media file.
  - 6. The method of claim 1 wherein modifying, by the prompt generation engine, the multimodal application to include the speech prompt further comprises updating a prompt document with the retrieved speech prompt.
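
Claims 4 and 5 describe identifying a frame for prompts in an ID3 container of an MPEG media file. The following is a minimal sketch of how such a lookup might work, not the patent's implementation. It assumes an ID3v2.3 tag with no header flags set, and it assumes the prompt is stored in a user-defined text frame (`TXXX`) whose description is `PROMPT`; the claims do not name a frame, so that convention and the helper names are hypothetical.

```python
import struct
from typing import Optional

PROMPT_DESCRIPTION = "PROMPT"  # hypothetical description marking the prompt frame


def synchsafe_to_int(b: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def find_prompt_frame(data: bytes) -> Optional[str]:
    """Return the text of a TXXX frame described as PROMPT, if present."""
    if len(data) < 10 or data[:3] != b"ID3" or data[3] != 3:
        return None  # this sketch handles ID3v2.3 tags only
    if data[5] != 0:
        return None  # unsynchronisation and extended headers are not handled
    end = min(10 + synchsafe_to_int(data[6:10]), len(data))
    pos = 10
    while pos + 10 <= end:
        frame_id = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        if frame_id == b"\x00\x00\x00\x00" or size == 0:
            break  # padding reached
        body = data[pos + 10:pos + 10 + size]
        # Encoding byte 0x00 means ISO-8859-1; other encodings are skipped here.
        if frame_id == b"TXXX" and body[:1] == b"\x00":
            description, _, value = body[1:].partition(b"\x00")
            if description.decode("latin-1") == PROMPT_DESCRIPTION:
                return value.rstrip(b"\x00").decode("latin-1")
        pos += 10 + size
    return None


def build_tag(description: str, text: str) -> bytes:
    """Build a tiny ID3v2.3 tag holding one TXXX frame (for demonstration)."""
    body = b"\x00" + description.encode("latin-1") + b"\x00" + text.encode("latin-1")
    frame = b"TXXX" + struct.pack(">I", len(body)) + b"\x00\x00" + body
    size = len(frame)
    synchsafe = bytes([(size >> 21) & 0x7F, (size >> 14) & 0x7F,
                       (size >> 7) & 0x7F, size & 0x7F])
    return b"ID3\x03\x00\x00" + synchsafe + frame
```

A caller could pass the leading bytes of an MPEG file to `find_prompt_frame` and treat a `None` result as "no prompt supplied". An audio prompt (claim 3) would need a different frame type and is outside this sketch.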

7. An apparatus for dynamically extending the speech prompts of a multimodal application, the apparatus including a prompt generation engine and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the apparatus comprising a computer processor and a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions for:
- receiving, by the prompt generation engine, a media file having a metadata container;
  
  retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and
  
  modifying, by the prompt generation engine, the multimodal application to include the speech prompt.
- Dependent claims (8, 9, 10, 11, 12)
  - 8. The apparatus of claim 7 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for retrieving a text string prompt for execution by a text to speech engine.
  - 9. The apparatus of claim 7 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for retrieving an audio prompt to be played by the multimodal device.
  - 10. The apparatus of claim 7 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for identifying a tag for prompts in the metadata container.
  - 11. The apparatus of claim 10 wherein computer program instructions for identifying a tag for prompts in the metadata container further comprise computer program instructions for identifying a frame for prompts in an ID3 container of an MPEG media file.
  - 12. The apparatus of claim 7 wherein computer program instructions for modifying, by the prompt generation engine, the multimodal application to include the speech prompt further comprise computer program instructions for updating a prompt document with the retrieved speech prompt.

13. A computer program product for dynamically extending the speech prompts of a multimodal application, the computer program product including a prompt generation engine for operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the computer program product disposed upon a computer-readable, recording medium, the computer program product comprising computer program instructions for:
- receiving, by the prompt generation engine, a media file having a metadata container;
  
  retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and
  
  modifying, by the prompt generation engine, the multimodal application to include the speech prompt.
- Dependent claims (14, 15, 16, 17, 18)
  - 14. The computer program product of claim 13 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for retrieving a text string prompt for execution by a text to speech engine.
  - 15. The computer program product of claim 13 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for retrieving an audio prompt to be played by the multimodal device.
  - 16. The computer program product of claim 13 wherein computer program instructions for retrieving, by the prompt generation engine, from the metadata container a speech prompt related to content stored in the media file for inclusion in the multimodal application further comprise computer program instructions for identifying a tag for prompts in the metadata container.
  - 17. The computer program product of claim 16 wherein computer program instructions for identifying a tag for prompts in the metadata container further comprise computer program instructions for identifying a frame for prompts in an ID3 container of an MPEG media file.
  - 18. The computer program product of claim 13 wherein computer program instructions for modifying, by the prompt generation engine, the multimodal application to include the speech prompt further comprise computer program instructions for updating a prompt document with the retrieved speech prompt.
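
Claims 6, 12, and 18 recite updating a prompt document with the retrieved speech prompt, and claims 2 and 3 (with their apparatus and program-product counterparts) distinguish a text string prompt from an audio prompt. One way this could look is sketched below. The document layout (a `<prompts>` root holding `<prompt id="...">` elements, with an `<audio src="...">` child for audio prompts) is a hypothetical format chosen for illustration; the claims do not specify one.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def update_prompt_document(xml_text: str, prompt_id: str,
                           text: Optional[str] = None,
                           audio_src: Optional[str] = None) -> str:
    """Add or replace one prompt in a (hypothetical) XML prompt document.

    Pass `text` for a text string prompt meant for a text-to-speech engine,
    or `audio_src` for an audio prompt to be played back directly.
    """
    if text is None and audio_src is None:
        raise ValueError("either text or audio_src is required")
    root = ET.fromstring(xml_text)
    # Drop any earlier prompt with the same id so the update is idempotent.
    for old in root.findall("prompt"):
        if old.get("id") == prompt_id:
            root.remove(old)
    prompt = ET.SubElement(root, "prompt", {"id": prompt_id})
    if audio_src is not None:
        ET.SubElement(prompt, "audio", {"src": audio_src})
    else:
        prompt.text = text
    return ET.tostring(root, encoding="unicode")
```

For example, `update_prompt_document("<prompts/>", "song1", text="Say play")` returns a document containing a single text prompt, and a later call with the same id and an `audio_src` replaces it instead of adding a second entry.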

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Agapi, Ciprian; Bodin, William K.; Cross, Charles W. Jr.
Primary Examiner(s): Abebe, Daniel D
Application Number: US12/490,443
Publication Number: US 20100332234A1
Time in Patent Office: 1,210 Days
Field of Search: 704/270, 704/275, 715/728
US Class Current: 704/275
CPC Class Codes:
- G10L 13/00   Speech synthesis; Text to s...
- G10L 15/22   Procedures used during a sp...
- H04M 2201/40   using speech recognition
- H04M 3/42204   Arrangements at the exchang...
