ENHANCING MEDIA PLAYBACK WITH SPEECH RECOGNITION

US 20100010814A1
Filed: 07/28/2008
Published: 01/14/2010
Est. Priority Date: 07/08/2008
Status: Active Grant

Abstract

A method for enhancing a media file to enable speech-recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
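
The four steps in the abstract can be illustrated with a minimal sketch. It assumes a grammar is nothing more than a normalized phrase to match against recognized speech; the source does not specify a grammar format or a recognition engine. The names `GrammarEntry`, `build_grammars`, and `locate`, the file name, and the sample items are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GrammarEntry:
    """One grammar plus the media location of its textual item (hypothetical type)."""
    phrase: str        # normalized phrase a recognizer would listen for
    timestamp_s: int   # offset into the media file, in seconds


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so spoken and stored phrases compare equal."""
    return " ".join(text.lower().split())


def build_grammars(items: List[Tuple[str, int]]) -> List[GrammarEntry]:
    """Generate one grammar per (textual item, time stamp) pair."""
    return [GrammarEntry(normalize(text), ts) for text, ts in items]


def locate(recognized: str, grammars: List[GrammarEntry]) -> Optional[int]:
    """Return the media offset for recognized speech, or None if nothing matches."""
    spoken = normalize(recognized)
    for entry in grammars:
        if entry.phrase == spoken:
            return entry.timestamp_s
    return None


# Assumed sample data: textual items describing events, with offsets in seconds.
items = [("Opening credits", 0), ("Car chase", 754), ("Final scene", 5400)]

# Associate the grammars with the media file (here, a plain dictionary).
enhanced_media = {"media_file": "movie.mpg", "grammars": build_grammars(items)}
```

In this sketch a player would pass the recognizer's text output to `locate` and seek to the returned offset; exact phrase matching stands in for whatever a real speech recognition engine would do with its grammars.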

18 Claims

1. A method for enhancing a media file to enable speech-recognition of spoken navigation commands, comprising:
- receiving a plurality of textual items based on subject matter of the media file;
- generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine;
- associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar; and
- associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
- Dependent Claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1, wherein the step of receiving a plurality of textual items further comprises:
    - receiving textual data provided by a computer via a data connection and generating a plurality of textual items based on the textual data.
  - 3. The method of claim 1, wherein the step of receiving a plurality of textual items further comprises:
    - receiving textual data provided by a user via a user input device and generating a plurality of textual items based on the textual data.
  - 4. The method of claim 3, wherein the step of receiving a plurality of textual items further comprises:
    - receiving textual data that has been organized into categories based on content of the textual items.
  - 5. The method of claim 1, further comprising:
    - storing the plurality of grammars and the media file on removable media.
  - 6. The method of claim 1, further comprising:
    - storing the plurality of grammars in a remote location on a network;
    - embedding a link in a media file to the remote location of the plurality of grammars; and
    - storing the media file on removable media.
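
Claims 5 and 6 describe two storage arrangements: grammars stored alongside the media file, or grammars stored remotely with a link embedded in the media file. A minimal sketch of the difference follows, assuming a JSON manifest as the container; the manifest layout, the function names, and the caller-supplied `fetch` function are hypothetical and do not come from the source.

```python
import json
from typing import Callable, Dict, List

# Assumed grammar shape, e.g. {"phrase": "car chase", "timestamp_s": 754}.
Grammar = Dict[str, object]


def manifest_with_grammars(media_file: str, grammars: List[Grammar]) -> str:
    """Claim 5 style: the grammars travel with the media file on the same medium."""
    return json.dumps({"media_file": media_file, "grammars": grammars})


def manifest_with_link(media_file: str, grammars_url: str) -> str:
    """Claim 6 style: the media file carries only a link to remotely stored grammars."""
    return json.dumps({"media_file": media_file, "grammars_link": grammars_url})


def load_grammars(manifest_json: str, fetch: Callable[[str], str]) -> List[Grammar]:
    """Return the grammars for a manifest, following the link when they are remote.

    `fetch` maps a URL to JSON text and is supplied by the caller, so the
    sketch does not depend on any particular network library.
    """
    manifest = json.loads(manifest_json)
    if "grammars" in manifest:
        return manifest["grammars"]
    return json.loads(fetch(manifest["grammars_link"]))
```

Passing the retrieval function in as a parameter keeps the two arrangements interchangeable from the player's point of view: it calls `load_grammars` either way.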

7. A computer program product comprising a computer usable medium embodying computer usable program code for enhancing a media file to enable speech-recognition of spoken navigation commands, comprising:
- computer usable program code for receiving a plurality of textual items based on subject matter of the media file;
- computer usable program code for generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine;
- computer usable program code for associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar; and
- computer usable program code for associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
- Dependent Claims (8, 9, 10, 11, 12)
  - 8. The computer program product of claim 7, wherein the computer usable program code for receiving a plurality of textual items further comprises:
    - computer usable program code for receiving textual data provided by a computer via a data connection and generating a plurality of textual items based on the textual data.
  - 9. The computer program product of claim 7, wherein the computer usable program code for receiving a plurality of textual items further comprises:
    - computer usable program code for receiving textual data provided by a user via a user input device and generating a plurality of textual items based on the textual data.
  - 10. The computer program product of claim 9, wherein the computer usable program code for receiving a plurality of textual items further comprises:
    - computer usable program code for receiving textual data that has been organized into categories based on content of the textual items.
  - 11. The computer program product of claim 7, further comprising:
    - computer usable program code for storing the plurality of grammars and the media file on removable media.
  - 12. The computer program product of claim 7, further comprising:
    - computer usable program code for storing the plurality of grammars in a remote location on a network;
    - computer usable program code for embedding a link in a media file to the remote location of the plurality of grammars; and
    - computer usable program code for storing the media file on removable media.

13. A computer system for enhancing a media file to enable speech-recognition of spoken navigation commands, comprising:
- a processor configured for:
  - receiving a plurality of textual items based on subject matter of the media file; and
  - generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine; and
- a repository for storing:
  - a grammar file including the plurality of grammars, wherein a time stamp is associated with each grammar, and wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar; and
  - a link for associating the grammar file with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
- Dependent Claims (14, 15, 16, 17, 18)
  - 14. The computer system of claim 13, wherein each textual item comprises text describing events of the media file.
  - 15. The computer system of claim 14, wherein the media file comprises a video file.
  - 16. The computer system of claim 15, wherein a time stamp includes an hour, minute and second indicator.
  - 17. The computer system of claim 13, further comprising removable media for storing the grammar file and the media file.
  - 18. The computer system of claim 13, further comprising removable media for storing the media file including a link to a remote location of the grammar file.
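
Claim 16 states only that a time stamp includes an hour, minute and second indicator. As a rough illustration, the sketch below assumes a textual `HH:MM:SS` form and converts it to and from a seek offset in seconds; the string format and the function names `to_seconds` and `to_stamp` are assumptions, not part of the source.

```python
def to_seconds(stamp: str) -> int:
    """Convert an assumed 'HH:MM:SS' time stamp to a seek offset in seconds."""
    # Unpacking raises ValueError if there are not exactly three fields.
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError(f"invalid time stamp: {stamp!r}")
    return hours * 3600 + minutes * 60 + seconds


def to_stamp(total_seconds: int) -> str:
    """Convert a non-negative offset in seconds back to 'HH:MM:SS'."""
    if total_seconds < 0:
        raise ValueError("offset must be non-negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

For example, `to_seconds("01:30:00")` gives `5400`, and `to_stamp(754)` gives `"00:12:34"`.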

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Patel, Paritosh D.
Granted Patent: US 8,478,592 B2
US Class Current: 704/257
CPC Class Codes:
- G10L 15/19   Grammatical context, e.g. d...
- G10L 15/22   Procedures used during a sp...
- G10L 2015/228   of application context
