Audio Synchronization For Document Narration with User-Selected Playback

US 20110288861A1
Filed: 05/18/2010
Published: 11/24/2011
Est. Priority Date: 05/18/2010
Status: Active Grant

8 Assignments

0 Petitions

Abstract

Disclosed are techniques and systems to provide a narration of a text. In some aspects, the techniques and systems described herein include generating a timing file that includes elapsed time information for expected portions of text, where the elapsed time information provides an elapsed time period from a reference time in an audio recording to each portion of text in recognized portions of text.

15 Claims

1. A computer implemented method comprising:
- applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
  
  determining by the one or more computer systems an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  comparing by the one or more computer systems a recognized portion of text to an expected portion of text;
  
  generating by the one or more computer systems a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
  
  receiving from a user an indication of a user-selected portion of text;
  
  determining by the one or more computers an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  providing an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
  - 2. The method of claim 1 wherein generating the timing file comprises:
    - storing the elapsed time information for a recognized portion of text in the timing file if the recognized portion of text matches the corresponding expected portion of text; and
      
      computing elapsed time information for an expected portion of text and storing the computed elapsed time information into the timing file if the recognized portion of text does not match the corresponding expected portion of text.
  - 3. The method of claim 2 wherein the recognized portions or expected portions of text comprise words.
  - 4. The method of claim 3 wherein computing the elapsed time information further comprises:
    - determining an elapsed time period for each expected text portion, further comprising:
      
      determining by one or more computer systems the number of syllables or phonemes in an expected word that is part of the expected portion of text;
      
      determining by the one or more computer systems the corresponding recognized portion that is associated with that same number of syllables or phonemes in the expected word;
      
      determining by the one or more computer systems an elapsed time for the corresponding recognized portion, and
      
      storing the determined elapsed time to a timing file that is stored on a computer-readable storage medium.
  - 5. The method of claim 2 wherein computing further comprises:
    - determining the elapsed time for an expected portion of text based on a metric associated with an expected length of time to speak the expected portion of text.
  - 6. The method of claim 1 wherein providing an audible output comprises providing audio beginning with a first word in the user-selected portion of text and continuing until the end of the document.
  - 7. The method of claim 1 wherein providing an audible output comprises providing audio corresponding to the user-selected portion of text.
  - 8. The method of claim 7, further comprising ceasing providing the audio output upon reaching a last word in the user-selected portion of text.
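
The timing-file steps of claims 1, 2 and 5 can be illustrated with a minimal sketch. It is not the patentee's implementation. The function names, the JSON layout of the timing file, the use of `difflib.SequenceMatcher` for the comparison step, and the use of letter count as the speaking-time metric are all assumptions made for illustration.

```python
import json
from difflib import SequenceMatcher


def build_timing(expected, recognized, audio_end):
    """Return one elapsed start time (seconds) per expected word.

    expected   -- list of non-empty words from the document text
    recognized -- list of (word, start_seconds) pairs from a recognizer
    audio_end  -- length of the recording in seconds
    """
    exp = [w.lower() for w in expected]
    rec = [w.lower() for w, _ in recognized]
    times = [None] * len(exp)

    # Claim 2, first branch: where recognized and expected words match,
    # keep the recognizer's elapsed time.
    matcher = SequenceMatcher(a=exp, b=rec, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.a + k] = recognized[block.b + k][1]

    # Claim 2, second branch: compute times for unmatched expected words.
    # Assumption: letter count approximates speaking time (cf. claim 5).
    i = 0
    while i < len(exp):
        if times[i] is not None:
            i += 1
            continue
        j = i
        while j < len(exp) and times[j] is None:
            j += 1
        first = i - 1 if i > 0 else i  # timed word just before the gap, if any
        t0 = times[first] if i > 0 else 0.0
        t1 = times[j] if j < len(exp) else audio_end
        total = sum(len(w) for w in exp[first:j])
        spoken = sum(len(w) for w in exp[first:i])
        for k in range(i, j):
            times[k] = t0 + (t1 - t0) * spoken / total
            spoken += len(exp[k])
        i = j
    return times


def save_timing(path, expected, times):
    """Write the timing file as JSON (a hypothetical format)."""
    entries = [{"word": w, "start": t} for w, t in zip(expected, times)]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f)


def seek_time(path, word_index):
    """Look up the elapsed time for a user-selected word (claim 1)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)[word_index]["start"]
```

A player would pass the value returned by `seek_time` to its audio backend as the seek position. An unmatched expected word receives a time interpolated between its timed neighbors, weighted by the letter counts of the words that share that span.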

9. A computer implemented method comprising:
- applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
  
  providing an audible output corresponding to the audio recording;
  
  displaying, on a user interface rendered on a display device, an expected portion of text that corresponds to the words in the audio recording, the displayed expected portion of text including at least a portion of the expected portion of text that is currently being provided on the audible output; and
  
  providing visual indicia for the displayed text that corresponds to:
  
  the audio that is currently being provided on the audible output, if the recognized portion of text matches the corresponding expected portion of text; and
  
  otherwise one or more portions of text which does not match the recognized portion of text, if the recognized portion of text does not match the corresponding expected portion of text.
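
One possible reading of the highlighting rule in claim 9 is sketched below. The function name, the per-word `matched` flags, and the choice to highlight the whole contiguous run of unmatched expected words are assumptions. The claim itself only requires indicia on "one or more portions of text" that do not match.

```python
def indicia_for(current, matched):
    """Return indices of expected words to mark during playback.

    current -- index of the expected word aligned with the audible audio
    matched -- list of booleans, True where the recognized word matched
               the expected word at that index
    """
    if matched[current]:
        # Recognized text agrees with the document: mark the current word.
        return [current]
    # Otherwise mark the contiguous run of unmatched expected words
    # around the current position (an illustrative choice).
    lo = hi = current
    while lo > 0 and not matched[lo - 1]:
        lo -= 1
    while hi + 1 < len(matched) and not matched[hi + 1]:
        hi += 1
    return list(range(lo, hi + 1))
```

With `matched = [True, False, False, True]`, position 0 yields only that word, while position 1 yields the two-word unmatched run.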

10. A computer implemented method comprising:
- applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
  
  comparing by the one or more computer systems the recognized portion of text to an expected portion of text;
  
  providing an audible output corresponding to the audio recording;
  
  determining by the one or more computer systems a recognized portion of text corresponding to a currently audible portion of the audio recording;
  
  displaying an expected portion of text on a user interface rendered on a display device such that the displayed expected portion of text includes at least an expected portion of text previous to the determined currently audible portion of the audio recording;
  
  providing visual indicia for the displayed expected portion of text that corresponds to the expected text portion that is previous to the currently audible portion of the audio recording, if the recognized portion of text is in addition to and not included in the expected portion of text.
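
Claim 10 covers the case where the narrator speaks words that are not in the expected text. A minimal sketch of that behavior follows. It assumes a hypothetical alignment list that maps each recognized word to an expected-word index, or to `None` when the word was inserted by the narrator.

```python
def highlight_indices(rec_to_exp):
    """For each recognized word, return the expected word to mark.

    rec_to_exp -- one entry per recognized word: the index of the aligned
                  expected word, or None for a word not in the document
    """
    out = []
    last = None  # most recent expected word that was actually narrated
    for exp_index in rec_to_exp:
        if exp_index is not None:
            last = exp_index
        # For an inserted word, keep marking the previous expected word.
        out.append(last)
    return out
```

While the inserted words are audible, the marker stays on the preceding expected word instead of disappearing or jumping ahead.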

11. A computer implemented method comprising:
- applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized words;
  
  determining the linguistic units of one or more recognized words;
  
  computing a timing for each determined linguistic unit;
  
  determining the linguistic units of one or more words in an expected portion of text;
  
  associating linguistic units in the one or more words in the expected portion of text with linguistic units in the recognized words; and
  
  computing a timing for one or more linguistic units in the one or more words in the expected portion of text based on the timing of one or more corresponding determined linguistic units of the one or more recognized words.
  - 12. The method of claim 11 wherein one or both of determining the linguistic units of the one or more recognized words and determining the linguistic units of the one or more words in the expected portion of text comprises referencing information associated with the linguistic units of words.
  - 13. The method of claim 11 wherein computing the timing for the one or more linguistic units of the one or more recognized words comprises:
    - referencing information associated with the relative timing of linguistic units and using the determined elapsed time for each recognized word.
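
The linguistic-unit approach of claims 11 to 13 can be sketched with syllables as the unit. This is an illustration under stated assumptions. The vowel-group syllable counter stands in for the reference information of claim 12, and each recognized word's duration is spread evenly over its syllables as a stand-in for the relative timing information of claim 13. All function names are hypothetical.

```python
import re


def count_syllables(word):
    """Crude vowel-group count; a real system would use a lexicon."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def unit_times(recognized):
    """Start time of every syllable in the recognized words.

    recognized -- list of (word, start_seconds, end_seconds)
    """
    units = []
    for word, start, end in recognized:
        n = count_syllables(word)
        step = (end - start) / n  # assumption: equal-length syllables
        units.extend(start + k * step for k in range(n))
    return units


def expected_word_times(expected, recognized):
    """Start time of each expected word, taken from the recognized
    syllable that occupies the same position in the syllable stream."""
    units = unit_times(recognized)
    if not units:
        return [None] * len(expected)
    out, pos = [], 0
    for word in expected:
        # Clamp so surplus expected syllables reuse the last known time.
        out.append(units[min(pos, len(units) - 1)])
        pos += count_syllables(word)
    return out
```

If the recognizer hears "into day" where the document reads "in today", the word boundaries differ but the syllable streams line up, so "today" takes the start time of the second recognized syllable.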

14. A computer program product residing on a computer readable medium, the computer program product comprising instructions for causing a processor to:
- apply speech recognition to an audio recording to generate a text version of recognized portions of text;
  
  determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  compare a recognized portion of text to an expected portion of text;
  
  generate a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
  
  receive an indication of a user-selected portion of text;
  
  determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.

15. A system comprising:
- a memory; and
  
  a computing device configured to:
  
  apply speech recognition to an audio recording to generate a text version of recognized portions of text;
  
  determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  compare a recognized portion of text to an expected portion of text;
  
  generate a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
  
  receive an indication of a user-selected portion of text;
  
  determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.

Current Assignee: T Play Holdings LLC (FIG LLC (d/b/a Fortress Investment Group LLC))
Original Assignee: K-NFB Reading Technology Inc.
Inventors: Chapman, Peter; Gibson, Lucy; Albrecht, Paul; Kurzweil, Raymond C.

Granted Patent: US 8,392,186 B2
US Class Current: 704/235
CPC Class Codes:
- G09B 5/06   with both visual and audibl...
- G09B 5/062   Combinations of audio and p...
- G10L 15/26   Speech to text systems G10L...
