Audio synchronization for document narration with user-selected playback

US 8,392,186 B2
Filed: 05/18/2010
Issued: 03/05/2013
Est. Priority Date: 05/18/2010
Status: Expired due to Fees

Abstract

Disclosed are techniques and systems to provide a narration of a text. In some aspects, the techniques and systems described herein include generating a timing file that includes elapsed time information for expected portions of text. This information provides an elapsed time period from a reference time in an audio recording to each portion of text in recognized portions of text.
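
To make the idea of a timing file concrete, the following is a minimal sketch, not the patented implementation. It assumes a hypothetical recognizer output shaped as `(word, absolute_time_in_seconds)` pairs and a JSON layout with `index`, `text`, and `elapsed` fields; none of these names come from the patent.

```python
import json


def build_timing_file(recognized, reference_time=0.0):
    """Return one timing entry per recognized word.

    `recognized` is a list of (word, absolute_time) pairs (hypothetical shape).
    Each entry stores the elapsed time from `reference_time` to that word.
    """
    return [
        {"index": i, "text": word, "elapsed": round(t - reference_time, 3)}
        for i, (word, t) in enumerate(recognized)
    ]


# Illustrative data only: the recording's reference time is 12.0 seconds.
recognized = [("the", 12.0), ("quick", 12.4), ("fox", 12.9)]
timing = build_timing_file(recognized, reference_time=12.0)

# Serialize so the entries could be written to a storage device.
timing_json = json.dumps(timing, indent=2)
```

Storing elapsed times relative to a single reference time means a player only needs one offset per portion of text to seek into the recording.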

16 Citations

18 Claims

1. A computer implemented method comprising:
- applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
  
  determining by the one or more computer systems an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  comparing by the one or more computer systems a recognized portion of text to an expected portion of text;
  
  determining by the one or more computer systems a number of syllables or phonemes in a sequence of expected words that are part of the expected portion of text;
  
  determining by the one or more computer systems a corresponding recognized portion comprising a sequence of recognized words, the sequence of expected words and sequence of recognized words having a same number of syllables or phonemes and a different number of words;
  
  determining by the one or more computer systems an elapsed time for the corresponding recognized portion;
  
  storing the determined elapsed time in a timing file that is stored on a computer-readable storage device, the timing file further comprising the elapsed time information for each expected portion of text;
  
  receiving from a user an indication of a user-selected portion of text;
  
  determining by the one or more computers an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  providing an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1 wherein generating the timing file comprises:
    - storing the elapsed time information for a recognized portion of text in the timing file if the recognized portion of text matches the corresponding expected portion of text; and
      
      storing the determined elapsed time information for recognized words having the same number of syllables or phonemes and the different number of words into the timing file when the recognized portion of text does not match the corresponding expected portion of text.
  - 3. The method of claim 2 wherein the recognized portions or expected portions of text comprise words.
  - 4. The method of claim 2 wherein determining the elapsed time for the corresponding recognized portion further comprises:
    - determining the elapsed time for an expected portion of text based on a metric associated with an expected length of time to speak the expected portion of text.
  - 5. The method of claim 1 wherein providing an audible output comprises providing audio beginning with a first word in the user-selected portion of text and continuing until the end of the document.
  - 6. The method of claim 1 wherein providing an audible output comprises providing audio corresponding to the user-selected portion of text.
  - 7. The method of claim 6, further comprising ceasing providing the audio output upon reaching a last word in the user-selected portion of text.
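
The syllable-matching step in claims 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the patented algorithm: `count_syllables` is a naive vowel-run heuristic, `align` is a hypothetical helper, and the recognizer output is assumed to be `(word, elapsed_seconds)` pairs. When an expected word does not match the recognized word, the sketch grows both spans until their syllable totals agree, then stores the elapsed time of the first recognized word in the span.

```python
import re


def count_syllables(word):
    # Naive heuristic (an assumption): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def align(expected, recognized):
    """Map expected text to elapsed times.

    expected: list of words; recognized: list of (word, elapsed) pairs.
    Returns a list of (expected_text, elapsed) pairs.
    """
    timing = []
    i = j = 0
    while i < len(expected) and j < len(recognized):
        word, elapsed = recognized[j]
        if expected[i].lower() == word.lower():
            # Exact match: store the recognized elapsed time directly.
            timing.append((expected[i], elapsed))
            i, j = i + 1, j + 1
            continue
        # Mismatch: extend whichever span has fewer syllables.
        ei, rj = i + 1, j + 1
        e_syl = count_syllables(expected[i])
        r_syl = count_syllables(word)
        while e_syl != r_syl:
            if e_syl < r_syl and ei < len(expected):
                e_syl += count_syllables(expected[ei])
                ei += 1
            elif r_syl < e_syl and rj < len(recognized):
                r_syl += count_syllables(recognized[rj][0])
                rj += 1
            else:
                break  # ran out of words; leave this span untimed
        if e_syl == r_syl:
            # Spans agree on syllables even if word counts differ.
            timing.append((" ".join(expected[i:ei]), elapsed))
        i, j = ei, rj
    return timing


# "forever" (one word) was recognized as "for" + "ever" (two words).
expected = ["reading", "forever", "now"]
recognized = [("reading", 0.0), ("for", 0.5), ("ever", 0.7), ("now", 1.2)]
aligned = align(expected, recognized)
```

Here the expected word "forever" and the recognized sequence "for ever" both total three syllables under the heuristic, so "forever" receives the elapsed time 0.5 even though the word counts differ.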

8. A computer program product tangibly stored on a computer readable storage device, the computer program product comprising instructions for causing a processor to:
- apply speech recognition to an audio recording to generate a text version of recognized portions of text;
  
  determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  compare a recognized portion of text to an expected portion of text;
  
  determine a number of syllables or phonemes in a sequence of expected words that are part of the expected portion of text;
  
  determine a corresponding recognized portion comprising a sequence of recognized words, the sequence of expected words and sequence of recognized words having a same number of syllables or phonemes and a different number of words;
  
  determine an elapsed time for the corresponding recognized portion;
  
  store the determined elapsed time in a timing file that is stored on a computer-readable storage device, the timing file further comprising the elapsed time information for each expected portion of text;
  
  receive an indication of a user-selected portion of text;
  
  determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
- Dependent claims (9, 10, 11, 12, 13):
  - 9. The computer program product of claim 8 wherein instructions to generate the timing file comprises instructions to:
    - store the elapsed time information for a recognized portion of text in the timing file if the recognized portion of text matches the corresponding expected portion of text; and
      
      store the determined elapsed time information for recognized words having the same number of syllables or phonemes and the different number of words into the timing file when the recognized portion of text does not match the corresponding expected portion of text.
  - 10. The computer program product of claim 9 wherein instructions to determine the elapsed time for the corresponding recognized portion further comprises instructions to:
    - determine the elapsed time for an expected portion of text based on a metric associated with an expected length of time to speak the expected portion of text.
  - 11. The computer program product of claim 9 wherein instructions to provide an audible output comprises instructions to provide audio corresponding to the user-selected portion of text.
  - 12. The computer program product of claim 9 wherein instructions to provide an audible output comprises instructions to cease providing the audio output upon reaching a last word in the user-selected portion of text.
  - 13. The computer program product of claim 8 further comprising instructions to:
    - determine the elapsed time for an expected portion of text based on a metric associated with an expected length of time to speak the expected portion of text.
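
Claims 10 and 13 refer to "a metric associated with an expected length of time to speak" without fixing what that metric is. The sketch below assumes, purely for illustration, that the metric is a syllable count and that the start and end times of a span are already known; `estimate_times` and `count_syllables` are hypothetical names.

```python
import re


def count_syllables(word):
    # Naive heuristic (an assumption): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def estimate_times(words, span_start, span_end):
    """Spread a known time span across words in proportion to syllables.

    Returns (word, estimated_elapsed) pairs, where each estimate is the
    time at which the word is expected to begin.
    """
    weights = [count_syllables(w) for w in words]
    total = sum(weights)
    duration = span_end - span_start
    estimates = []
    spoken = 0  # syllables spoken before the current word
    for word, weight in zip(words, weights):
        estimates.append((word, span_start + duration * spoken / total))
        spoken += weight
    return estimates


# Illustrative span from 2.0 s to 4.0 s covering five syllables in total.
estimates = estimate_times(["a", "banana", "split"], 2.0, 4.0)
```

With these inputs, "banana" is expected to start one fifth of the way through the span and "split" four fifths of the way through, since longer words are given proportionally more time.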

14. A system comprising:
- a memory; and
  
  a computing device configured to:
  
  apply speech recognition to an audio recording to generate a text version of recognized portions of text;
  
  determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
  
  compare a recognized portion of text to an expected portion of text;
  
  determine a number of syllables or phonemes in a sequence of expected words that are part of the expected portion of text;
  
  determine a corresponding recognized portion comprising a sequence of recognized words, the sequence of expected words and sequence of recognized words having a same number of syllables or phonemes and a different number of words;
  
  determine an elapsed time for the corresponding recognized portion;
  
  store the determined elapsed time in a timing file that is stored on a computer-readable storage device, the timing file further comprising the elapsed time information for each expected portion of text;
  
  receive an indication of a user-selected portion of text;
  
  determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
  
  provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
- Dependent claims (15, 16, 17, 18):
  - 15. The system of claim 14 wherein the computing device is configured to:
    - store the elapsed time information for a recognized portion of text in the timing file if the recognized portion of text matches the corresponding expected portion of text; and
      
      compute elapsed time information for an expected portion of text and store the determined elapsed time information for recognized words having the same number of syllables or phonemes and the different number of words into the timing file when the recognized portion of text does not match the corresponding expected portion of text.
  - 16. The system of claim 15 wherein the computing device configured to determine the elapsed time for the corresponding recognized portion is further configured to:
    - determine the elapsed time for an expected portion of text based on a metric associated with an expected length of time to speak the expected portion of text.
  - 17. The system of claim 14 wherein the computing device is further configured to provide audio corresponding to the user-selected portion of text.
  - 18. The system of claim 14 wherein the computing device is further configured to cease providing the audio output upon reaching a last word in the user-selected portion of text.
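
The playback steps (look up the user-selected portion in the timing file, start audio there, and optionally stop after the last selected word) can be sketched as a simple lookup. This is a hedged illustration: the timing entries use a hypothetical `elapsed` field, `playback_range` is an invented helper, and actual audio output is left to whatever player the system uses.

```python
def playback_range(timing, first_index, last_index=None):
    """Return (start, stop) times in seconds for a selected range of words.

    timing: list of entries with an "elapsed" field, one per word.
    If last_index is None, or the selection reaches the final word,
    stop is None, meaning "play to the end of the recording".
    """
    start = timing[first_index]["elapsed"]
    if last_index is None or last_index + 1 >= len(timing):
        return start, None
    # Stop where the word after the selection begins.
    return start, timing[last_index + 1]["elapsed"]


# Illustrative timing entries for a four-word document.
timing_entries = [
    {"text": "the", "elapsed": 0.0},
    {"text": "quick", "elapsed": 0.4},
    {"text": "brown", "elapsed": 0.9},
    {"text": "fox", "elapsed": 1.3},
]

selected = playback_range(timing_entries, 1, 2)   # "quick brown" only
to_the_end = playback_range(timing_entries, 2)    # from "brown" onward
```

Using the start time of the next word as the stop time is one possible design choice; a timing file that also stored word durations would allow a tighter cutoff.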

Current Assignee: T Play Holdings LLC (SoftBank Group Corp.)
Original Assignee: K-NFB Reading Technology Inc.
Inventors: Kurzweil, Raymond C.; Albrecht, Paul; Chapman, Peter; Gibson, Lucy
Primary Examiner(s): Pullias, Jessee
Application Number: US12/781,977
Publication Number: US 20110288861A1
Time in Patent Office: 1,022 Days
Field of Search: 704/231, 704/235, 704/251, 704/270, 704/271, 704/276
US Class Current: 704/235
CPC Class Codes:
G09B 5/06   with both visual and audibl...
G09B 5/062   Combinations of audio and p...
G10L 15/26   Speech to text systems G10L...
