Audio Synchronization For Document Narration with User-Selected Playback
Abstract
Disclosed are techniques and systems to provide a narration of a text. In some aspects, the techniques and systems described herein include generating a timing file that includes elapsed time information for expected portions of text, the elapsed time information providing an elapsed time period from a reference time in an audio recording to each portion of text in recognized portions of text.
15 Claims
1. A computer implemented method comprising:
applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
determining by the one or more computer systems an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
comparing by the one or more computer systems a recognized portion of text to an expected portion of text;
generating by the one or more computer systems a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
receiving from a user an indication of a user-selected portion of text;
determining by the one or more computers an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
providing an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
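The steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify an alignment algorithm, a timing file format, or a rule for expected words that were never recognized, so everything concrete here is an assumption: word-level portions, alignment with Python's `difflib.SequenceMatcher`, a JSON timing file, and a fallback that gives an unmatched expected word the previous known time. The function names are hypothetical.

```python
import json
from difflib import SequenceMatcher


def build_timing(recognized, expected):
    """recognized: list of (word, elapsed_seconds) from a speech recognizer.
    expected: list of words from the document being narrated.
    Returns one timing entry per expected word."""
    rec_words = [word.lower() for word, _ in recognized]
    exp_words = [word.lower() for word in expected]
    timing = [{"text": word, "elapsed": None} for word in expected]

    # Compare recognized text to expected text and copy times for matches.
    matcher = SequenceMatcher(a=rec_words, b=exp_words, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            timing[block.b + k]["elapsed"] = recognized[block.a + k][1]

    # Assumption: an unmatched expected word inherits the previous known time.
    last = 0.0
    for entry in timing:
        if entry["elapsed"] is None:
            entry["elapsed"] = last
        last = entry["elapsed"]
    return timing


def save_timing_file(path, timing):
    """Store the timing entries as JSON (the file format is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(timing, handle)


def elapsed_for_selection(path, selected_index):
    """Look up the elapsed time for the user-selected expected word."""
    with open(path, encoding="utf-8") as handle:
        timing = json.load(handle)
    return timing[selected_index]["elapsed"]
```

The value returned by `elapsed_for_selection` would then be handed to whatever audio player is in use as a seek position; playback itself is outside this sketch.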
9. A computer implemented method comprising:
applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
providing an audible output corresponding to the audio recording;
displaying, on a user interface rendered on a display device, an expected portion of text that corresponds to the words in the audio recording, the displayed expected portion of text including at least a portion of the expected portion of text that is currently being provided on the audible output; and
providing visual indicia for the displayed text that corresponds to:
the audio that is currently being provided on the audible output, if the recognized portion of text matches the corresponding expected portion of text; and
otherwise, one or more portions of text which does not match the recognized portion of text, if the recognized portion of text does not match the corresponding expected portion of text.
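The match and mismatch branches of claim 9 can be sketched as a function that assigns a display style to each expected word. This is an illustration only: it assumes the recognized and expected words are already aligned one to one by position, and the style names `current`, `mismatch`, and `plain` are hypothetical labels that a user interface would map to actual highlighting.

```python
import string


def normalize(word):
    """Lower-case a word and drop surrounding punctuation before comparing."""
    return word.strip(string.punctuation).lower()


def indicia_for(expected, recognized, current_index):
    """Return (word, style) pairs for the displayed expected text.

    Assumption: expected[i] and recognized[i] refer to the same position
    in the audio, so a simple positional comparison is enough.
    """
    styled = []
    for i, word in enumerate(expected):
        if i != current_index:
            styled.append((word, "plain"))
        elif normalize(word) == normalize(recognized[i]):
            # Recognized text matches: mark the word currently being played.
            styled.append((word, "current"))
        else:
            # Recognized text differs: mark the non-matching expected text.
            styled.append((word, "mismatch"))
    return styled
```

For example, with `expected = ["The", "cat", "sat."]` and `recognized = ["the", "cap", "sat"]`, the word at index 1 would receive the `mismatch` style while it is being played, and the words at indices 0 and 2 would receive `current` when their turn comes.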
10. A computer implemented method comprising:
applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized portions of text;
comparing by the one or more computer systems the recognized portion of text to an expected portion of text;
providing an audible output corresponding to the audio recording;
determining by the one or more computer systems a recognized portion of text corresponding to a currently audible portion of the audio recording;
displaying an expected portion of text on a user interface rendered on a display device such that the displayed expected portion of text includes at least an expected portion of text previous to the determined currently audible portion of the audio recording;
providing visual indicia for the displayed expected portion of text that corresponds to the expected text portion that is previous to the currently audible portion of the audio recording, if the recognized portion of text is in addition to and not included in the expected portion of text.
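Claim 10 covers the case where the narrator says something that is not in the expected text. One way to picture it, as a sketch under stated assumptions, is to align the two word sequences and, when the currently audible recognized word is an insertion, point the highlight at the expected word just before it. The use of `difflib.SequenceMatcher` and the function name are assumptions, not something the claim prescribes.

```python
from difflib import SequenceMatcher


def expected_index_to_highlight(expected, recognized, rec_index):
    """Return the index of the expected word to highlight while the
    recognized word at rec_index is audible, or None if there is none."""
    exp = [word.lower() for word in expected]
    rec = [word.lower() for word in recognized]
    matcher = SequenceMatcher(a=exp, b=rec, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if not (j1 <= rec_index < j2):
            continue
        if tag == "equal":
            # The audible word is in the expected text: highlight it.
            return i1 + (rec_index - j1)
        if tag == "insert":
            # The audible word is extra: highlight the previous expected word.
            return i1 - 1 if i1 > 0 else None
        # "replace": a substitution rather than an addition; as a fallback,
        # stay within the expected span that was replaced.
        return min(i1 + (rec_index - j1), i2 - 1)
    return None
```

With `expected = ["once", "upon", "a", "time"]` and `recognized = ["once", "upon", "um", "a", "time"]`, the extra word "um" at recognized index 2 keeps the highlight on "upon" (expected index 1).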
11. A computer implemented method comprising:
applying speech recognition by one or more computer systems to an audio recording to generate a text version of recognized words;
determining the linguistic units of one or more recognized words;
computing a timing for each determined linguistic unit;
determining the linguistic units of one or more words in an expected portion of text;
associating linguistic units in the one or more words in the expected portion of text with linguistic units in the recognized words; and
computing a timing for one or more linguistic units in the one or more words in the expected portion of text based on the timing of one or more corresponding determined linguistic units of the one or more recognized words.
Dependent claims: 12, 13.
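The unit-level timing transfer in claim 11 can be sketched as follows. The claim does not say what a linguistic unit is or how timings are computed, so this example assumes phoneme-like units taken from a toy, hypothetical pronunciation table (`PHONEMES`), spreads each recognized word's duration evenly over its units, and associates units using `difflib.SequenceMatcher`. Expected units with no recognized counterpart are left without a time.

```python
from difflib import SequenceMatcher

# Hypothetical toy pronunciation table; a real system would use a lexicon.
PHONEMES = {
    "the": ["dh", "ah"],
    "fox": ["f", "aa", "k", "s"],
    "fax": ["f", "ae", "k", "s"],
}


def unit_timings(recognized_words):
    """recognized_words: list of (word, start_seconds, end_seconds).
    Returns (unit, start_seconds) pairs, assuming units of a word are
    evenly spaced across the word's duration."""
    timed = []
    for word, start, end in recognized_words:
        units = PHONEMES[word]
        step = (end - start) / len(units)
        timed.extend((unit, start + k * step) for k, unit in enumerate(units))
    return timed


def transfer_timings(recognized_words, expected_words):
    """Give each expected unit the time of its matching recognized unit."""
    rec = unit_timings(recognized_words)
    exp_units = [unit for word in expected_words for unit in PHONEMES[word]]
    result = [(unit, None) for unit in exp_units]
    matcher = SequenceMatcher(a=[u for u, _ in rec], b=exp_units, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            result[block.b + k] = (exp_units[block.b + k], rec[block.a + k][1])
    return result
```

In this sketch, a recording recognized as "the fax" against the expected text "the fox" still yields times for the shared units `dh`, `ah`, `f`, `k`, and `s`; only the differing vowel is left untimed.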
14. A computer program product residing on a computer readable medium, the computer program product comprising instructions for causing a processor to:
apply speech recognition to an audio recording to generate a text version of recognized portions of text;
determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
compare a recognized portion of text to an expected portion of text;
generate a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
receive an indication of a user-selected portion of text;
determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
15. A system comprising:
a memory; and
a computing device configured to:
apply speech recognition to an audio recording to generate a text version of recognized portions of text;
determine an elapsed time period from a reference time in the audio recording to each portion of text in the recognized portions of text;
compare a recognized portion of text to an expected portion of text;
generate a timing file that is stored on a computer-readable storage medium, the timing file comprising the elapsed time information for each expected portion of text;
receive an indication of a user-selected portion of text;
determine an elapsed time in the audio recording by referencing the timing file associated with the user-selected portion of text; and
provide an audible output corresponding to the audio in the audio recording at the determined elapsed time in the audio recording.
Specification