Methods and Apparatus to Generate a Speech Recognition Library
First Claim
1. A method comprising:
identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments;
computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments;
selecting a set of the plurality of audio data segments based on the plurality of difference metrics;
identifying a first one of the audio data segments in the set as a representative audio data segment;
determining a first phonetic transcription of the representative audio data segment; and
adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
Abstract
Methods and apparatus to generate a speech recognition library for use by a speech recognition system are disclosed. An example method comprises identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments, computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments, selecting a set of the plurality of audio data segments based on the plurality of difference metrics, identifying a first one of the audio data segments in the set as a representative audio data segment, determining a first phonetic transcription of the representative audio data segment, and adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
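The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The source text here does not specify the difference metric, the rule for selecting the set, or how the representative segment is chosen, so all three are assumptions: a mean absolute difference over frame-level feature values, a simple threshold, and the segment nearest the median metric of the selected set. The names `Audio`, `difference_metric`, `update_library`, `threshold`, and the caller-supplied `transcribe` function are hypothetical.

```python
import math
from statistics import median
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical representation: one feature value per audio frame.
Audio = Sequence[float]


def difference_metric(baseline: Audio, segment: Audio) -> float:
    """Mean absolute difference over the overlapping frames (an assumed metric)."""
    n = min(len(baseline), len(segment))
    if n == 0:
        return math.inf
    return sum(abs(b - s) for b, s in zip(baseline, segment)) / n


def update_library(
    phrase: str,
    baseline: Audio,
    segments: List[Audio],
    transcribe: Callable[[Audio], str],
    library: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> bool:
    """Add a new phonetic transcription for `phrase`; return True if one was added."""
    # Compute one difference metric per caption-matched audio segment.
    scored = [(difference_metric(baseline, s), s) for s in segments]

    # Assumed selection rule: keep segments that differ noticeably from the baseline.
    selected = [(m, s) for m, s in scored if math.isfinite(m) and m > threshold]
    if not selected:
        return False

    # Assumed representative: the segment whose metric is closest to the set's median.
    mid = median(m for m, _ in selected)
    _, representative = min(selected, key=lambda pair: abs(pair[0] - mid))

    # Transcribe the representative and add it only if the library lacks it.
    transcription = transcribe(representative)
    known = library.setdefault(phrase, set())
    if transcription in known:
        return False
    known.add(transcription)
    return True
```

Passing `transcribe` in as a parameter keeps the sketch independent of any particular phonetic transcriber, and storing the library as a mapping from phrase to a set of transcriptions makes the "differs from an existing transcription" check a simple membership test.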
20 Claims
1. A method comprising:
identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments;
computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments;
selecting a set of the plurality of audio data segments based on the plurality of difference metrics;
identifying a first one of the audio data segments in the set as a representative audio data segment;
determining a first phonetic transcription of the representative audio data segment; and
adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus comprising:
an audio segment selector to identify a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments;
an audio comparator to compute a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments;
an audio segment grouper to identify a set of the plurality of audio data segments based on the plurality of difference metrics;
a phonetic transcriber to determine a first phonetic transcription corresponding to the set of audio data segments; and
a database manager to add the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
Dependent claims: 9, 10
11. An article of manufacture storing machine readable instructions which, when executed, cause a machine to:
identify a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments;
compute a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments;
select a set of the plurality of audio data segments based on the plurality of difference metrics;
identify a first one of the audio data segments in the set as a representative audio data segment;
determine a first phonetic transcription of the representative audio data segment; and
add the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
Dependent claims: 12, 13, 14
15. A method comprising:
identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments;
determining a plurality of phonetic transcriptions for respective ones of the plurality of audio data segments;
identifying a set of the plurality of audio data segments having a first phonetic transcription different from a second phonetic transcription associated with the phrase in a speech recognition library; and
adding the first phonetic transcription to the speech recognition library.
Dependent claims: 16, 17, 18, 19, 20
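Claim 15 differs from claim 1 in that every audio segment is transcribed first and the segments are then grouped by transcription, with no baseline comparison. A rough Python sketch of that ordering follows. It is illustrative only: the function name `add_new_transcriptions`, the caller-supplied `transcribe` function, and the `min_support` parameter (the minimum number of segments that must share a transcription) are hypothetical and not taken from the claim. The claim recites adding a single first transcription, whereas this sketch adds every qualifying one.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Set


def add_new_transcriptions(
    phrase: str,
    segments: List[Hashable],
    transcribe: Callable[[Hashable], str],
    library: Dict[str, Set[str]],
    min_support: int = 2,
) -> List[str]:
    """Transcribe every segment, then add transcriptions the library lacks."""
    # Group the caption-matched segments by their phonetic transcription.
    groups: Dict[str, list] = defaultdict(list)
    for segment in segments:
        groups[transcribe(segment)].append(segment)

    known = library.setdefault(phrase, set())
    added: List[str] = []
    for transcription, members in groups.items():
        # Assumed rule: require several segments to agree before trusting
        # a transcription that is not yet in the library.
        if transcription not in known and len(members) >= min_support:
            known.add(transcription)
            added.append(transcription)
    return added
```

The `min_support` threshold is one possible way to avoid adding a transcription produced by a single noisy segment; setting it to 1 adds every transcription not already in the library.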
Specification