Generating a task-adapted acoustic model from one or more supervised and/or unsupervised corpora
Abstract
Unsupervised speech data is provided to a speech recognizer that recognizes the speech data and outputs a recognition result along with a confidence measure for each recognized word. A task-related acoustic model is generated based on the recognition result, the speech data and the confidence measure. An additional task-independent model can also be used. The speech data can be weighted by the confidence measure in generating the acoustic model, so that only data recognized with a high degree of confidence weighs heavily in the generation of the acoustic model. The acoustic model can be formed from a Gaussian mean and variance of the data.
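The confidence weighting described in the abstract can be illustrated with a minimal sketch. It assumes one scalar feature value per recognized word and a confidence in [0, 1] for each; the function name `weighted_gaussian` and the one-dimensional simplification are illustrative assumptions, not taken from the patent.

```python
from typing import Sequence, Tuple


def weighted_gaussian(
    values: Sequence[float], confidences: Sequence[float]
) -> Tuple[float, float, float]:
    """Estimate a 1-D Gaussian from data weighted by recognition confidence.

    Hypothetical helper: each value contributes in proportion to its
    confidence, so low-confidence data has little influence on the model.
    Returns (mean, variance, total_weight).
    """
    if len(values) != len(confidences):
        raise ValueError("values and confidences must have the same length")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("total confidence must be positive")
    # Confidence-weighted mean.
    mean = sum(c * x for x, c in zip(values, confidences)) / total
    # Confidence-weighted variance around that mean.
    variance = sum(c * (x - mean) ** 2 for x, c in zip(values, confidences)) / total
    return mean, variance, total


if __name__ == "__main__":
    # The second value was recognized with low confidence and barely moves the mean.
    print(weighted_gaussian([0.0, 10.0], [0.9, 0.1]))
```

The returned total weight acts as an effective data count, which is useful if the resulting statistics are later combined with those of another model.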
22 Claims
1. A method of generating an acoustic model (AM) for use in a speech recognition system, comprising:
receiving unsupervised speech data as utterances formed of sub-utterance units, the utterances being represented by acoustic data and including words;
generating a transcription for each utterance in the unsupervised speech data and a confidence measure for each sub-utterance unit, with a speech recognizer;
generating a confidence measure weighted AM based on the acoustic data and the transcriptions weighted by the confidence measures on a sub-utterance level wherein generates the confidence measure for each word in the utterance; and
combining the confidence measure weighted AM with a supervised AM generated from supervised speech data to obtain a composite AM.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of generating an acoustic model (AM) for use in a speech recognition system, comprising:
receiving unsupervised speech data as utterances formed of sub-utterance units, the utterances being represented by acoustic data;
generating a transcription for each utterance in the unsupervised speech data and a confidence measure for each sub-utterance unit, with a speech recognizer;
generating a confidence measure weighted AM based on the acoustic data and the transcriptions weighted by the confidence measure on a sub-utterance level; and
wherein the unsupervised speech data comprises task-independent speech data including words, and further comprising:
generating a relevance measure for each word in the task-independent data, the relevance measure being indicative of a relevance of the word to a desired task to be performed by the speech recognition system.
Dependent claims: 10
11. A method of generating an acoustic model (AM) for a speech recognition system, comprising:
receiving a task-dependent (TD) AM generated from task-dependent speech data, relevant to a desired task to be performed by the speech recognition system;
receiving a task-independent (TI) AM generated from task-independent speech data, the TI AM and the TD AM each including Gaussian means and variances; and
combining the Gaussian means and variances based on an amount of data used to generate each mean and each variance to obtain a composite AM.
Dependent claims: 12, 13, 14
15. An acoustic model (AM) generation system, comprising:
a speech recognizer receiving unsupervised speech data in the form of utterances with sub-utterance units and words and generating a transcription of the utterances and a confidence measure associated with each sub-utterance unit;
an AM generator receiving the transcription and confidence measures and generating a confidence measure AM by weighting each word in the utterances with a confidence measure; and
a task relevance AM generator receiving supervised task-relevance (TI) speech data including words and generating a task relevance AM based on a relevance of each word in the TI speech data to a desired task for the task relevance AM.
Dependent claims: 16, 17, 18, 19, 20
21. An acoustic model (AM) generation system, comprising:
a speech recognizer receiving unsupervised speech data in the form of utterances with sub-utterance units and words and generating a transcription of the utterances and a confidence measure associated with each sub-utterance unit;
an AM generator receiving the transcription and confidence measures and generating a confidence measure AM by weighting each word in the utterances with a confidence measure;
wherein the unsupervised speech data comprises task-independent (TI) data including words; and
a relevance generator generating a relevance measure for each word in the TI data, the relevance measure being indicative of a relevance of the word in the TI data to a desired task for the AM.
Dependent claims: 22
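Claim 11 recites combining the Gaussian means and variances of a TD AM and a TI AM based on the amount of data used to generate each. The claim does not give a formula, so the following is only one plausible reading: a count-weighted pooling of two one-dimensional Gaussians. The `Gaussian` type, the `combine` function and the pooling rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Hypothetical container for one Gaussian and its supporting data count."""
    mean: float
    variance: float
    count: float  # amount of data (possibly fractional) behind the estimate


def combine(td: Gaussian, ti: Gaussian) -> Gaussian:
    """Pool two Gaussians, weighting each by its data count.

    Assumed rule: the composite matches the first and second moments of
    the two data sets taken together, so the model built from more data
    dominates the result.
    """
    total = td.count + ti.count
    if total <= 0:
        raise ValueError("at least one model must be backed by data")
    mean = (td.count * td.mean + ti.count * ti.mean) / total
    # Pool second moments E[x^2] = variance + mean^2, then re-center.
    second_moment = (
        td.count * (td.variance + td.mean ** 2)
        + ti.count * (ti.variance + ti.mean ** 2)
    ) / total
    return Gaussian(mean, second_moment - mean ** 2, total)


if __name__ == "__main__":
    task_dependent = Gaussian(mean=0.0, variance=1.0, count=1.0)
    task_independent = Gaussian(mean=2.0, variance=1.0, count=1.0)
    print(combine(task_dependent, task_independent))
```

With equal counts the composite mean falls midway between the two models, and the composite variance grows to account for the separation between their means.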
Specification