Speech recognition and teaching apparatus able to rapidly adapt to difficult speech of children and foreign speakers
Abstract
The recognizer tests input utterances using a confidence measure to select words of high recognition confidence for use in the adaptation process. Adaptation is performed rapidly using a priori knowledge about the class of speakers who will be using the system. This a priori knowledge can be expressed using eigenvoice basis vectors that capture information about the entire targeted user population. The dialogue system may also use the confidence measure to output a pronunciation example to the user, based on the confidence that the system has in the results of recognition, given the different possibilities that can be recognized. The dialogue system may also provide voiced prompts that teach the user how to correctly pronounce words.
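The two steps named in the abstract, confidence-gated selection followed by adaptation constrained to an eigenvoice space, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `RecognizedUnit` type, the `select_confident`, `project_onto_eigenspace`, and `adapt` helpers, the use of a single averaged feature vector as the "model", and the plain orthogonal projection (which stands in for whatever estimation procedure the specification actually describes). The basis is assumed to be orthonormal.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RecognizedUnit:
    label: str             # recognized speech unit, e.g. a word (hypothetical type)
    confidence: float      # recognizer confidence, assumed to lie in [0, 1]
    features: List[float]  # hypothetical feature vector observed for this unit


def select_confident(units: List[RecognizedUnit], threshold: float) -> List[RecognizedUnit]:
    """Keep only the units whose confidence exceeds the threshold."""
    return [u for u in units if u.confidence > threshold]


def project_onto_eigenspace(vector: List[float], mean: List[float],
                            basis: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Project `vector` onto mean + span(basis); `basis` is assumed orthonormal."""
    centered = [v - m for v, m in zip(vector, mean)]
    weights = [sum(c * b for c, b in zip(centered, e)) for e in basis]
    projected = list(mean)
    for w, e in zip(weights, basis):
        for i, b in enumerate(e):
            projected[i] += w * b
    return projected, weights


def adapt(units: List[RecognizedUnit], threshold: float,
          mean: List[float], basis: List[List[float]]) -> List[float]:
    """Estimate a model from confident units, then constrain it to the eigenspace."""
    selected = select_confident(units, threshold)
    if not selected:
        return list(mean)  # nothing trustworthy to adapt on: keep the initial model
    # Crude unconstrained estimate: average the features of the selected units.
    estimate = [sum(col) / len(selected) for col in zip(*(u.features for u in selected))]
    projected, _ = project_onto_eigenspace(estimate, mean, basis)
    return projected


units = [
    RecognizedUnit("cat", 0.9, [2.0, 1.0, 5.0]),
    RecognizedUnit("dog", 0.3, [9.0, 9.0, 9.0]),  # low confidence, ignored
    RecognizedUnit("sun", 0.8, [4.0, 3.0, 1.0]),
]
adapted = adapt(units, threshold=0.5,
                mean=[1.0, 1.0, 0.0],
                basis=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

In this toy setup the low-confidence unit never influences the estimate, and the third coordinate of the adapted vector is pulled back to the mean because no basis vector spans it. That is the sense in which the adapted model is kept within the eigenspace.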
6 Claims
1. A speech recognition apparatus that adapts an initial speech model based on input speech from the user, comprising:
a speech model that represents speech as a plurality of speech unit models associated with a plurality of speech units;
a speech recognizer that processes input speech from a user using said speech model to recognize uttered speech units within said input speech;
a confidence measurement system associated with said speech recognizer for associating a confidence measure with each of said uttered speech units;
an adaptation system having data store containing information reflecting a priori knowledge about a speaker space, said adaptation system being operative to select uttered speech units that exceed a predetermined confidence measure and to use said selected uttered speech units and said information reflecting a priori knowledge to adapt said speech model; and
wherein said adaptation system includes a data store containing a set of eigenspace basis vectors representing a plurality of training speakers and wherein said adaptation system uses said selected uttered speech units to train an adapted speech model while using said basis vectors to constrain said adapted speech model such that said adapted speech model lies within said eigenspace.
(Dependent claims: 2, 3)
4. A speech recognition apparatus that adapts an initial speech model based on input speech from the user, comprising:
a speech model that represents speech as a plurality of speech unit models associated with a plurality of speech units;
a speech recognizer that processes input speech from a user using said speech model to recognize uttered speech units within said input speech;
a confidence measurement system associated with said speech recognizer for associating a confidence measure with each of said uttered speech units;
an adaptation system having data store containing information reflecting a priori knowledge about a speaker space, said adaptation system being operative to select uttered speech units that exceed a predetermined confidence measure and to use said selected uttered speech units and said information reflecting a priori knowledge to adapt said speech model;
wherein said adaptation system includes a data store containing an eigenspace data structure that represents a plurality of training speakers as a set of models for said training speakers that has been dimensionally reduced to generate a set of basis vectors that define said eigenspace; and
wherein said adaptation system uses said selected uttered speech units to train an adapted speech model while using said basis vectors to constrain said adapted speech model such that said adapted speech model lies within said eigenspace.
(Dependent claims: 5, 6)
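Claim 4 recites training-speaker models that are dimensionally reduced to generate basis vectors. One common way to do such a reduction is principal component analysis; the sketch below assumes each training speaker is summarized as a fixed-length "supervector" and extracts orthonormal directions by power iteration with deflation. The function name `principal_directions`, the supervector representation, and the choice of PCA are all illustrative assumptions, not the method stated in the claims.

```python
import math
import random
from typing import List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def principal_directions(supervectors: List[List[float]], k: int,
                         iterations: int = 200,
                         eps: float = 1e-12) -> Tuple[List[float], List[List[float]]]:
    """Return (mean, basis) with up to k orthonormal directions of largest variance."""
    mean = [sum(col) / len(supervectors) for col in zip(*supervectors)]
    centered = [[x - m for x, m in zip(v, mean)] for v in supervectors]
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    basis: List[List[float]] = []
    for _ in range(k):
        vec = [rng.uniform(-1.0, 1.0) for _ in mean]
        converged = True
        for _ in range(iterations):
            # Multiply by the scatter matrix X^T X without forming it explicitly.
            nxt = [0.0] * len(mean)
            for row in centered:
                scale = dot(row, vec)
                for i, x in enumerate(row):
                    nxt[i] += scale * x
            # Deflation: remove components along directions already found.
            for e in basis:
                proj = dot(nxt, e)
                nxt = [n - proj * b for n, b in zip(nxt, e)]
            norm = math.sqrt(dot(nxt, nxt))
            if norm < eps:
                converged = False  # no variance left outside the current basis
                break
            vec = [n / norm for n in nxt]
        if not converged:
            break
        basis.append(vec)
    return mean, basis


# Five hypothetical training-speaker supervectors in two dimensions.
speakers = [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [2.0, 1.0], [2.0, -1.0]]
mean, basis = principal_directions(speakers, k=2)
```

For this toy data most of the variance lies along the first coordinate, so the first basis vector aligns with it (up to sign) and the second with the remaining coordinate. With real speaker models, `k` would be chosen much smaller than the supervector dimension, so that a new speaker is described by only a handful of weights.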
Specification