Semi-discrete utterance recognizer for carefully articulated speech
First Claim
1. A method for performing speech recognition of a user's speech, comprising:
performing a first speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of non-discrete utterances;
performing a second speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of discrete utterances;
obtaining a first match score for each utterance of the user's speech from the first speech recognition process and obtaining a second match score for each utterance of the user's speech from the second speech recognition process, determining a highest match score from the first and second match scores; and
providing a speech recognition output for the user's speech, based on highest match scores of each utterance as obtained from the first and second speech recognition processes.
Abstract
A method for performing speech recognition of a user's speech includes performing a first speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of non-discrete utterances. The method also includes performing a second speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of discrete utterances. The method further includes obtaining a first match score for each utterance of the user's speech from the first speech recognition process and obtaining a second match score for each utterance of the user's speech from the second speech recognition process. The method also includes determining a highest match score from the first and second match scores. The method further includes providing a speech recognition output for the user's speech, based on highest match scores of each utterance as obtained from the first and second speech recognition processes.
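The per-utterance selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Recognizer`, `Hypothesis`, `recognize_utterance`, and `recognize_speech` are hypothetical, utterances are stood in for by plain strings, and the sketch assumes the two recognizers produce match scores on a comparable scale, which the text above does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical recognizer interface (an assumption, not from the source):
# takes one utterance and returns (recognized text, match score).
Recognizer = Callable[[str], Tuple[str, float]]


@dataclass
class Hypothesis:
    text: str
    score: float
    source: str  # "continuous" (non-discrete models) or "discrete"


def recognize_utterance(
    utterance: str, continuous: Recognizer, discrete: Recognizer
) -> Hypothesis:
    """Run both recognition processes on one utterance and keep the
    hypothesis with the higher match score."""
    c_text, c_score = continuous(utterance)
    d_text, d_score = discrete(utterance)
    first = Hypothesis(c_text, c_score, "continuous")
    second = Hypothesis(d_text, d_score, "discrete")
    # Assumption: scores are directly comparable; ties go to the first process.
    return second if second.score > first.score else first


def recognize_speech(
    utterances: List[str], continuous: Recognizer, discrete: Recognizer
) -> List[Hypothesis]:
    """Build the overall output from the best hypothesis of each utterance."""
    return [recognize_utterance(u, continuous, discrete) for u in utterances]
```

Because the choice is made separately for each utterance, the output can mix results from both sets of acoustic models within a single stretch of speech.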
68 Citations
17 Claims
1. A method for performing speech recognition of a user's speech, comprising:
performing a first speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of non-discrete utterances;
performing a second speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of discrete utterances;
obtaining a first match score for each utterance of the user's speech from the first speech recognition process and obtaining a second match score for each utterance of the user's speech from the second speech recognition process, determining a highest match score from the first and second match scores; and
providing a speech recognition output for the user's speech, based on highest match scores of each utterance as obtained from the first and second speech recognition processes.
4. A method for performing speech recognition of a user's speech, comprising:
performing a first speech recognition process on the user's speech in a first mode of operation, using acoustic models that are based on training data of non-discrete utterances;
performing a second speech recognition process on the user's speech in a second mode of operation, using acoustic models that are based on training data of discrete utterances, and providing a speech recognition output for the user's speech, based on respective outputs from the first and second speech recognition processes, wherein only one of the first and second speech recognition processes is capable of being operative at any particular moment in time.
7. A system for performing speech recognition of a user's speech, comprising:
a control unit for receiving the user's speech and for determining whether or not an error correction mode is to be initiated based on utterances made in the user's speech, and to output a control signal indicative of whether or not the error correction mode is in operation;
a first speech recognition unit configured to receive the user's speech and to perform a first speech recognition processing on the user's speech when the control signal provided by the control unit indicates that the error correction mode is not in operation; and
a second speech recognition unit configured to receive the user's speech and to perform a second speech recognition processing on the user's speech when the control signal provided by the control unit indicates that the error correction mode is in operation;
wherein the second speech recognition unit utilizes training data of speech that is spoken in a slower word rate than training data of speech used by the first speech recognition unit.
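The mode-switched arrangement of claim 7, in which a control signal selects which recognition unit handles the speech, can be sketched as follows. This is an illustrative sketch only. The claim does not say how the control unit detects that error correction should start, so the trigger phrases `"correction"` and `"resume"`, the class names `ControlUnit` and `DualModeSystem`, and the use of strings in place of audio are all assumptions.

```python
from typing import Callable, Optional

# Hypothetical recognizer interface: one utterance in, recognized text out.
Recognizer = Callable[[str], str]


class ControlUnit:
    """Tracks whether error correction mode is in operation.

    Assumption: the mode is toggled by spoken trigger phrases; the claim
    only says the decision is based on utterances in the user's speech.
    """

    def __init__(self, enter_phrase: str = "correction", exit_phrase: str = "resume"):
        self.enter_phrase = enter_phrase
        self.exit_phrase = exit_phrase
        self.error_correction = False  # the control signal

    def update(self, utterance: str) -> bool:
        """Update the control signal; return True if the utterance was a command."""
        if utterance == self.enter_phrase:
            self.error_correction = True
            return True
        if utterance == self.exit_phrase:
            self.error_correction = False
            return True
        return False


class DualModeSystem:
    """Routes each utterance to exactly one recognition unit."""

    def __init__(self, normal: Recognizer, careful: Recognizer, control: ControlUnit):
        self.normal = normal    # trained on speech at the normal word rate
        self.careful = careful  # trained on speech at a slower word rate
        self.control = control

    def process(self, utterance: str) -> Optional[str]:
        if self.control.update(utterance):
            return None  # mode command consumed, nothing to recognize
        unit = self.careful if self.control.error_correction else self.normal
        return unit(utterance)
```

Only one unit runs per utterance, in contrast to the score-comparison arrangement of claims 1 and 9, where both recognizers process every utterance.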
9. A system for performing speech recognition of a user's speech, comprising:
a first speech recognition unit configured to receive the user's speech and to perform a first speech recognition processing on the user's speech based in part on training data of speech spoken at a first speech rate or higher, the first speech recognition unit outputting a first match score for each utterance of the user's speech;
a second speech recognition unit configured to receive the user's speech and to perform a first speech recognition processing on the user's speech based in part on training data of speech spoken at a speech rate lower than the first speech rate, the second speech recognition unit outputting a second match score for each utterance of the user's speech; and
a comparison unit configured to receive the first and second match scores and to determine, for each utterance of the user's speech, which of the first and second match scores is highest, wherein a speech recognition output corresponds to a highest match score for each utterance of the user's speech, as output from the comparison unit.
11. A program product having machine readable code for performing speech recognition of a user's speech, the program code, when executed, causing a machine to perform the following steps:
performing a first speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of non-discrete utterances;
performing a second speech recognition process on each utterance of the user's speech, using acoustic models that are based on training data of discrete utterances;
obtaining a first match score for each utterance of the user's speech from the first speech recognition process and obtaining a second match score for each utterance of the user's speech from the second speech recognition process, determining a highest match score from the first and second match scores; and
providing a speech recognition output for the user's speech, based on highest match scores of each utterance as obtained from the first and second speech recognition processes.
14. A program product for performing speech recognition of a user's speech, comprising:
performing a first speech recognition process on the user's speech in a first mode of operation, using acoustic models that are based on training data of non-discrete utterances;
performing a second speech recognition process on the user's speech in a second mode of operation, using acoustic models that are based on training data of discrete utterances, and providing a speech recognition output for the user's speech, based on respective outputs from the first and second speech recognition processes, wherein only one of the first and second speech recognition processes is capable of being operative at any particular moment in time.
Specification