SEQUENTIAL SPEECH RECOGNITION WITH TWO UNEQUAL ASR SYSTEMS
First Claim
1. A method for providing efficient speech recognition, the method comprising:
providing a first plurality of vocabulary data;
providing a second plurality of vocabulary data;
adding at least one decoy entry to the first plurality of vocabulary data wherein the at least one decoy entry comprises at least one entry from the second plurality of vocabulary data;
receiving an input comprising an audio signal;
determining whether the input matches at least one entry in the first plurality of vocabulary data;
in response to determining that the input matches the at least one entry in the first vocabulary data, determining whether the matched at least one entry comprises the at least one decoy entry in the first vocabulary data; and
in response to determining that the matched at least one entry comprises the at least one decoy entry in the first vocabulary data, determining whether the input matches at least one entry in the second plurality of vocabulary data.
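The control flow recited in claim 1 can be sketched as follows. This is only an illustrative sketch: the names `recognize_sequentially` and `make_lookup_recognizer` are hypothetical, the "audio" is stood in for by a plain string, and each recognizer is a toy exact-match lookup rather than a real ASR engine. Claim 1 does not say what happens when the first vocabulary yields no match, so the sketch simply returns `None` in that case.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

# Hypothetical recognizer type: takes an input, returns the matched entry or None.
Recognizer = Callable[[str], Optional[str]]


def make_lookup_recognizer(vocabulary: Iterable[str]) -> Recognizer:
    """Toy stand-in for an ASR system: exact lookup in a vocabulary."""
    entries = set(vocabulary)

    def recognize(audio: str) -> Optional[str]:
        return audio if audio in entries else None

    return recognize


def recognize_sequentially(
    audio: str,
    first_pass: Recognizer,
    second_pass: Recognizer,
    decoys: Set[str],
) -> Optional[Tuple[str, str]]:
    """Return (system, entry), consulting the second system only on a decoy hit."""
    match = first_pass(audio)
    if match is None:
        return None  # not covered by claim 1; an assumption of this sketch
    if match in decoys:
        # The first system matched a decoy, so defer to the second vocabulary.
        second_match = second_pass(audio)
        return ("second", second_match) if second_match is not None else None
    return ("first", match)


# Example setup: "call mom" belongs to the second vocabulary and is
# copied into the first vocabulary as a decoy.
second_vocab = {"call mom", "call tom"}
decoys = {"call mom"}
first = make_lookup_recognizer({"call home", "play music"} | decoys)
second = make_lookup_recognizer(second_vocab)
```

With this setup, `recognize_sequentially("call home", first, second, decoys)` is answered by the first system alone, while `"call mom"` hits the decoy and is re-checked against the second vocabulary.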
Abstract
Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set of vocabulary data. If confusion is likely, a decoy entry from one set of the vocabulary data may be placed in the other set of vocabulary data so that speech recognition processing may be more efficient and accurate.
20 Claims
16. A system for providing efficient speech recognition, the system comprising:
a memory storage; and
a processing unit coupled to the memory storage, wherein the processing unit is operative to:
access a first plurality of vocabulary data;
access a second plurality of vocabulary data;
add at least one decoy entry to the first plurality of vocabulary data wherein the at least one decoy entry comprises at least one entry from the second plurality of vocabulary data, wherein being operative to add the at least one decoy entry to the first plurality of vocabulary data comprises being operative to:
compare each entry in the first plurality of vocabulary data to each entry in the second plurality of vocabulary data,
determine whether at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data,
in response to determining that at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data, add the at least one entry in the second plurality of vocabulary data that is confusable with the at least one entry in the first plurality of vocabulary data to the first plurality of vocabulary data, and
associate the added entry to the first plurality of vocabulary data with the at least one entry in the first plurality of vocabulary data as a decoy entry;
receive an input comprising a speech signal;
determine whether the input matches at least one entry in the first plurality of vocabulary data;
in response to determining that the input matches the at least one entry in the first vocabulary data, determine whether the matched at least one entry comprises the at least one decoy entry in the first vocabulary data; and
in response to determining that the matched at least one entry comprises the at least one decoy entry in the first vocabulary data, determine whether the input matches at least one entry in the second plurality of vocabulary data.
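The decoy-construction steps of claim 16 (compare every pair, test for confusability, add the confusable second-vocabulary entry, associate it with the first-vocabulary entry) might be sketched as below. The function names are hypothetical, and the confusability test is an assumption of this sketch: it uses text similarity from Python's `difflib` as a crude stand-in, whereas the claim does not specify how confusability is determined and a real system would presumably compare pronunciations or acoustic models.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Sequence, Tuple


def is_confusable(first_entry: str, second_entry: str, min_ratio: float = 0.75) -> bool:
    """Assumed stand-in test: spelling similarity, not acoustic similarity."""
    return SequenceMatcher(None, first_entry, second_entry).ratio() >= min_ratio


def add_decoys(
    first_vocab: Sequence[str],
    second_vocab: Sequence[str],
    confusable: Callable[[str, str], bool] = is_confusable,
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return the augmented first vocabulary and a decoy -> first-entries map."""
    decoy_for: Dict[str, List[str]] = {}
    # Compare each entry in the first vocabulary to each entry in the second.
    for second_entry in second_vocab:
        for first_entry in first_vocab:
            if confusable(first_entry, second_entry):
                # Associate the decoy with the first-vocabulary entry it resembles.
                decoy_for.setdefault(second_entry, []).append(first_entry)
    # Add each confusable second-vocabulary entry to the first vocabulary once.
    augmented = list(first_vocab) + [e for e in decoy_for if e not in first_vocab]
    return augmented, decoy_for


augmented, decoy_for = add_decoys(
    ["call tom", "play music"],
    ["call mom", "stop navigation"],
)
```

The exhaustive pairwise loop mirrors the claim language; for large second vocabularies a practical implementation would likely need some form of indexing or pruning, which the claim does not address.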
20. A computer-readable medium which stores a set of instructions which, when executed, performs a method for providing efficient speech recognition, the method executed by the set of instructions comprising:
providing a first plurality of vocabulary data;
providing a second plurality of vocabulary data, wherein the second plurality of vocabulary data comprises a larger number of entries than the first plurality of vocabulary data;
adding at least one decoy entry to the first plurality of vocabulary data wherein the at least one decoy entry comprises at least one entry from the second plurality of vocabulary data, wherein adding at least one decoy entry to the first plurality of vocabulary data comprises:
comparing each entry in the first plurality of vocabulary data to each entry in the second plurality of vocabulary data,
determining whether at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data, wherein determining whether at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data comprises:
calculating a confusion score based on a comparison of each entry in the first plurality of vocabulary data with each entry in the second plurality of vocabulary data, and
determining whether the calculated confusion score comprises a value indicating that at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data;
in response to determining that at least one entry in the first plurality of vocabulary data is confusable with at least one entry in the second plurality of vocabulary data, adding the at least one entry in the second plurality of vocabulary data that is confusable with the at least one entry in the first plurality of vocabulary data to the first plurality of vocabulary data according to at least one of the following:
adding all of the at least one entries in the second plurality of vocabulary data to the first plurality of vocabulary data comprising a confusion score greater than a confusion threshold and
adding a predefined number of the at least one entries in the second plurality of vocabulary data to the first plurality of vocabulary data comprising the highest confusion scores, and
associating the added entry to the first plurality of vocabulary data with the at least one entry in the first plurality of vocabulary data as a decoy entry;
receiving an input comprising an audible signal;
determining whether the input matches at least one entry in the first plurality of vocabulary data, wherein determining whether the input matches at least one entry in the first plurality of vocabulary data comprises:
assigning a recognition score based on a comparison of the input with each entry in the first plurality of vocabulary data,
converting the recognition score associated with each entry in the first plurality of vocabulary data to a posterior probability,
computing a confidence score as a difference between the posterior probability of the input with the highest recognition score and the input with the next highest recognition score, and
determining whether the confidence score exceeds a confidence threshold associated with the entry comprising the highest recognition score in the first plurality of vocabulary data;
in response to determining that the input matches the at least one entry in the first vocabulary data, determining whether the matched entry comprises the at least one decoy entry in the first vocabulary data;
in response to determining that the matched entry comprises the at least one decoy entry in the first vocabulary data, determining whether the input matches at least one entry in the second plurality of vocabulary data; and
in response to determining that the input does not match at least one entry in the first plurality of vocabulary data, determining whether the input matches at least one entry in the second plurality of vocabulary data.
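Two of the computations recited in claim 20 lend themselves to a short sketch: selecting decoys by confusion score (above a threshold, or the top N), and deriving a confidence score as the gap between the two highest posterior probabilities. All names below are hypothetical. The claim does not say how a recognition score is converted to a posterior probability; this sketch assumes the scores behave like log-likelihoods and normalizes them with a softmax, which is one common choice and not necessarily the one intended.

```python
import math
from typing import Dict, Optional, Set, Tuple


def select_decoys(
    confusion_scores: Dict[str, float],
    threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> Set[str]:
    """Pick second-vocabulary entries by threshold and/or highest scores."""
    selected: Set[str] = set()
    if threshold is not None:
        selected |= {e for e, s in confusion_scores.items() if s > threshold}
    if top_n is not None:
        ranked = sorted(confusion_scores, key=confusion_scores.get, reverse=True)
        selected |= set(ranked[:top_n])
    return selected


def confidence(recognition_scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the best entry and (best posterior - runner-up posterior)."""
    ranked = sorted(recognition_scores, key=recognition_scores.get, reverse=True)
    top_score = recognition_scores[ranked[0]]
    # Softmax normalization (an assumption); subtracting the max keeps exp() stable.
    weights = {e: math.exp(s - top_score) for e, s in recognition_scores.items()}
    total = sum(weights.values())
    posteriors = {e: w / total for e, w in weights.items()}
    runner_up = posteriors[ranked[1]] if len(ranked) > 1 else 0.0
    return ranked[0], posteriors[ranked[0]] - runner_up


def first_pass_match(
    recognition_scores: Dict[str, float],
    confidence_thresholds: Dict[str, float],
) -> Optional[str]:
    """Accept the best entry only if its confidence exceeds that entry's threshold."""
    best, score = confidence(recognition_scores)
    return best if score > confidence_thresholds[best] else None


# Example: log-probability scores whose posteriors are 0.7, 0.2 and 0.1.
scores = {"call tom": math.log(0.7), "call mom": math.log(0.2), "play music": math.log(0.1)}
best_entry, conf_score = confidence(scores)
```

Here the confidence for `"call tom"` is the posterior gap 0.7 - 0.2, so the entry is accepted under a per-entry threshold of, say, 0.4 and rejected under 0.6; a rejection corresponds to the final clause of the claim, where the input is passed to the second vocabulary.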
Specification