SYSTEM AND METHOD FOR OPEN SPEECH RECOGNITION
Abstract
Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.
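The steps summarized above (recognize with several domain-specific recognizers, score each segment, select candidates, combine, generate text) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Segment` type, the function names, the `medical` and `travel` domains, the threshold, and the example scores are all hypothetical, the recognizer outputs are assumed to be already aligned by segment position, and a simple weighted best-score pick stands in for the machine-learning combination step.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """One segment of a recognizer's output with its confidence score."""
    domain: str        # which domain-specific recognizer produced it (hypothetical label)
    index: int         # segment position; alignment across recognizers is assumed
    text: str
    confidence: float  # assumed to lie in [0, 1]


def select_candidates(outputs: Dict[str, List[Segment]],
                      threshold: float) -> Dict[int, List[Segment]]:
    """Keep, per segment position, the hypotheses whose confidence meets the threshold."""
    candidates: Dict[int, List[Segment]] = {}
    for segments in outputs.values():
        for seg in segments:
            if seg.confidence >= threshold:
                candidates.setdefault(seg.index, []).append(seg)
    return candidates


def combine(candidates: Dict[int, List[Segment]],
            weights: Dict[str, float]) -> List[Segment]:
    """Stand-in for the machine-learning combiner: per position, pick the
    candidate with the highest domain-weighted confidence."""
    chosen = []
    for index in sorted(candidates):
        best = max(candidates[index],
                   key=lambda s: weights.get(s.domain, 1.0) * s.confidence)
        chosen.append(best)
    return chosen


def generate_text(chosen: List[Segment]) -> str:
    """Join the chosen segments into the final transcript."""
    return " ".join(seg.text for seg in chosen)


# Hypothetical outputs of two domain-specific recognizers for one utterance.
outputs = {
    "medical": [
        Segment("medical", 0, "remind me to take", 0.80),
        Segment("medical", 1, "ibuprofen", 0.92),
        Segment("medical", 2, "before my fight to bosentan", 0.41),
    ],
    "travel": [
        Segment("travel", 0, "remind me to take", 0.85),
        Segment("travel", 1, "i be profane", 0.30),
        Segment("travel", 2, "before my flight to boston", 0.90),
    ],
}

candidates = select_candidates(outputs, threshold=0.5)
chosen = combine(candidates, weights={"medical": 1.0, "travel": 1.0})
text = generate_text(chosen)
print(text)  # remind me to take ibuprofen before my flight to boston
```

In this sketch no single recognizer transcribes the whole utterance well, but selecting per segment lets the medical recognizer supply the drug name and the travel recognizer supply the destination. A position at which no hypothesis meets the threshold is simply dropped here; a real system would need a fallback.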
20 Claims
1. A method of speech recognition, the method comprising:
recognizing received speech with a plurality of domain-specific speech recognizers comprising at least two domain-specific speech recognizers from different domains, to yield respective speech recognition outputs;
determining at least one speech recognition confidence score for the respective speech recognition outputs, wherein each of the at least one speech recognition confidence score corresponds to a different segment of the respective speech recognition outputs;
selecting speech recognition candidates from segments of the speech recognition outputs based on the at least one speech recognition confidence score for the respective speech recognition outputs;
combining, via a machine-learning algorithm, the speech recognition candidates to yield a combination of the speech recognition candidates; and
generating text based on the combination.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
14. A system for open domain speech recognition, the system comprising:
a processor;
a first module configured to control the processor to recognize received speech with a plurality of domain-specific speech recognizers comprising at least two domain-specific speech recognizers from different domains, to yield respective speech recognition outputs;
a second module configured to control the processor to determine at least one speech recognition confidence score for the respective speech recognition outputs, wherein each of the at least one speech recognition confidence score corresponds to a different segment of the respective speech recognition outputs;
a third module configured to control the processor to select speech recognition candidates from segments of the speech recognition outputs based on the at least one speech recognition confidence score for the respective speech recognition outputs;
a fourth module configured to control the processor to combine, via a machine-learning algorithm, the speech recognition candidates to yield a combination of the speech recognition candidates;
a fifth module configured to control the processor to generate text based on the combination;
a sixth module configured to control the processor to collect usage statistics based on the speech recognition candidates and train parameters associated with the plurality of domain-specific speech recognizers based on the usage statistics; and
a seventh module configured to control the processor to collect usage statistics based on the speech recognition candidates and train the machine-learning algorithm based on the usage statistics.
Dependent claims: 15, 16.
17. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to perform automatic speech recognition, the instructions comprising:
recognizing received speech with a plurality of domain-specific speech recognizers comprising at least two domain-specific speech recognizers from different domains, to yield respective speech recognition outputs;
determining at least one speech recognition confidence score for the respective speech recognition outputs, wherein each of the at least one speech recognition confidence score corresponds to a different segment of the respective speech recognition outputs;
selecting speech recognition candidates from segments of the speech recognition outputs based on the at least one speech recognition confidence score for the respective speech recognition outputs;
combining, via a machine-learning algorithm, the speech recognition candidates to yield a combination of the speech recognition candidates; and
generating text based on the combination.
Dependent claims: 18, 19, 20.
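Claim 14 additionally recites collecting usage statistics from the speech recognition candidates and training parameters from them. One way this might look is sketched below. The claims do not say which statistics are collected or how training proceeds, so everything here is an assumption for illustration: the statistic is taken to be how often each domain's candidate was selected, and the "training" is plain add-one (Laplace) smoothing of those counts into per-domain weights. The function names and domain labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def collect_usage(selected_domains: Iterable[str]) -> Counter:
    """Count how often each domain's recognizer supplied a selected candidate."""
    return Counter(selected_domains)


def update_weights(usage: Counter, domains: List[str],
                   smoothing: float = 1.0) -> Dict[str, float]:
    """Turn selection counts into normalized per-domain weights.

    Additive smoothing keeps a domain that has never been selected
    from ending up with a weight of exactly zero.
    """
    total = sum(usage[d] for d in domains) + smoothing * len(domains)
    return {d: (usage[d] + smoothing) / total for d in domains}


# Hypothetical log: the domain of each candidate that was selected.
usage = collect_usage(["travel", "medical", "travel", "travel"])
weights = update_weights(usage, domains=["medical", "travel", "legal"])
# Counts are medical=1, travel=3, legal=0, so the weights are 2/7, 4/7 and 1/7.
```

Weights of this kind could be fed back as per-domain priors when candidates are next combined, which is one plausible reading of training based on usage statistics.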
Specification