Speech recognition using domain knowledge
Abstract
In some implementations, data that indicates multiple candidate transcriptions for an utterance is received. For each of the candidate transcriptions, data relating to use of the candidate transcription as a search query is received, a score that is based on the received data is provided to a trained classifier, and a classifier output for the candidate transcription is received. One or more of the candidate transcriptions may be selected based on the classifier outputs.
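The flow summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `select_transcriptions`, `toy_query_score`, `toy_classifier`, and the `HIT_COUNTS` table are hypothetical names and data introduced only for this example, and the toy classifier is a fixed formula rather than a trained model.

```python
from typing import Callable, Dict, List


def select_transcriptions(
    candidates: List[str],
    query_score: Callable[[str], float],
    classifier: Callable[[float], float],
    top_k: int = 1,
) -> List[str]:
    """Return the top_k candidates ordered by classifier output (highest first)."""
    outputs: Dict[str, float] = {}
    for text in candidates:
        # Score derived from data about using the candidate as a search query.
        score = query_score(text)
        # Classifier output for this candidate.
        outputs[text] = classifier(score)
    ranked = sorted(candidates, key=lambda t: outputs[t], reverse=True)
    return ranked[:top_k]


# Hypothetical stand-ins: a canned result count per query and a fixed squashing
# function in place of a trained classifier.
HIT_COUNTS = {"recognize speech": 120, "wreck a nice beach": 3}


def toy_query_score(text: str) -> float:
    return float(HIT_COUNTS.get(text, 0))


def toy_classifier(score: float) -> float:
    return score / (score + 10.0)


best = select_transcriptions(
    ["wreck a nice beach", "recognize speech"], toy_query_score, toy_classifier
)
print(best)
```

Passing `top_k` greater than 1 reflects the abstract's statement that one or more candidate transcriptions may be selected.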
21 Claims
1. A method of performing speech recognition that is performed by one or more computers of an automated speech recognizer, the method comprising:
receiving, by the one or more computers, data that indicates multiple candidate transcriptions for an utterance, wherein the one or more computers are in communication with (i) a first search system that provides a search service of a first domain, and (ii) a second search system that provides a search service for a second domain, the second domain being different from the first domain;
for each particular candidate transcription of the candidate transcriptions:
receiving, by the one or more computers, data from the first search system that provides the search service for the first domain, the data from the first search system indicating first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a first score based on the first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
receiving, by the one or more computers, data from the second search system that provides the search service for the second domain, the data from the second search system indicating second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a second score based on the second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
providing, by the one or more computers, (i) the first score that is determined based on the first search results and (ii) the second score that is determined based on the second search results as input to a classifier, wherein the classifier has been trained, using scores that represent characteristics of different search results from different domains, to indicate a likelihood that a transcription is correct based on scores for multiple different domains; and
receiving, by the one or more computers and from the trained classifier, a classifier output in response to at least the first score and the second score, the classifier output indicating a likelihood that the particular candidate transcription is correct;
selecting, by the one or more computers, a transcription for the utterance, from among the multiple candidate transcriptions, based on the classifier outputs; and
providing, by the one or more computers, the transcription as output of the automated speech recognizer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 19, 20, 21
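As an illustration of the per-candidate loop recited in claim 1, the sketch below derives one score from each of two domains and passes both to a classifier. Everything here is an assumption made for the example: the `SearchResult` type and its `relevance` field, the max-relevance scoring rule in `domain_score`, the canned `WEB` and `MEDIA` result tables, and the hand-picked logistic weights, which stand in for a classifier that would in practice be trained.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class SearchResult:
    title: str
    relevance: float  # hypothetical per-result characteristic in [0, 1]


def domain_score(results: Sequence[SearchResult]) -> float:
    """Assumed scoring rule: highest relevance among the results, 0.0 if none."""
    return max((r.relevance for r in results), default=0.0)


class LogisticClassifier:
    """Stand-in for the trained classifier; the weights are illustrative only."""

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self.weights = list(weights)
        self.bias = bias

    def likelihood(self, scores: Sequence[float]) -> float:
        z = self.bias + sum(w * s for w, s in zip(self.weights, scores))
        return 1.0 / (1.0 + math.exp(-z))


def pick_transcription(
    candidates: List[str],
    first_domain: Dict[str, List[SearchResult]],
    second_domain: Dict[str, List[SearchResult]],
    classifier: LogisticClassifier,
) -> str:
    outputs: Dict[str, float] = {}
    for text in candidates:
        first = domain_score(first_domain.get(text, []))    # first score
        second = domain_score(second_domain.get(text, []))  # second score
        outputs[text] = classifier.likelihood([first, second])
    # Select the candidate with the highest classifier output.
    return max(candidates, key=lambda t: outputs[t])


# Canned results per candidate for two hypothetical domains.
WEB = {
    "play jazz": [SearchResult("Jazz radio", 0.9)],
    "play jas": [SearchResult("Jas surname", 0.2)],
}
MEDIA = {"play jazz": [SearchResult("Jazz playlist", 0.8)], "play jas": []}

clf = LogisticClassifier(weights=[2.0, 2.0], bias=-1.5)
chosen = pick_transcription(["play jas", "play jazz"], WEB, MEDIA, clf)
print(chosen)
```

Feeding both scores to a single classifier, rather than thresholding each domain separately, lets evidence from one domain compensate for weak evidence in the other.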
11. An automated speech recognizer system comprising:
one or more computers in communication with (i) a first search system that provides a search service for a first domain, and (ii) a second search system that provides a search service for a second domain, the second domain being different from the first domain; and
one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by the one or more computers, data that indicates multiple candidate transcriptions for an utterance;
for each particular candidate transcription of the candidate transcriptions:
providing, by the one or more computers, first query data to the first search system, wherein the first query data specifies the particular candidate transcription as a query to the search service for the first domain;
receiving, by the one or more computers, data from the first search system indicating first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a first score based on the first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
providing, by the one or more computers, second query data to the second search system, wherein the second query data specifies the particular candidate transcription as a query to the search service for the second domain;
receiving, by the one or more computers, data from the second search system indicating second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a second score based on the second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
providing, by the one or more computers, (i) the first score that is determined based on the first search results and (ii) the second score that is determined based on the second search results as input to a classifier, wherein the classifier has been trained, using scores that represent characteristics of different search results from different domains, to indicate a likelihood that a transcription is correct based on scores for multiple different domains; and
receiving, by the one or more computers and from the trained classifier, a classifier output in response to at least the first score and the second score, the classifier output indicating a likelihood that the particular candidate transcription is correct;
selecting, by the one or more computers, a transcription for the utterance, from among the multiple candidate transcriptions, based on the classifier outputs; and
providing, by the one or more computers, the transcription as output of the automated speech recognizer system.
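Claim 11 adds the step of providing the candidate transcription itself as the query to each search system. A minimal sketch of that exchange is shown below. The `SearchSystem` protocol, its `search` method, the `InMemorySearch` class, and the result-count scoring rule are all hypothetical and chosen only to keep the example self-contained; a real search service would be reached over a network interface that the claim does not specify.

```python
from typing import List, Protocol, Tuple


class SearchSystem(Protocol):
    """Hypothetical interface: takes query text, returns matching documents."""

    def search(self, query: str) -> List[str]: ...


class InMemorySearch:
    """Toy search service: a document matches if it contains every query term."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def search(self, query: str) -> List[str]:
        terms = set(query.lower().split())
        return [d for d in self.documents if terms <= set(d.lower().split())]


def scores_for_candidate(
    candidate: str, first: SearchSystem, second: SearchSystem
) -> Tuple[float, float]:
    # The candidate transcription is sent, unchanged, as the query to each system.
    first_results = first.search(candidate)
    second_results = second.search(candidate)
    # Assumed scoring rule: the number of results returned by each domain.
    return float(len(first_results)), float(len(second_results))


web = InMemorySearch(["weather in boston today", "boston weather radar"])
maps = InMemorySearch(["boston common park", "boston weather station"])

pair = scores_for_candidate("boston weather", web, maps)
print(pair)
```

The resulting pair of scores is what would then be provided as input to the classifier for that candidate.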
12. A non-transitory computer-readable storage device encoded with a computer program, the program comprising instructions that, when executed by one or more computers of an automated speech recognizer, cause the one or more computers to perform operations comprising:
receiving, by the one or more computers, data that indicates multiple candidate transcriptions for an utterance, wherein the one or more computers are in communication with (i) a first search system that provides a search service of a first domain, and (ii) a second search system that provides a search service for a second domain, the second domain being different from the first domain;
for each particular candidate transcription of the candidate transcriptions:
providing, by the one or more computers, first query data to the search service for the first domain, wherein the first query data specifies the particular candidate transcription as a query to the search service for the first domain;
receiving, by the one or more computers, data from the first search system that provides the search service for the first domain, the data from the first search system indicating first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a first score based on the first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
providing, by the one or more computers, second query data to the search service for the second domain, wherein the second query data specifies the particular candidate transcription as a query to the search service for the second domain;
receiving, by the one or more computers, data from the second search system that provides the search service for the second domain, the data from the second search system indicating second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
determining, by the one or more computers, a second score based on the second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
providing, by the one or more computers, (i) the first score that is determined based on the first search results and (ii) the second score that is determined based on the second search results as input to a classifier, wherein the classifier has been trained, using scores that represent characteristics of different search results from different domains, to indicate a likelihood that a transcription is correct based on scores for multiple different domains; and
receiving, by the one or more computers and from the trained classifier, a classifier output in response to at least the first score and the second score, the classifier output indicating a likelihood that the particular candidate transcription is correct;
selecting, by the one or more computers, a transcription for the utterance, from among the multiple candidate transcriptions, based on the classifier outputs; and
providing, by the one or more computers, the transcription as output of the automated speech recognizer.
18. A method of performing speech recognition that is performed by data processing apparatus of an automated speech recognizer, the method comprising:
receiving, by the data processing apparatus, data that indicates multiple candidate transcriptions for an utterance, wherein the data processing apparatus is in communication with (i) a first search system that provides a search service of a first domain, and (ii) a second search system that provides a search service for a second domain, the second domain being different from the first domain;
for each particular candidate transcription of the candidate transcriptions:
receiving, by the data processing apparatus, data from the first search system that provides the search service for the first domain, the data from the first search system indicating first search results that the search service for the first domain identifies as relevant to the particular candidate transcription;
determining, by the data processing apparatus, a first score based on the first search results that the search service for the first domain identifies as relevant to the particular candidate transcription, wherein the search results in the first set of search results are ranked, wherein determining the first score based on the first search results that the search service for the first domain identifies as relevant to the particular candidate transcription comprises:
determining, by the data processing apparatus and as the first score, a score that is indicative of one or more characteristics of the search result that occurs at a particular predetermined position in the ranking of the first search results;
receiving, by the data processing apparatus, data from the second search system that provides the search service for the second domain, the data from the second search system indicating second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
determining, by the data processing apparatus, a second score based on the second search results that the search service for the second domain identifies as relevant to the particular candidate transcription;
providing, by the data processing apparatus and to a trained classifier, (i) the first score that is determined based on the first search results and (ii) the second score that is determined based on the second search results; and
receiving, by the data processing apparatus and from the trained classifier, a classifier output in response to at least the first score and the second score, the classifier output indicating a likelihood that the candidate transcription is a correct transcription;
selecting, by the data processing apparatus, from among the multiple candidate transcriptions based on the classifier outputs; and
providing, by the data processing apparatus, the transcription as output of the automated speech recognizer.
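The distinguishing limitation of claim 18 is that the first score reflects a characteristic of the search result at a particular predetermined position in the ranking. One way this might look is sketched below; the `RankedResult` type, the choice of `relevance` as the characteristic, the zero-based `position` argument, and the `missing` fallback value are assumptions for illustration and are not taken from the claim.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RankedResult:
    url: str
    relevance: float  # hypothetical characteristic of a single search result


def score_at_position(
    results: Sequence[RankedResult], position: int = 0, missing: float = 0.0
) -> float:
    """Return the relevance of the result at a fixed rank position.

    `results` is assumed to be in rank order, best first, and `position` is
    zero-based. If fewer results were returned than the position requires,
    the assumed fallback value `missing` is used instead.
    """
    if 0 <= position < len(results):
        return results[position].relevance
    return missing


ranked = [RankedResult("https://example.com/a", 0.9),
          RankedResult("https://example.com/b", 0.4)]

print(score_at_position(ranked))              # characteristic of the top result
print(score_at_position(ranked, position=1))  # characteristic of the second result
```

Fixing the position in advance keeps the score comparable across candidate transcriptions, since every candidate is judged by the same slot in its own ranking.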
Specification