Multiple speech locale-specific hotword classifiers for selection of a speech locale
First Claim
1. A computer-implemented method comprising:
- receiving, by a mobile computing device that is configured to exit a low power mode upon detection of one of a set of predefined hotwords that are each associated with a respective language or dialect, audio data corresponding to a user speaking a particular, predefined hotword of the set;
in response to receiving the audio data corresponding to the user speaking the particular, predefined hotword, providing acoustic features of the audio data to multiple hotword classifiers, wherein each hotword classifier is (i) associated with a single language or single dialect of language and (ii) configured to classify acoustic features as either corresponding to, or not corresponding to, an utterance of a respective predefined term in the associated single language or single dialect of language without transcribing words corresponding to the acoustic features and without semantically interpreting the acoustic features; and
identifying a respective language or dialect associated with the particular, predefined hotword by determining one hotword classifier of the multiple hotword classifiers that classifies the particular, predefined hotword as corresponding to an utterance of a respective predefined term in the associated single language or single dialect of language of the hotword classifier; and
generating a transcription of subsequently received audio data by an automated speech recognizer that is configured for the identified respective language or dialect associated with the particular, predefined hotword.
2 Assignments
0 Petitions
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech in an utterance. The methods, systems, and apparatus include actions of receiving an utterance and obtaining acoustic features from the utterance. Further actions include providing the acoustic features from the utterance to multiple speech locale-specific hotword classifiers. Each speech locale-specific hotword classifier (i) may be associated with a respective speech locale, and (ii) may be configured to classify audio features as corresponding to, or as not corresponding to, a respective predefined term. Additional actions may include selecting a speech locale for use in transcribing the utterance based on one or more results from the multiple speech locale-specific hotword classifiers in response to providing the acoustic features from the utterance to the multiple speech locale-specific hotword classifiers. Further actions may include selecting parameters for automated speech recognition based on the selected speech locale.
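The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the abstract does not say how classifier results are compared, so the confidence scores, the per-classifier threshold, the highest-score tie-breaking rule, the toy scoring functions, and every name below (`HotwordClassifier`, `select_locale`, `ASR_PARAMS`, the locale codes, and the parameter values) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Features = Sequence[float]  # stand-in for acoustic features of an utterance


@dataclass(frozen=True)
class HotwordClassifier:
    """Hypothetical locale-specific classifier for one predefined term."""
    locale: str                         # e.g. "en-US" (illustrative value)
    score: Callable[[Features], float]  # assumed confidence in [0, 1]
    threshold: float = 0.5              # assumed decision threshold

    def matches(self, features: Features) -> bool:
        # Binary decision: corresponds / does not correspond to the term.
        # No transcription or semantic interpretation happens here.
        return self.score(features) >= self.threshold


def select_locale(features: Features,
                  classifiers: Sequence[HotwordClassifier]) -> Optional[str]:
    """Return the locale of the best-scoring classifier that matches.

    Returns None when no classifier matches. Picking the highest score is
    an assumption; the abstract only says the selection is based on one or
    more classifier results.
    """
    best_locale, best_score = None, float("-inf")
    for clf in classifiers:
        s = clf.score(features)
        if s >= clf.threshold and s > best_score:
            best_locale, best_score = clf.locale, s
    return best_locale


# Hypothetical per-locale recognizer parameters.
ASR_PARAMS = {
    "en-US": {"language_model": "en-US-lm", "acoustic_model": "en-US-am"},
    "es-ES": {"language_model": "es-ES-lm", "acoustic_model": "es-ES-am"},
}

# Toy scorers: each simply reads one feature value as its confidence.
classifiers = [
    HotwordClassifier(locale="en-US", score=lambda f: f[0]),
    HotwordClassifier(locale="es-ES", score=lambda f: f[1]),
]

features = [0.2, 0.9]  # made-up feature vector for the example
locale = select_locale(features, classifiers)
params = ASR_PARAMS[locale] if locale is not None else None
```

With the made-up features above, only the second toy classifier clears its threshold, so its locale is selected and the matching recognizer parameters are looked up. A real system would replace the toy scorers with trained models operating on actual acoustic features.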
118 Citations
20 Claims
1. A computer-implemented method comprising:
- receiving, by a mobile computing device that is configured to exit a low power mode upon detection of one of a set of predefined hotwords that are each associated with a respective language or dialect, audio data corresponding to a user speaking a particular, predefined hotword of the set;
- in response to receiving the audio data corresponding to the user speaking the particular, predefined hotword, providing acoustic features of the audio data to multiple hotword classifiers, wherein each hotword classifier is (i) associated with a single language or single dialect of language and (ii) configured to classify acoustic features as either corresponding to, or not corresponding to, an utterance of a respective predefined term in the associated single language or single dialect of language without transcribing words corresponding to the acoustic features and without semantically interpreting the acoustic features; and
- identifying a respective language or dialect associated with the particular, predefined hotword by determining one hotword classifier of the multiple hotword classifiers that classifies the particular, predefined hotword as corresponding to an utterance of a respective predefined term in the associated single language or single dialect of language of the hotword classifier; and
- generating a transcription of subsequently received audio data by an automated speech recognizer that is configured for the identified respective language or dialect associated with the particular, predefined hotword.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
- receiving, by a mobile computing device that is configured to exit a low power mode upon detection of one of a set of predefined hotwords that are each associated with a respective language or dialect, audio data corresponding to a user speaking a particular, predefined hotword of the set; and
- in response to receiving the audio data corresponding to the user speaking the particular, predefined hotword, providing acoustic features of the audio data to multiple hotword classifiers, wherein each hotword classifier is (i) associated with a single language or single dialect of language and (ii) configured to classify acoustic features as either corresponding to, or not corresponding to, an utterance of a respective predefined term in the associated single language or single dialect of language without transcribing words corresponding to the acoustic features and without semantically interpreting the acoustic features;
- identifying a respective language or dialect associated with the particular, predefined hotword by determining one hotword classifier of the multiple hotword classifiers that classifies the particular, predefined hotword as corresponding to an utterance of a respective predefined term in the associated single language or single dialect of language of the hotword classifier; and
- generating a transcription of subsequently received audio data by an automated speech recognizer that is configured for the identified respective language or dialect associated with the particular, predefined hotword.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable medium storing instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a mobile computing device that is configured to exit a low power mode upon detection of one of a set of predefined hotwords that are each associated with a respective language or dialect, audio data corresponding to a user speaking a particular, predefined hotword of the set;
- in response to receiving the audio data corresponding to the user speaking the particular, predefined hotword, providing acoustic features of the audio data to multiple hotword classifiers, wherein each hotword classifier is (i) associated with a single language or single dialect of language and (ii) configured to classify acoustic features as either corresponding to, or not corresponding to, an utterance of a respective predefined term in the associated single language or single dialect of language without transcribing words corresponding to the acoustic features and without semantically interpreting the acoustic features; and
- identifying a respective language or dialect associated with the particular, predefined hotword by determining one hotword classifier of the multiple hotword classifiers that classifies the particular, predefined hotword as corresponding to an utterance of a respective predefined term in the associated single language or single dialect of language of the hotword classifier; and
- generating a transcription of subsequently received audio data by an automated speech recognizer that is configured for the identified respective language or dialect associated with the particular, predefined hotword.
Dependent claims: 16, 17, 18, 19, 20
Specification