Recognizing accented speech
Abstract
Techniques (300, 400, 500) and apparatuses (100, 200, 700) for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.
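As a rough illustration of the last idea in the abstract (updating an accent library from a user's corrections), the following is a minimal sketch and not the patented implementation. The `AccentLibrary` class, its method names, and the phoneme symbols are all hypothetical and chosen only for this example; it assumes a library can be modeled as a mapping from words to known phoneme sequences.

```python
from collections import defaultdict


class AccentLibrary:
    """Hypothetical accent library: maps each word to known phoneme sequences."""

    def __init__(self):
        self._pronunciations = defaultdict(set)

    def add(self, word, phonemes):
        # Store one pronunciation (a sequence of phoneme symbols) for a word.
        self._pronunciations[word.lower()].add(tuple(phonemes))

    def lookup(self, phonemes):
        # Return the words that have exactly this pronunciation on record.
        key = tuple(phonemes)
        return sorted(
            word for word, prons in self._pronunciations.items() if key in prons
        )

    def learn_from_correction(self, heard_phonemes, corrected_word):
        # Assumption: a user correction means the heard phonemes were a
        # pronunciation of the corrected word, so record that pairing.
        self.add(corrected_word, heard_phonemes)


library = AccentLibrary()
library.add("water", ["W", "AO", "T", "ER"])

heard = ["W", "AO", "D", "AH"]      # accented pronunciation, not yet known
before = library.lookup(heard)      # nothing matches before the correction
library.learn_from_correction(heard, "water")
after = library.lookup(heard)       # the corrected word now matches
```

An exact-match lookup keeps the sketch short; a real recognizer would score candidate pronunciations rather than require identical phoneme sequences.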
18 Claims
1. A computer-implemented method comprising:
receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
determining, from among one or more different field types, a field type associated with the field;
determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
providing the transcription of the utterance in the field of the form.
Dependent claims: 2, 3, 4, 5, 6.
7. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
determining, from among one or more different field types, a field type associated with the field;
determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
providing the transcription of the utterance in the field of the form.
Dependent claims: 8, 9, 10, 11, 12.
13. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
determining, from among one or more different field types, a field type associated with the field;
determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
providing the transcription of the utterance in the field of the form.
Dependent claims: 14, 15, 16, 17, 18.
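The steps recited in the independent claims can be pictured with a short sketch. This is an illustration under stated assumptions, not the claimed implementation: the field types, the `"low"`/`"high"` accuracy levels, the `"short"`/`"long"` latency levels, the rule that maps them to accent libraries and a correction level, and the `fake_recognizer` stand-in are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    accuracy: str      # acceptable accuracy level: "low" or "high" (assumed levels)
    max_latency: str   # acceptable latency level: "short" or "long" (assumed levels)


# Hypothetical table: what each field type indicates as acceptable.
FIELD_REQUIREMENTS = {
    "search": Requirement(accuracy="low", max_latency="short"),
    "email_address": Requirement(accuracy="high", max_latency="long"),
}
DEFAULT_REQUIREMENT = Requirement(accuracy="low", max_latency="short")


def select_recognition_settings(field_type, available_libraries):
    """Pick accent libraries and a correction level for a field type."""
    req = FIELD_REQUIREMENTS.get(field_type, DEFAULT_REQUIREMENT)
    # Assumed rule: a longer latency budget allows consulting every library;
    # a short one limits recognition to the first (primary) library.
    if req.max_latency == "long":
        libraries = list(available_libraries)
    else:
        libraries = list(available_libraries[:1])
    # Assumed rule: the correction level follows the acceptable accuracy.
    return libraries, req.accuracy


def fill_field(form, field_name, field_type, utterance, recognize, available_libraries):
    """Transcribe an utterance and place the transcription in the focused field."""
    libraries, correction = select_recognition_settings(field_type, available_libraries)
    form[field_name] = recognize(utterance, libraries, correction)
    return form[field_name]


def fake_recognizer(utterance, libraries, correction_level):
    # Stand-in for a speech recognition system; it only echoes its inputs.
    return f"{utterance}|{len(libraries)}|{correction_level}"


form = {}
result = fill_field(
    form, "to", "email_address", "hello", fake_recognizer, ["en-US", "en-IN"]
)
```

Passing the recognizer in as a parameter keeps the selection logic separate from whichever speech recognition system actually produces the transcription.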
Specification