Recognizing accented speech

US 9,734,819 B2
Filed: 02/21/2013
Issued: 08/15/2017
Est. Priority Date: 02/21/2013
Status: Active Grant

Abstract

Techniques (300, 400, 500) and apparatuses (100, 200, 700) for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.
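The third behavior in the abstract, updating an accent library from corrections, can be illustrated with a minimal sketch. It assumes an accent library is a mapping from a word to the phoneme sequences accepted as pronunciations of it. The class name `AccentLibrary`, its methods, and the phoneme symbols are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


class AccentLibrary:
    """Toy accent library: word -> set of phoneme sequences (an assumption)."""

    def __init__(self):
        self._pronunciations = defaultdict(set)

    def add(self, word, phonemes):
        """Register one pronunciation of a word."""
        self._pronunciations[word].add(tuple(phonemes))

    def lookup(self, phonemes):
        """Return the words whose known pronunciations match the sequence."""
        key = tuple(phonemes)
        return sorted(w for w, p in self._pronunciations.items() if key in p)

    def apply_correction(self, heard_phonemes, corrected_word):
        """Record that the heard phonemes were a pronunciation of the word
        the user corrected the transcription to."""
        self.add(corrected_word, heard_phonemes)
```

In this sketch, a pronunciation that initially matches nothing becomes recognizable once the user has corrected it a single time. A real system would presumably weigh repeated corrections rather than trust one.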

18 Claims

1. A computer-implemented method comprising:
- receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
  
  determining, from among one or more different field types, a field type associated with the field;
  
  determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
  
  selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
  
  obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
  
  providing the transcription of the utterance in the field of the form.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, comprising:
    - selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance.
  - 3. The method of claim 2, wherein selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance comprises:
    - identifying countries of addresses stored in an address book; and
      
      selecting the one or more accent libraries based on the countries and on a current location of the computing device.
  - 4. The method of claim 1, wherein the one or more accent libraries are associated with a default language of the computing device.
  - 5. The method of claim 1, wherein the speech recognition system generates the transcription of the utterance without accessing a linguistic library.
  - 6. The method of claim 1, wherein obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system comprises:
    - obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system using the one or more accent libraries.
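The selection step of claim 1 can be pictured with a minimal sketch. It assumes a lookup table from field type to the accuracy and latency levels indicated as acceptable, and it assumes that consulting more accent libraries costs more latency. The names `FIELD_POLICY`, `Level`, `Settings`, and `select_recognition_settings`, the field types, and the library identifiers are all hypothetical; the claims do not specify concrete values.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Level(IntEnum):
    """Predefined levels; the three-step scale is an assumption."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Settings:
    accent_libraries: Tuple[str, ...]
    correction_level: Level


# Hypothetical policy: field type -> (acceptable accuracy, acceptable latency).
FIELD_POLICY = {
    "email_address": (Level.HIGH, Level.HIGH),    # accuracy matters, delay tolerated
    "search_box": (Level.LOW, Level.LOW),         # fast and approximate is fine
    "message_body": (Level.MEDIUM, Level.MEDIUM),
}


def select_recognition_settings(field_type, candidate_libraries):
    """Pick accent libraries and a correction level for a field type."""
    accuracy, latency = FIELD_POLICY[field_type]
    # Assumption: each extra library adds latency, so the acceptable
    # latency level caps how many candidate libraries are used.
    libraries = tuple(candidate_libraries[: int(latency)])
    # Assumption: the correction level simply tracks the required accuracy.
    return Settings(libraries, accuracy)
```

For example, `select_recognition_settings("search_box", ["en-US", "en-IN"])` keeps only the first library and a low correction level, while an `"email_address"` field would use up to three libraries and the highest correction level.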

7. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
  
  determining, from among one or more different field types, a field type associated with the field;
  
  determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
  
  selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
  
  obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
  
  providing the transcription of the utterance in the field of the form.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The system of claim 7, wherein the operations further comprise:
    - selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance.
  - 9. The system of claim 8, wherein selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance comprises:
    - identifying countries of addresses stored in an address book; and
      
      selecting the one or more accent libraries based on the countries and on a current location of the computing device.
  - 10. The system of claim 7, wherein the one or more accent libraries are associated with a default language of the computing device.
  - 11. The system of claim 7, wherein the speech recognition system generates the transcription of the utterance without accessing a linguistic library.
  - 12. The system of claim 7, wherein obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system comprises:
    - obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system using the one or more accent libraries.
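The personal-data selection in claims 8 and 9 (countries of address-book entries combined with the device's current location) might look like the following sketch. The ranking rule, the `COUNTRY_TO_LIBRARIES` table, the `limit` parameter, and the function name are assumptions made for illustration; the claims only state that both inputs are used.

```python
from collections import Counter

# Hypothetical mapping from country code to accent-library identifiers.
COUNTRY_TO_LIBRARIES = {
    "US": ["en-US"],
    "IN": ["en-IN"],
    "GB": ["en-GB"],
}


def select_accent_libraries(address_book_countries, current_country, limit=2):
    """Rank libraries by how often their country appears in the address book,
    counting the device's current location as one extra occurrence."""
    counts = Counter(address_book_countries)
    counts[current_country] += 1
    selected = []
    for country, _ in counts.most_common():
        # Countries without a known library are skipped.
        for library in COUNTRY_TO_LIBRARIES.get(country, []):
            if library not in selected:
                selected.append(library)
    return selected[:limit]
```

With an address book dominated by Indian addresses and a device located in the US, the sketch returns the Indian English library first and the US English library second.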

13. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, from a user, an utterance that was spoken while focus is set on a field of a form;
  
  determining, from among one or more different field types, a field type associated with the field;
  
  determining, from among different, predefined levels of speech recognition accuracy and from among different, predefined levels of speech recognition latency, a predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type;
  
  selecting, based at least on the predefined level of speech recognition accuracy and a predefined level of speech recognition latency that are indicated as acceptable for the field type, (i) one or more accent libraries that each include phonemes for different pronunciations for words of a language and (ii) a level of correction for a speech recognition system to apply to a transcription;
  
  obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system; and
  
  providing the transcription of the utterance in the field of the form.
- Dependent claims (14, 15, 16, 17, 18):
  - 14. The medium of claim 13, comprising:
    - selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance.
  - 15. The medium of claim 14, wherein selecting the one or more accent libraries based on personal data that is associated with the user and is stored on a computing device that receives the utterance comprises:
    - identifying countries of addresses stored in an address book; and
      
      selecting the one or more accent libraries based on the countries and on a current location of the computing device.
  - 16. The medium of claim 13, wherein the one or more accent libraries are associated with a default language of the computing device.
  - 17. The medium of claim 13, wherein the speech recognition system generates the transcription of the utterance without accessing a linguistic library.
  - 18. The medium of claim 13, wherein obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system comprises:
    - obtaining, from the speech recognition system, the transcription of the utterance that is generated by the speech recognition system using the one or more accent libraries.

Current Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Original Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Inventors: Gray, Kristin A.
Primary Examiner(s): Ky, Kevin
Application Number: US13/772,373
Publication Number: US 20140236595A1
Time in Patent Office: 1,636 Days
Field of Search: None
US Class Current
CPC Class Codes
G06F 40/174   Form filling; Merging
G10L 15/00   Speech recognition G10L17/0...
G10L 15/005   Language recognition
G10L 15/01   Assessment or evaluation of...
G10L 15/063   Training
G10L 15/1807   using prosody or stress
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 2015/0635   updating or merging of old ...
G10L 2015/227   of the speaker; Human-fact...
