Recognizing accented speech

US 10,242,661 B2
Filed: 03/21/2017
Issued: 03/26/2019
Est. Priority Date: 02/21/2013
Status: Active Grant

Abstract

Techniques (300, 400, 500) and apparatuses (100, 200, 700) for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.

20 Claims

1. A computer-implemented method comprising:
- receiving, at a client device and from a server-based automated speech recognizer that has access to multiple accent libraries, a transcription of an utterance received at the client device;
  
  providing, for output on a display of the client device, the transcription of the utterance received at the client device;
  
  receiving, at the client device and from a user, an additional transcription and data indicating that the additional transcription is a correction to the transcription of the utterance that was incorrectly recognized by the server-based automated speech recognizer that has access to the multiple accent libraries;
  
  in response to receiving the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance, accessing data stored on the client device;
  
  based on the data stored on the client device, identifying, by the client device, an accent library of the multiple accent libraries to be updated using the additional transcription;
  
  transmitting, by the client device and to the server-based automated speech recognizer, a request to update the accent library of the multiple accent libraries based at least on the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance;
  
  receiving, by the client device, a respective update for the accent library of the multiple accent libraries; and
  
  obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer that has access to the multiple accent libraries including the updated accent library.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, comprising:
    - based on the data stored on the client device, determining demographic data of the user of the client device, wherein identifying the accent library of the multiple accent libraries to be updated using the additional transcription is based on the demographic data of the user of the client device.
  - 3. The method of claim 2, wherein the demographic data of the user of the client device includes an age range, gender, native language, and a geographic location where the user is located.
  - 4. The method of claim 2, comprising:
    - obtaining an additional transcription of an additional subsequently received utterance by the server-based automated speech recognizer using the multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance not corresponding to the demographic data of the user.
  - 5. The method of claim 2, wherein obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries comprises:
    - obtaining the transcription of the subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance corresponding to the demographic data of the user.
  - 6. The method of claim 2, wherein the demographic data of the user is based on countries of addresses stored in an address book of the client device.
  - 7. The method of claim 1, wherein:
    - the subsequently received utterance is received from a different user from the user.
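
The client-side steps of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: `StubRecognizer`, `identify_library`, `submit_correction`, the `native_language` key, and the idea that an accent library is a simple lookup table of corrections are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class StubRecognizer:
    """Hypothetical stand-in for the server-based recognizer."""
    # accent library id -> {incorrect transcription: corrected transcription}
    libraries: dict = field(default_factory=dict)

    def transcribe(self, heard: str, library_id: str) -> str:
        # Apply any correction stored in the chosen accent library.
        return self.libraries[library_id].get(heard, heard)

    def update_library(self, library_id: str, incorrect: str, correction: str) -> dict:
        # Record the correction and return a description of the update.
        self.libraries[library_id][incorrect] = correction
        return {"library_id": library_id, "entries": len(self.libraries[library_id])}


def identify_library(device_data: dict, library_ids) -> str:
    """Pick an accent library from data stored on the client device (assumed rule)."""
    language = device_data.get("native_language")
    return language if language in library_ids else "default"


def submit_correction(recognizer, device_data, transcription, correction):
    """Handle a user correction: identify the library, then request its update."""
    library_id = identify_library(device_data, recognizer.libraries.keys())
    update = recognizer.update_library(library_id, transcription, correction)
    return library_id, update


recognizer = StubRecognizer({"default": {}, "es": {}})
device_data = {"native_language": "es"}  # hypothetical on-device data

shown = recognizer.transcribe("call hose", "es")  # transcription shown on the display
library_id, update = submit_correction(recognizer, device_data, shown, "call Jose")
later = recognizer.transcribe("call hose", library_id)  # subsequently received utterance
```

In this sketch only the library chosen from the device data is changed, so the same input transcribed against the `default` library is left uncorrected.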

8. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, at a client device and from a server-based automated speech recognizer that has access to multiple accent libraries, a transcription of an utterance received at the client device;
  
  providing, for output on a display of the client device, the transcription of the utterance received at the client device;
  
  receiving, at the client device and from a user, an additional transcription and data indicating that the additional transcription is a correction to the transcription of the utterance that was incorrectly recognized by the server-based automated speech recognizer that has access to the multiple accent libraries;
  
  in response to receiving the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance, accessing data stored on the client device;
  
  based on the data stored on the client device, identifying, by the client device, an accent library of the multiple accent libraries to be updated using the additional transcription;
  
  transmitting, by the client device and to the server-based automated speech recognizer, a request to update the accent library of the multiple accent libraries based at least on the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance;
  
  receiving, by the client device, a respective update for the accent library of the multiple accent libraries; and
  
  obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer that has access to the multiple accent libraries including the updated accent library.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The system of claim 8, comprising:
    - based on the data stored on the client device, determining demographic data of the user of the client device, wherein identifying the accent library of the multiple accent libraries to be updated using the additional transcription is based on the demographic data of the user of the client device.
  - 10. The system of claim 9, wherein the demographic data of the user of the client device includes an age range, gender, native language, and a geographic location where the user is located.
  - 11. The system of claim 9, wherein the operations further comprise:
    - obtaining an additional transcription of an additional subsequently received utterance by the server-based automated speech recognizer using the multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance not corresponding to the demographic data of the user.
  - 12. The system of claim 9, wherein obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries comprises:
    - obtaining the transcription of the subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance corresponding to the demographic data of the user.
  - 13. The system of claim 9, wherein the demographic data of the user is based on countries of addresses stored in an address book of the client device.
  - 14. The system of claim 8, wherein:
    - the subsequently received utterance is received from a different user from the user.
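
Claims 6 and 13 base the user's demographic data on the countries of addresses stored in the device's address book. The claims do not say how those countries are combined, so the sketch below assumes a simple majority vote; the `ACCENT_LIBRARY_BY_COUNTRY` table, the library names, and the `country` field are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical mapping from country code to accent library name.
ACCENT_LIBRARY_BY_COUNTRY = {"IN": "accent-in", "MX": "accent-mx"}


def dominant_country(address_book):
    """Return the most common country among address-book entries, or None."""
    countries = [entry["country"] for entry in address_book if entry.get("country")]
    if not countries:
        return None
    return Counter(countries).most_common(1)[0][0]


def library_for_address_book(address_book, default="accent-default"):
    """Map the dominant country to an accent library, falling back to a default."""
    return ACCENT_LIBRARY_BY_COUNTRY.get(dominant_country(address_book), default)


contacts = [
    {"name": "A", "country": "MX"},
    {"name": "B", "country": "MX"},
    {"name": "C", "country": "US"},
    {"name": "D"},  # entry without an address is ignored
]
chosen = library_for_address_book(contacts)
```

A majority vote is only one possible choice; a weighted combination with the other demographic signals listed in claims 3 and 10 would fit the claim language equally well.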

15. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, at a client device and from a server-based automated speech recognizer that has access to multiple accent libraries, a transcription of an utterance received at the client device;
  
  providing, for output on a display of the client device, the transcription of the utterance received at the client device;
  
  receiving, at the client device and from a user, an additional transcription and data indicating that the additional transcription is a correction to the transcription of the utterance that was incorrectly recognized by the server-based automated speech recognizer that has access to the multiple accent libraries;
  
  in response to receiving the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance, accessing data stored on the client device;
  
  based on the data stored on the client device, identifying, by the client device, an accent library of the multiple accent libraries to be updated using the additional transcription;
  
  transmitting, by the client device and to the server-based automated speech recognizer, a request to update the accent library of the multiple accent libraries based at least on the additional transcription and the data indicating that the additional transcription is a correction to the transcription of the utterance;
  
  receiving, by the client device, a respective update for the accent library of the multiple accent libraries; and
  
  obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer that has access to the multiple accent libraries including the updated accent library.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The medium of claim 15, comprising:
    - based on the data stored on the client device, determining demographic data of the user of the client device, wherein identifying the accent library of the multiple accent libraries to be updated using the additional transcription is based on the demographic data of the user of the client device.
  - 17. The medium of claim 16, wherein the demographic data of the user of the client device includes an age range, gender, native language, and a geographic location where the user is located.
  - 18. The medium of claim 16, wherein the operations further comprise:
    - obtaining an additional transcription of an additional subsequently received utterance by the server-based automated speech recognizer using the multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance not corresponding to the demographic data of the user.
  - 19. The medium of claim 16, wherein obtaining a transcription of a subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries comprises:
    - obtaining the transcription of the subsequently received utterance by the server-based automated speech recognizer using the updated multiple accent libraries based on demographic data of an additional user of the computing device who provided the additional subsequently received utterance corresponding to the demographic data of the user.
  - 20. The medium of claim 15, wherein:
    - the subsequently received utterance is received from a different user from the user.
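
Claims 18 and 19 (like claims 4, 5, 11, and 12) distinguish between an additional speaker whose demographic data corresponds to the user's and one whose data does not. One possible reading, sketched below, is that the updated library is included only when the demographics correspond. The `Demographics` fields follow claims 3, 10, and 17, but treating "corresponding" as exact equality, and every function and library name, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demographics:
    age_range: str
    gender: str
    native_language: str
    location: str


def corresponds(speaker: Demographics, user: Demographics) -> bool:
    """Assumed rule: demographics correspond only when every field matches."""
    return speaker == user


def libraries_for_speaker(speaker, user, baseline_libraries, updated_library):
    """Choose which accent libraries the recognizer should use for a speaker."""
    if corresponds(speaker, user):
        # Matching speaker: include the library updated from the user's corrections.
        return list(baseline_libraries) + [updated_library]
    # Non-matching speaker: use the libraries without the user-driven update.
    return list(baseline_libraries)


user = Demographics("25-34", "female", "es", "US")
same_profile = Demographics("25-34", "female", "es", "US")
other_profile = Demographics("45-54", "male", "en", "US")

baseline = ["accent-general"]
for_match = libraries_for_speaker(same_profile, user, baseline, "accent-es-updated")
for_other = libraries_for_speaker(other_profile, user, baseline, "accent-es-updated")
```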

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Inventors: Gray, Kristin A.
Primary Examiner(s): Ky, Kevin
Application Number: US15/464,668
Publication Number: US 20170193989A1
Time in Patent Office: 735 Days
Field of Search: None
CPC Class Codes

G06F 40/174   Form filling; Merging

G10L 15/00   Speech recognition G10L17/0...

G10L 15/005   Language recognition

G10L 15/01   Assessment or evaluation of...

G10L 15/063   Training

G10L 15/1807   using prosody or stress

G10L 15/187   Phonemic context, e.g. pron...

G10L 15/22   Procedures used during a sp...

G10L 15/30   Distributed recognition, e....

G10L 2015/0635   updating or merging of old ...

G10L 2015/227   of the speaker; Human-fact...
