Corrective feedback loop for automated speech recognition

US 8,793,122 B2
Filed: 09/15/2012
Issued: 07/29/2014
Est. Priority Date: 03/19/2008
Status: Active Grant


Abstract

Audio data that includes speech may be transcribed using a language model. The transcription may be provided to a user. The user may provide feedback on the transcription, and the language model may be updated based at least in part on the feedback. The feedback may include, for example, an affirmation of the transcription; a disapproval of the transcription; a correction to the transcription; a selection of an alternate transcription result; or any other kind of response.
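The feedback kinds listed in the abstract can be illustrated with a small sketch. Everything named here (`FeedbackKind`, `Feedback`, `ToyLanguageModel`) is hypothetical and not taken from the patent. The unigram count table is only a stand-in for a language model, and the update rule (add counts on affirmation or correction, subtract on disapproval) is an assumption chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FeedbackKind(Enum):
    AFFIRMATION = "affirmation"
    DISAPPROVAL = "disapproval"
    CORRECTION = "correction"
    ALTERNATE_SELECTION = "alternate_selection"


@dataclass
class Feedback:
    kind: FeedbackKind
    text: Optional[str] = None  # corrected or selected text, when applicable


class ToyLanguageModel:
    """Hypothetical stand-in: a unigram count table, not a real ASR language model."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def update(self, transcription: str, feedback: Feedback) -> None:
        if feedback.kind is FeedbackKind.AFFIRMATION:
            # The user confirmed the transcription: reinforce its words.
            self.counts.update(transcription.split())
        elif feedback.kind is FeedbackKind.DISAPPROVAL:
            # The user rejected the transcription: penalize its words.
            self.counts.subtract(transcription.split())
        elif feedback.text is not None:
            # Correction or alternate selection: reinforce the preferred text.
            self.counts.update(feedback.text.split())


model = ToyLanguageModel()
model.update("wreck a nice beach", Feedback(FeedbackKind.CORRECTION, "recognize speech"))
model.update("recognize speech", Feedback(FeedbackKind.AFFIRMATION))
print(model.counts["speech"])  # 2
```

A production system would presumably retrain or adapt n-gram or neural model weights rather than adjust raw counts. The sketch only shows how each feedback kind could map to a different update.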

19 Claims

1. A system comprising:
- an electronic data store configured to store:
  - one or more algorithms that, when executed, implement an automatic speech recognition engine; and
  - an initial language model; and
- a computing device in communication with the electronic data store, the computing device configured to:
  - obtain audio data comprising speech;
  - generate a transcription of the speech with the initial language model;
  - generate an identifier associated with at least one of the audio data or the transcription;
  - transmit the transcription to a client device for presentation to a user;
  - transmit the identifier to the client device with the transcription;
  - receive feedback on the transcription from the client device;
  - receive the identifier from the client device with the feedback on the transcription; and
  - based at least in part on the feedback, generate an updated language model.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The system of claim 1, wherein the feedback comprises at least one of an affirmation of the transcription, a disapproval of the transcription, or a correction to the transcription.
  - 3. The system of claim 1, wherein the computing device is further configured to generate one or more alternate transcriptions of the speech with the initial language model.
  - 4. The system of claim 3, wherein the computing device is further configured to transmit the one or more alternate transcriptions to the client device.
  - 5. The system of claim 4, wherein the feedback comprises a selection of an alternate transcription.
  - 6. The system of claim 4, wherein the one or more alternate transcriptions each have a transcription confidence value that satisfies a threshold.
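The identifier round trip of claim 1 and the thresholded alternates of claims 3 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `Hypothesis`, `FeedbackServer`, the message dictionary layout, the use of `uuid4` for identifiers, and reading "satisfies a threshold" as "greater than or equal to" are all hypothetical choices.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hypothesis:
    text: str
    confidence: float


def select_alternates(
    hyps: List[Hypothesis], threshold: float
) -> Tuple[Hypothesis, List[Hypothesis]]:
    """Return the top hypothesis and the alternates whose confidence meets the threshold."""
    ranked = sorted(hyps, key=lambda h: h.confidence, reverse=True)
    best, rest = ranked[0], ranked[1:]
    return best, [h for h in rest if h.confidence >= threshold]  # ">=" is an assumption


class FeedbackServer:
    """Hypothetical server side: remembers which transcription each identifier refers to."""

    def __init__(self) -> None:
        self.pending: Dict[str, str] = {}
        self.training_pairs: List[Tuple[str, str]] = []

    def send_transcription(self, hyps: List[Hypothesis], threshold: float = 0.5) -> dict:
        best, alternates = select_alternates(hyps, threshold)
        identifier = uuid.uuid4().hex
        self.pending[identifier] = best.text
        # The identifier travels to the client together with the transcription.
        return {
            "id": identifier,
            "transcription": best.text,
            "alternates": [h.text for h in alternates],
        }

    def receive_feedback(self, identifier: str, selected_text: str) -> None:
        # The returned identifier ties the feedback to the original transcription.
        original = self.pending.pop(identifier)  # raises KeyError for an unknown identifier
        self.training_pairs.append((original, selected_text))


server = FeedbackServer()
msg = server.send_transcription(
    [Hypothesis("call mom", 0.8), Hypothesis("call tom", 0.6), Hypothesis("cold mom", 0.1)]
)
server.receive_feedback(msg["id"], msg["alternates"][0])
```

The collected `(original, selected)` pairs would then feed whatever procedure generates the updated language model. That step is left out here because the claims do not fix a particular training method.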

7. A non-transitory computer-readable medium having stored thereon a computer-executable component configured to execute in one or more processors of a computing device, the computer-executable component being further configured to:
- receive first audio data comprising first speech;
- transcribe the first speech with a first language model to generate a first transcription;
- generate an identifier associated with at least one of the audio data or the first transcription;
- provide the first transcription to a first client device;
- provide the identifier to the first client device with the first transcription;
- receive feedback on the first transcription from the first client device;
- receive the identifier from the first client device with the feedback on the first transcription; and
- based at least in part on the feedback on the first transcription, update the first language model.
- Dependent claims (8, 9, 10, 11, 12, 13):
  - 8. The non-transitory computer-readable medium of claim 7, wherein the computer-executable component is further configured to:
    - select a second language model; and
    - based at least in part on the feedback on the transcription, update the second language model.
  - 9. The non-transitory computer-readable medium of claim 7, wherein the second language model is not used to generate the first transcription.
  - 10. The non-transitory computer-readable medium of claim 7, wherein:
    - the first audio data comprising speech is associated with a user of the first client device; and
    - the first language model is associated with the user of the first client device.
  - 11. The non-transitory computer-readable medium of claim 7, wherein the computer-executable component is further configured to:
    - receive second audio data comprising second speech; and
    - transcribe the second speech with the updated first language model to generate a second transcription.
  - 12. The non-transitory computer-readable medium of claim 7, wherein the first audio data comprising first speech is received from the first client device.
  - 13. The non-transitory computer-readable medium of claim 7, wherein the first audio data comprising first speech is received from a second client device.
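Claims 8, 10 and 11 describe a user-associated model, a second model that also receives the feedback, and a later transcription made with the updated model. A toy sketch of that sequence follows. All names (`ToyModel`, `transcribe`, `user_models`, `shared_model`) are hypothetical, and "audio" is replaced by a list of candidate strings, since a real acoustic front end is out of scope here.

```python
from collections import Counter
from typing import Dict, List


class ToyModel:
    """Hypothetical unigram scorer standing in for a language model."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def score(self, text: str) -> int:
        return sum(self.counts[word] for word in text.split())

    def update(self, text: str) -> None:
        self.counts.update(text.split())


def transcribe(candidates: List[str], model: ToyModel) -> str:
    # max() returns the first maximal item, so ties go to the earliest candidate.
    return max(candidates, key=model.score)


user_models: Dict[str, ToyModel] = {"alice": ToyModel()}  # model associated with a user
shared_model = ToyModel()  # a second model, not used for the first transcription

# First speech: the untrained model cannot tell the candidates apart.
first = transcribe(["call tom", "call mom"], user_models["alice"])

# The user's correction updates both the first and the second model.
correction = "call mom"
user_models["alice"].update(correction)
shared_model.update(correction)

# Second speech: the updated first model now prefers the corrected word.
second = transcribe(["text tom", "text mom"], user_models["alice"])
print(first, "->", second)  # call tom -> text mom
```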

14. A computer-implemented method comprising:
- under control of one or more computing devices configured with specific computer-executable instructions,
  - receiving audio data comprising speech;
  - generating speech recognition results from the speech using a first language model;
  - generating an identifier associated with at least one of the audio data or the speech recognition results;
  - providing the speech recognition results to a first client device;
  - providing the identifier to the first client device with the speech recognition results;
  - receiving feedback on the speech recognition results from the first client device;
  - receiving the identifier from the first client device with the feedback on the speech recognition results; and
  - updating the first language model based at least in part on the feedback.
- Dependent claims (15, 16, 17, 18, 19):
  - 15. The computer-implemented method of claim 14, wherein the audio data is received from the first client device.
  - 16. The computer-implemented method of claim 15 further comprising receiving an identifier of an application from the first client device, and wherein the first language model is associated with the application.
  - 17. The computer-implemented method of claim 14, wherein the audio data is received from a second client device.
  - 18. The computer-implemented method of claim 14, wherein the speech recognition results comprise a transcription of the speech.
  - 19. The computer-implemented method of claim 18, wherein the feedback relates to at least one of a letter of the transcription, a syllable of the transcription, a word of the transcription, a phrase of the transcription, or a sentence of the transcription.
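Claim 19 allows feedback at several granularities, from a letter up to a sentence. As one possible illustration, word-level feedback could be represented as an index plus a replacement. The `WordFeedback` type and `apply_word_feedback` helper below are hypothetical and only cover the word case, with words assumed to be separated by whitespace.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordFeedback:
    word_index: int   # zero-based position of the word in the transcription
    replacement: str  # the word the user says was actually spoken


def apply_word_feedback(transcription: str, items: List[WordFeedback]) -> str:
    """Return the transcription with each flagged word replaced."""
    words = transcription.split()
    for item in items:
        words[item.word_index] = item.replacement
    return " ".join(words)


corrected = apply_word_feedback("meet me at ate", [WordFeedback(3, "eight")])
print(corrected)  # meet me at eight
```

Letter-, syllable-, phrase- or sentence-level feedback would need a different addressing scheme, for example character offsets or word ranges. The claims leave that choice open.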

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Canyon IP Holdings LLC (Intellectual Ventures LLC)
Inventors: White, Marc; Jablokov, Igor Roditis; Jablokov, Victor Roditis
Primary Examiner(s): Abebe, Daniel D
Application Number: US13/621,189
Publication Number: US 20130024195A1
Time in Patent Office: 682 Days
Field of Search: 704/9, 704/231, 704/255, 704/270, 369/25.01
US Class Current: 704/9
CPC Class Codes:
- G06F 3/0236: using selection techniques ...
- G10L 15/183: using context dependencies,...
- G10L 15/19: Grammatical context, e.g. d...
- G10L 15/22: Procedures used during a sp...
- G10L 15/26: Speech to text systems G10L...
- G10L 15/30: Distributed recognition, e....
- G10L 2015/0631: Creating reference template...
