Corrective feedback loop for automated speech recognition

US 9,940,931 B2
Filed: 07/01/2016
Issued: 04/10/2018
Est. Priority Date: 04/05/2007
Status: Active Grant

Abstract

A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
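
The client-side bookkeeping described in the abstract can be illustrated with a minimal sketch. It assumes an in-memory store and a JSON payload; the names `AffirmedResultStore`, `affirm`, `build_feedback_payload`, and the `"id"`/`"result"` keys are hypothetical and do not come from the patent, and the network calls to the first and second remote servers are deliberately left out.

```python
import json
from dataclasses import dataclass, field


@dataclass
class AffirmedResultStore:
    """Hypothetical client-side store: audio message identifier -> affirmed result."""
    _results: dict = field(default_factory=dict)

    def affirm(self, message_id: str, result: str) -> None:
        # Store the transcribed result only after the user has affirmed it,
        # keyed by the identifier of the audio message it came from.
        self._results[message_id] = result

    def build_feedback_payload(self, message_id: str) -> str:
        # Serialize the stored result together with its identifier. The JSON
        # shape is an assumption; the abstract does not specify a format.
        record = {"id": message_id, "result": self._results[message_id]}
        return json.dumps(record, sort_keys=True)


store = AffirmedResultStore()
store.affirm("msg-001", "meet me at noon")
payload = store.build_feedback_payload("msg-001")
```

In this sketch, `payload` is what a client would hand to whatever transport it uses to reach the second remote server.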

16 Claims

1. A computer-implemented method comprising:
- under control of a computing device configured with specific computer-executable instructions, generating audio data comprising speech;
  
  transmitting the audio data to a remote computing system including a speech recognition engine;
  
  receiving, from the remote computing system, a plurality of transcription results for a portion of a transcription of the speech, wherein the transcription has been generated from the audio data by the speech recognition engine;
  
  receiving, from the remote computing system, a confidence level for each transcription result of the plurality of transcription results, wherein the confidence level for each transcription result has been generated by the speech recognition engine, and wherein the confidence level for each transcription result of the plurality of transcription results represents a confidence in an accuracy of the transcription result;
  
  determining a ranked order for the plurality of transcription results from the confidence levels of the plurality of transcription results;
  
  presenting the plurality of transcription results for the portion of the transcription in the ranked order, with each transcription result of the plurality of transcription results presented with the confidence level for the transcription result; and
  
  receiving a selection, from the plurality of transcription results, of a first transcription result for the portion of the transcription.
- Dependent claims (2, 3)
  - 2. The computer-implemented method of claim 1, further comprising storing the transcription with the first transcription result for the portion of the transcription.
  - 3. The computer-implemented method of claim 1, wherein at least two transcription results of the plurality of transcription results satisfy a threshold confidence level, and further comprising:
    - determining which transcription results of the plurality of transcription results for the portion have a confidence level satisfying a threshold confidence level, and
    - wherein presenting the plurality of transcription results for the portion of the transcription comprises presenting the at least two transcription results, in the ranked order, that have a confidence level satisfying the threshold confidence level, with each transcription result of the at least two transcription results presented with the confidence level for the transcription result.
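
The ranking, thresholding, and presentation steps recited in claims 1 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: confidence levels are assumed to be floats in [0.0, 1.0], "satisfying" the threshold is assumed to mean greater than or equal to it, and the names `TranscriptionResult`, `rank_results`, and `present` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionResult:
    text: str
    confidence: float  # assumed range [0.0, 1.0], as reported by the engine


def rank_results(results, threshold=None):
    """Return results ordered from highest to lowest confidence.

    If a threshold is given, keep only results whose confidence is at or
    above it (the >= comparison is an assumption).
    """
    if threshold is not None:
        results = [r for r in results if r.confidence >= threshold]
    return sorted(results, key=lambda r: r.confidence, reverse=True)


def present(results):
    """Format each result with its rank and its confidence level."""
    return [f"{i}. {r.text} ({r.confidence:.0%})"
            for i, r in enumerate(results, start=1)]


candidates = [
    TranscriptionResult("wreck a nice beach", 0.31),
    TranscriptionResult("recognize speech", 0.62),
    TranscriptionResult("recognise peach", 0.07),
]
ranked = rank_results(candidates, threshold=0.25)
lines = present(ranked)
selected = ranked[1]  # simulates the user picking the second entry
```

Here the lowest-confidence candidate falls below the assumed threshold of 0.25 and is not presented; the remaining two are listed in ranked order with their confidence levels, and the user's pick is recorded as `selected`.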

4. A computer-implemented method comprising:
- generating, at a computing device, audio data comprising speech;
  
  transmitting, by the computing device, the audio data to a remote computing system, wherein the remote computing system includes a speech recognition engine;
  
  receiving, at the computing device, a transcription of the speech from the remote computing system, wherein the transcription has been generated from the audio data by the speech recognition engine of the remote computing system, and wherein the transcription includes a portion having a first transcription result and a second transcription result;
  
  receiving, at the computing device, a confidence level for the first transcription result and a confidence level for the second transcription result, wherein the confidence level for the first transcription result and the confidence level for the second transcription result have been generated by the speech recognition engine, and wherein the confidence level for the first transcription result is greater than the confidence level for the second transcription result;
  
  presenting, at the computing device, the transcription with the first transcription result for the portion;
  
  presenting, at the computing device, the second transcription result for the portion for selection as an alternative to the first transcription result for the portion;
  
  presenting, at the computing device, at least the confidence level for the second transcription result; and
  
  receiving, at the computing device, selection of the second transcription result as the alternative to the first transcription result for the portion.
- Dependent claims (5, 6, 7)
  - 5. The computer-implemented method of claim 4, further comprising replacing, in the transcription, the first transcription result for the portion with the second transcription result for the portion.
  - 6. The computer-implemented method of claim 4, further comprising transmitting the second transcription result for the portion to the remote computing system for replacing, in the transcription, the first transcription result for the portion with the second transcription result for the portion.
  - 7. The computer-implemented method of claim 4, wherein presenting, at the computing device, the confidence level for at least the second transcription result comprises presenting the confidence level for the first transcription result and the confidence level for the second transcription result.
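
The alternative-selection flow of claims 4 and 5 can be illustrated with a short sketch: the transcription is shown with the higher-confidence result, the lower-confidence result is offered with its confidence level, and selecting it replaces the first result in the displayed text. The `Portion` class, its method names, and the `render` helper are hypothetical, and the example confidence values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    """One portion of a transcription with two candidate results."""
    first: tuple   # (text, confidence) of the higher-confidence result
    second: tuple  # (text, confidence) of the lower-confidence alternative
    chosen: str = ""

    def __post_init__(self):
        # By default the transcription shows the higher-confidence result.
        self.chosen = self.chosen or self.first[0]

    def alternative_label(self) -> str:
        # The alternative is offered together with its confidence level.
        text, confidence = self.second
        return f"{text} ({confidence:.0%})"

    def select_alternative(self) -> None:
        # Replace the first result with the second in the transcription.
        self.chosen = self.second[0]


def render(parts) -> str:
    """Join plain strings and Portion objects into the displayed text."""
    return " ".join(p.chosen if isinstance(p, Portion) else p for p in parts)


portion = Portion(first=("their", 0.58), second=("there", 0.40))
transcription = ["put it over", portion]
before = render(transcription)
label = portion.alternative_label()
portion.select_alternative()
after = render(transcription)
```

Transmitting the selected alternative back to the remote computing system, as in claim 6, is outside the scope of this sketch.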

8. A device comprising:
- a microphone configured to capture speech;
  
  a memory configured to store audio data corresponding to the speech; and
  
  a processor in communication with the microphone and the memory, the processor configured to execute specific computer-executable instructions to at least:
  
  provide the audio data to a speech recognition system;
  
  receive a transcription of the speech from the speech recognition system, wherein the transcription has been generated from the audio data by the speech recognition system, and wherein the transcription includes a portion having a first transcription result and a second transcription result;
  
  receive a confidence level for the first transcription result and a confidence level for the second transcription result, wherein the confidence level for the first transcription result and the confidence level for the second transcription result have been generated by the speech recognition engine, and wherein the confidence level for the first transcription result is greater than the confidence level for the second transcription result;
  
  present the transcription with the first transcription result for the portion;
  
  present the second transcription result for the portion for selection as an alternative to the first transcription result for the portion;
  
  present at least the confidence level for the second transcription result; and
  
  receive a selection of the second transcription result as the alternative to the first transcription result for the portion.
- Dependent claims (9, 10, 11, 12)
  - 9. The device of claim 8, wherein the processor is configured to further execute the specific computer-executable instructions to at least replace, in the transcription, the first transcription result for the portion with the second transcription result for the portion.
  - 10. The device of claim 8, wherein the processor is configured to further execute the specific computer-executable instructions to at least transmit the second transcription result for the portion to the speech recognition system for replacing, in the transcription, the first transcription result for the portion with the second transcription result for the portion.
  - 11. The device of claim 8, wherein the processor is configured to execute the specific computer-executable instructions to present the confidence level for the first transcription result and the confidence level for the second transcription result.
  - 12. The device of claim 8, wherein the processor is configured to further execute the specific computer-executable instructions to at least transmit information regarding the second transcription result to the speech recognition system for updating a language model used by the speech recognition engine to generate another transcription.

13. A computing device comprising:
- a memory configured to store audio data comprising speech; and
  
  a processor in communication with the memory, the processor configured to execute specific computer-executable instructions to at least:
  
  transmit the audio data comprising speech to a remote computing system including a speech recognition engine;
  
  receive, from the remote computing system, a plurality of transcription results for a portion of a transcription of the speech, wherein the transcription has been generated from the audio data by the speech recognition engine;
  
  receive, from the remote computing system, a confidence level for each transcription result of the plurality of transcription results, wherein the confidence level for each transcription result has been generated by the speech recognition engine, and wherein the confidence level for each transcription result of the plurality of transcription results represents a confidence in an accuracy of the transcription result;
  
  determine a ranked order for the plurality of transcription results from the confidence levels of the plurality of transcription results;
  
  present the plurality of transcription results for the portion of the transcription in the ranked order, with each transcription result of the plurality of transcription results presented with the confidence level for the transcription result; and
  
  receive a selection, from the plurality of transcription results, of a first transcription result for the portion of the transcription.
- Dependent claims (14, 15, 16)
  - 14. The computing device of claim 13, wherein the processor is configured to further execute the specific computer-executable instructions to at least store the transcription with the first transcription result for the portion of the transcription.
  - 15. The computing device of claim 13, wherein at least two transcription results of the plurality of transcription results satisfy a threshold confidence level, and wherein the processor is configured to further execute the specific computer-executable instructions to at least:
    - determine which transcription results of the plurality of transcription results for the portion have a confidence level satisfying a threshold confidence level, and
    - present the at least two transcription results that have a confidence level satisfying the threshold confidence level, in the ranked order, when presenting the plurality of transcription results for the portion of the transcription, with each transcription result of the at least two transcription results presented with the confidence level for the transcription result.
  - 16. The computing device of claim 13, wherein the processor is configured to further execute the specific computer-executable instructions to at least transmit the first transcription result for the portion of the transcription to the remote computing system.

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: White, Marc; Jablokov, Igor Roditis; Jablokov, Victor Roman
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US15/201,188
Publication Number: US 20170004831A1
Time in Patent Office: 648 Days
Field of Search: 704/235
US Class Current:
CPC Class Codes:
G06F 3/0236   using selection techniques ...
G10L 15/183   using context dependencies,...
G10L 15/19   Grammatical context, e.g. d...
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 15/30   Distributed recognition, e....
G10L 2015/0631   Creating reference template...
