System and Method for Latency Reduction for Automatic Speech Recognition Using Partial Multi-Pass Results

US 20110313764A1
Filed: 08/27/2011
Published: 12/22/2011
Est. Priority Date: 12/23/2003
Status: Active Grant

Abstract

A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
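The display-update idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `TranscriptDisplay` class, the integer segment ids, and the sample strings are all hypothetical, and the recognizer passes themselves are replaced by hard-coded outputs.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptDisplay:
    """Holds the text currently shown for each speech segment (hypothetical)."""
    shown: dict = field(default_factory=dict)  # segment id -> (pass number, text)

    def update(self, segment_id: int, pass_no: int, text: str) -> bool:
        """Show `text` unless the same or a later pass already supplied this segment."""
        current = self.shown.get(segment_id)
        if current is not None and current[0] >= pass_no:
            return False
        self.shown[segment_id] = (pass_no, text)
        return True

    def render(self) -> str:
        """Join the displayed segments in segment order."""
        return " ".join(text for _sid, (_pass, text) in sorted(self.shown.items()))


display = TranscriptDisplay()
# Fast first pass: its output is displayed as soon as it is available.
display.update(0, 1, "recognize speech")
display.update(1, 1, "with low late and sea")
first_view = display.render()
# Slower second pass arrives later and replaces segment 1 only.
display.update(1, 2, "with low latency")
second_view = display.render()
```

Keeping the pass number next to each segment lets a late result from an earlier pass be ignored, so the display only ever moves toward the output of the slower pass.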

20 Claims

1. A method comprising:
- transcribing, via a processor, speech data using a first automatic speech recognition pass, which operates at a first transcription rate near real time, to produce a first transcription data and a first word graph;
  
  adapting a model for a second automatic speech recognition pass that uses the first word graph, wherein the second automatic speech recognition pass produces a second transcription data and a second word graph, and wherein the second automatic speech recognition pass is slower than the first automatic speech recognition pass;
  
  displaying at least part of the first transcription data corresponding to a portion of the speech data, prior to transcription of the portion of the speech data by the second automatic speech recognition pass, to yield a displayed part; and
  
  updating the displayed part with at least the second transcription data.
  - 2. The method of claim 1, wherein the first automatic speech recognition pass operates at real time.
  - 3. The method of claim 1, wherein the first automatic speech recognition pass operates at greater than real time.
  - 4. The method of claim 1, wherein the displaying an indicator that signifies that more additional transcription data is being generated.
  - 5. The method of claim 1, wherein the displayed part changes color upon updating.
  - 6. The method of claim 1, wherein a low confidence portion of the displayed part is distinctly displayed as compared to a high confidence portion of the displayed part.
  - 7. The method of claim 6, wherein the low confidence portion of the displayed part is displayed in a darker shade as compared to the high confidence portion of the displayed data.
  - 8. The method of claim 1, further comprising:
    - adapting an additional model for a third automatic speech recognition pass that uses the second word graph, wherein the third automatic speech recognition pass produces a third transcription data and a third word graph and wherein the third automatic speech recognition pass is slower than the second automatic speech recognition pass; and
      
      updating the displayed part with at least the third transcription data.
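The chain of passes in claims 1 and 8 might be sketched as follows. The sketch rests on assumptions the claims do not state: a word graph is reduced to a list of candidate words per position, a "model" is any function from an audio token to ranked candidates, and `adapt` simply restricts the next model to words found in the previous word graph. The names `run_pass`, `adapt`, `cascade`, `FAST`, and `SLOW` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PassResult:
    transcript: str
    word_graph: list  # stand-in for a lattice: ranked candidate words per position


def run_pass(audio, model):
    """Run one hypothetical recognition pass; the top candidate forms the transcript."""
    graph = [list(model(token)) for token in audio]
    return PassResult(" ".join(cands[0] for cands in graph), graph)


def adapt(base_model, word_graph):
    """Assumed adaptation: keep only candidates present in the previous word graph."""
    allowed = {word for cands in word_graph for word in cands}

    def adapted(token):
        kept = [word for word in base_model(token) if word in allowed]
        return kept or list(base_model(token))  # fall back if nothing survives

    return adapted


def cascade(audio, models, show):
    """Run the passes in order, displaying each result before the next pass starts."""
    result = run_pass(audio, models[0])
    show(1, result.transcript)
    for pass_no, base in enumerate(models[1:], start=2):
        result = run_pass(audio, adapt(base, result.word_graph))
        show(pass_no, result.transcript)
    return result


# Toy models: lookup tables from audio tokens to ranked candidates.
FAST = {"a0": ["low"], "a1": ["late", "latency"]}
SLOW = {"a0": ["low", "lo"], "a1": ["latency", "late", "lately"]}

shown = []
final = cascade(["a0", "a1"], [FAST.__getitem__, SLOW.__getitem__],
                lambda pass_no, text: shown.append((pass_no, text)))
```

Appending a third model to the list would give the three-pass arrangement of claim 8, with the third pass adapted from the second word graph.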

9. A system comprising:
- a processor;
  
  a memory storing instructions for controlling the processor to perform steps comprising:
  
  transcribing speech data using a first automatic speech recognition pass, which operates at a first transcription rate near real time, to produce a first transcription data and a first word graph;
  
  adapting a model for a second automatic speech recognition pass that uses the first word graph, wherein the second automatic speech recognition pass produces a second transcription data and a second word graph, and wherein the second automatic speech recognition pass is slower than the first automatic speech recognition pass;
  
  displaying at least part of the first transcription data corresponding to a portion of the speech data, prior to transcription of the portion of the speech data by the second automatic speech recognition pass, to yield a displayed part; and
  
  updating the displayed part with at least the second transcription data.
  - 10. The system of claim 9, wherein the first automatic speech recognition pass operates at real time.
  - 11. The system of claim 9, wherein the first automatic speech recognition pass operates at greater than real time.
  - 12. The system of claim 9, wherein the displaying an indicator that signifies that more additional transcription data is being generated.
  - 13. The system of claim 9, wherein the displayed part changes color upon updating.
  - 14. The system of claim 9, wherein a low confidence portion of the displayed part is distinctly displayed as compared to a high confidence portion of the displayed part.
  - 15. The system of claim 14, wherein the low confidence portion of the displayed part is displayed in a darker shade as compared to the high confidence portion of the displayed data.
  - 16. The system of claim 9, further comprising:
    - adapting an additional model for a third automatic speech recognition pass that uses the second word graph, wherein the third automatic speech recognition pass produces a third transcription data and a third word graph and wherein the third automatic speech recognition pass is slower than the second automatic speech recognition pass; and
      
      updating the displayed part with at least the third transcription data.
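The presentation features in the dependent claims (an indicator that more transcription data is coming, a color change on update, and darker shading for low-confidence text) could be modeled roughly as below. Everything specific here is an assumption for illustration: the claims give no confidence threshold, color names, or indicator wording, and `Word`, `LOW_CONFIDENCE`, `style_words`, and `pending_indicator` are hypothetical names.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    confidence: float  # assumed to lie in [0, 1]


LOW_CONFIDENCE = 0.5  # hypothetical cutoff; the claims do not specify one


def style_words(words, updated=False):
    """Return (text, shade, color) triples describing how each word would be drawn."""
    color = "blue" if updated else "black"  # illustrative colors only
    return [
        (w.text, "dark" if w.confidence < LOW_CONFIDENCE else "normal", color)
        for w in words
    ]


def pending_indicator(passes_done, passes_total):
    """Text shown while slower passes are still producing transcription data."""
    return "refining..." if passes_done < passes_total else ""


first_pass = [Word("low", 0.9), Word("late", 0.3), Word("and", 0.4), Word("sea", 0.2)]
second_pass = [Word("low", 0.95), Word("latency", 0.8)]

before = style_words(first_pass)
after = style_words(second_pass, updated=True)
```

Returning style triples rather than drawing directly keeps the sketch independent of any particular user-interface toolkit.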

17. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to perform steps comprising:
- transcribing speech data using a first automatic speech recognition pass, which operates at a first transcription rate near real time, to produce a first transcription data and a first word graph;
  
  adapting a model for a second automatic speech recognition pass that uses the first word graph, wherein the second automatic speech recognition pass produces a second transcription data and a second word graph, and wherein the second automatic speech recognition pass is slower than the first automatic speech recognition pass;
  
  displaying at least part of the first transcription data corresponding to a portion of the speech data, prior to transcription of the portion of the speech data by the second automatic speech recognition pass, to yield a displayed part; and
  
  updating the displayed part with at least the second transcription data.
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein the first automatic speech recognition pass operates at real time.
  - 19. The non-transitory computer-readable storage medium of claim 17, wherein the first automatic speech recognition pass operates at greater than real time.
  - 20. The non-transitory computer-readable storage medium of claim 17, wherein the displaying an indicator that signifies that more additional transcription data is being generated.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Bacchiani, Michiel Adriaan Unico; Amento, Brian Scott
Granted Patent: US 8,209,176 B2
US Class Current: 704/235
CPC Class Codes: G10L 15/32 Multiple recognisers used i...
