Speech recognition using loosely coupled components

US 9,208,786 B2
Filed: 03/03/2015
Issued: 12/08/2015
Est. Priority Date: 06/13/2011
Status: Active Grant

Abstract

An automatic speech recognition system includes an audio capture component, a speech recognition processing component, and a result processing component which are distributed among two or more logical devices and/or two or more physical devices. In particular, the audio capture component may be located on a different logical device and/or physical device from the result processing component. For example, the audio capture component may be on a computer connected to a microphone into which a user speaks, while the result processing component may be on a terminal server which receives speech recognition results from a speech recognition processing server.
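The arrangement in the abstract can be pictured with a minimal sketch. It assumes three hypothetical classes (`AudioCapture`, `SpeechRecognizer`, `ResultProcessor`), each tagged with an illustrative device name, and it fakes audio as encoded text. None of these names or behaviors come from the patent; the point is only that each stage may live on a different device.

```python
from dataclasses import dataclass


@dataclass
class AudioCapture:
    """Hypothetical stand-in for the component attached to the microphone."""
    device: str

    def capture(self, spoken_text: str) -> bytes:
        # Assumption: real capture would return audio samples, not encoded text.
        return spoken_text.encode("utf-8")


@dataclass
class SpeechRecognizer:
    """Hypothetical stand-in for the speech recognition processing component."""
    device: str

    def recognize(self, audio: bytes) -> str:
        # Assumption: "recognition" here just decodes the fake audio.
        return audio.decode("utf-8")


@dataclass
class ResultProcessor:
    """Hypothetical stand-in for a result processing component."""
    device: str

    def process(self, results: str) -> str:
        return f"[{self.device}] inserted text: {results}"


# Each component sits on a different (illustrative) device.
capture = AudioCapture(device="desktop-pc")
recognizer = SpeechRecognizer(device="asr-server")
processor = ResultProcessor(device="terminal-server")

audio = capture.capture("patient is stable")
results = recognizer.recognize(audio)
output = processor.process(results)
print(output)
```

In this sketch the capture and result processing objects carry different device labels, mirroring the abstract's example of a microphone-connected computer and a terminal server.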

42 Claims

1. A system comprising:
- an audio capture component, the audio capture component comprising means for capturing a first audio signal representing first speech of a user to produce a first captured audio signal;
  
  a speech recognition processing component comprising means for performing automatic speech recognition on the first captured audio signal to produce first speech recognition results;
  
  a first result processing component, the first result processing component comprising first means for processing the first speech recognition results to produce first result output;
  
  a second result processing component, the second result processing component comprising second means for processing the first speech recognition results to produce second result output;
  
  a context sharing component comprising means for identifying a first one of the first and second result processing components as being associated with a first context of the user at a first time, the context sharing component further comprising:
  
  means for identifying a list of at least one result processing component authorized for use on behalf of the user at the first time; and
  
  means for determining that the at least one result processing component in the list is associated with the context of the user at the first time; and
  
  speech recognition result provision means for providing the first speech recognition results to the identified first one of the first and second result processing components.
- - 2. The system of claim 1, wherein:
    - the audio capture component further comprises means for capturing a second audio signal representing second speech of the user to produce a second captured audio signal;
      
      the speech recognition processing component further comprises means for performing automatic speech recognition on the second captured audio signal to produce second speech recognition results;
      
      the context sharing component further comprises means for identifying a second one of the first and second result processing components as being associated with a second context of the user at a second time, wherein the second one of the first and second result processing components differs from the first one of the first and second result processing components; and
      
      wherein the speech recognition result provision means further comprises means for providing the second speech recognition results to the identified second one of the first and second result processing components.
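One way to read claims 1 and 2 is as a routing step: the context sharing component looks up the result processing components authorized for the user, keeps the one associated with the user's current context, and the results are delivered there. The sketch below is a loose illustration under that reading. `ContextSharingComponent`, `set_context`, `identify`, and `provide` are hypothetical names, and modeling a context as a set of processor names is an assumption, not something the claims specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

ResultProcessor = Callable[[str], str]


@dataclass
class ContextSharingComponent:
    # Hypothetical registry of result processing components by name.
    processors: Dict[str, ResultProcessor]
    # Per-user list of processors authorized for use on the user's behalf.
    authorized: Dict[str, List[str]]
    # Assumption: a user's context is the set of processors currently in use.
    active_context: Dict[str, Set[str]] = field(default_factory=dict)

    def set_context(self, user: str, processor_names: Set[str]) -> None:
        self.active_context[user] = set(processor_names)

    def identify(self, user: str) -> str:
        """Return the first authorized processor tied to the user's context."""
        context = self.active_context.get(user, set())
        for name in self.authorized.get(user, []):
            if name in context:
                return name
        raise LookupError(f"no authorized processor in context for {user!r}")

    def provide(self, user: str, results: str) -> str:
        """Deliver recognition results to the identified processor."""
        return self.processors[self.identify(user)](results)


csc = ContextSharingComponent(
    processors={
        "ehr": lambda r: f"EHR note: {r}",
        "email": lambda r: f"Email draft: {r}",
    },
    authorized={"alice": ["ehr", "email"]},
)

# First time: the user's context is associated with the EHR application.
csc.set_context("alice", {"ehr"})
first = csc.provide("alice", "patient is stable")

# Second time: the context changes, so a different processor is identified.
csc.set_context("alice", {"email"})
second = csc.provide("alice", "see you at noon")
print(first)
print(second)
```

The second call corresponds to claim 2: a later utterance goes to a different result processing component because the user's context has changed.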

3. A computer-implemented method for use with a system:
- wherein the system comprises:
  
  an audio capture component;
  
  a speech recognition processing component;
  
  a first result processing component;
  
  a second result processing component;
  
  a context sharing component; and
  
  speech recognition result provision means;
  
  wherein the method comprises:
  
  (A) using the audio capture component to capture a first audio signal representing first speech of a user to produce a first captured audio signal;
  
  (B) using the speech recognition processing component to perform automatic speech recognition on the first captured audio signal to produce first speech recognition results;
  
  (C) using the first result processing component to process the first speech recognition results to produce first result output;
  
  (D) using the second result processing component to process the first speech recognition results to produce second result output;
  
  (E) using the context sharing component to identify a first one of the first and second result processing components as being associated with a first context of the user at a first time, wherein using the context sharing component to identify further comprises:
  
  identifying a list of at least one result processing component authorized for use on behalf of the user at the first time; and
  
  determining that the at least one result processing component in the list is associated with the context of the user at the first time; and
  
  (F) using the speech recognition result provision means to provide the first speech recognition results to the identified first one of the first and second result processing components.
- - 4. The method of claim 3, further comprising:
    - (G) using the audio capture component to capture a second audio signal representing second speech of the user to produce a second captured audio signal;
      
      (H) using the speech recognition processing component to perform automatic speech recognition on the second captured audio signal to produce second speech recognition results;
      
      (I) using the context sharing component to identify a second one of the first and second result processing components as being associated with a second context of the user at a second time, wherein the second one of the first and second result processing components differs from the first one of the first and second result processing components; and
      
      (J) using the speech recognition result provision means to provide the second speech recognition results to the identified second one of the first and second result processing components.

5. A system comprising:
- a first audio capture component comprising first means for capturing a first audio signal representing speech of a user to produce a first captured audio signal;
  
  a first speech recognition processing component comprising first means for performing automatic speech recognition on the first captured audio signal to produce first speech recognition results;
  
  a first result processing component comprising first means for processing the first speech recognition results to produce first result output; and
  
  a context sharing component comprising means for dynamically coupling at least two of the first audio capture component, the first speech recognition processing component, and the first result processing component to each other at a first time, wherein the means for dynamically coupling further comprises:
  
  means for identifying a list of at least one result processing component authorized for use on behalf of the user at the first time; and
  
  means for determining that the at least one result processing component in the list is associated with the context of the user at the first time.
- - 6. The system of claim 5, further comprising a first device, wherein the first device comprises the first audio capture component, and a second device, wherein the second device includes the first result processing component, wherein the first device is distinct from the second device.
  - 7. The system of claim 5, wherein the context sharing component comprises means for dynamically coupling the first audio capture component to the first speech recognition processing component.
  - 8. The system of claim 7:
    - further comprising a second audio capture component comprising second means for capturing a second audio signal representing speech of a user to produce a second captured audio signal; and
      
      wherein the context sharing component further comprises means for dynamically coupling the second audio capture component to the first speech recognition processing component.
  - 9. The system of claim 7, further comprising:
    - means for providing the first captured audio signal to the first speech recognition processing component after dynamically coupling the first audio capture component to the first speech recognition processing component.
  - 10. The system of claim 9, wherein the context sharing component comprises the means for providing the first captured audio signal.
  - 11. The system of claim 9, wherein the first audio capture component comprises the means for providing the first captured audio signal.
  - 12. The system of claim 9, wherein the first speech recognition processing component comprises the means for providing the first captured audio signal.
  - 13. The system of claim 9, wherein the means for providing comprises means for providing the first captured audio signal to the first speech recognition processing component in real-time.
  - 14. The system of claim 5, wherein the context sharing component comprises means for dynamically coupling the first audio capture component to the first result processing component.
  - 15. The system of claim 5, wherein the context sharing component comprises means for dynamically coupling the first speech recognition processing component to the first result processing component.
  - 16. The system of claim 15:
    - further comprising a second speech recognition processing component comprising second means for performing automatic speech recognition on the first captured audio signal to produce second speech recognition results; and
      
      wherein the context sharing component further comprises means for dynamically coupling the first audio capture component to the second speech recognition processing component.
  - 17. The system of claim 15, further comprising:
    - means for providing the first speech recognition results to the first result processing component after dynamically coupling the first speech recognition processing component to the first result processing component.
  - 18. The system of claim 17, wherein the context sharing component comprises the means for providing the first speech recognition results.
  - 19. The system of claim 17, wherein the first speech recognition processing component comprises the means for providing the first speech recognition results.
  - 20. The system of claim 17, wherein the first result processing component comprises the means for providing the first speech recognition results.
  - 21. The system of claim 17, wherein the means for providing comprises means for providing the first speech recognition results to the first result processing component in real-time.
  - 22. The system of claim 5, wherein the context sharing component comprises means for dynamically coupling the first audio capture component to the first speech recognition processing component and for dynamically coupling the first speech recognition processing component to the first result processing component.
  - 23. The system of claim 5, wherein the means for dynamically coupling comprises means for dynamically coupling at least two of the first audio capture component, the first speech recognition processing component, and the first result processing component to each other at run-time.
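The "dynamically coupling ... at run-time" language in claims 5 through 23 can be illustrated with a small sketch in which links between components are created while the program runs rather than fixed in advance. `Coupler`, `couple`, and `emit` are hypothetical names chosen for illustration, and the upper-casing "recognizer" is a placeholder. The sketch also omits the authorization and context checks that claim 5 requires.

```python
from typing import Callable, Dict, List


class Coupler:
    """Hypothetical run-time registry of links between components."""

    def __init__(self) -> None:
        self._links: Dict[str, List[Callable[[str], None]]] = {}

    def couple(self, source: str, sink: Callable[[str], None]) -> None:
        # Create a link from a named source component to a sink at run-time.
        self._links.setdefault(source, []).append(sink)

    def emit(self, source: str, payload: str) -> None:
        # Forward the payload to every sink currently coupled to the source.
        for sink in self._links.get(source, []):
            sink(payload)


coupler = Coupler()
outputs: List[str] = []


def recognizer(audio: str) -> None:
    # Placeholder recognition: upper-case the fake audio, then pass it on.
    coupler.emit("recognizer", audio.upper())


def processor(results: str) -> None:
    outputs.append(f"processed: {results}")


# Nothing is coupled yet, so this captured audio goes nowhere.
coupler.emit("capture", "hello")

# Couple capture -> recognizer and recognizer -> processor at run-time.
coupler.couple("capture", recognizer)
coupler.couple("recognizer", processor)
coupler.emit("capture", "hello")
print(outputs)
```

Coupling both links, as in the last step, corresponds loosely to claim 22; coupling only one of them corresponds to claims 7 or 15.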

24. A computer-implemented method for use with a system:
- wherein the system comprises:
  
  a first audio capture component;
  
  a first speech recognition processing component;
  
  a first result processing component; and
  
  a context sharing component;
  
  wherein the method comprises:
  
  (A) using the first audio capture component to capture a first audio signal representing speech of a user to produce a first captured audio signal;
  
  (B) using the first speech recognition processing component to perform automatic speech recognition on the first captured audio signal to produce first speech recognition results;
  
  (C) using the first result processing component to process the first speech recognition results to produce first result output; and
  
  (D) using the context sharing component to dynamically couple at least two of the first audio capture component, the first speech recognition processing component, and the first result processing component to each other at a first time, wherein using the context sharing component to dynamically couple further comprises:
  
  identifying a list of at least one result processing component authorized for use on behalf of the user at the first time; and
  
  determining that the at least one result processing component in the list is associated with the context of the user at the first time.
- - 25. The method of claim 24:
    - wherein the system further comprises:
      
      a first device, wherein the first device comprises the first audio capture component; and
      
      a second device, wherein the second device includes the first result processing component;
      
      wherein the first device is distinct from the second device.
  - 26. The method of claim 24, wherein (D) comprises dynamically coupling the first audio capture component to the first speech recognition processing component.
  - 27. The method of claim 26, further comprising:
    - (E) providing the first captured audio signal to the first speech recognition processing component after dynamically coupling the first audio capture component to the first speech recognition processing component.
  - 28. The method of claim 27, wherein (E) is performed by the context sharing component.
  - 29. The method of claim 27, wherein (E) is performed by the first audio capture component.
  - 30. The method of claim 27, wherein (E) is performed by the first speech recognition processing component.
  - 31. The method of claim 27, wherein (E) comprises providing the first captured audio signal to the first speech recognition processing component in real-time.
  - 32. The method of claim 24:
    - wherein the system further comprises a second audio capture component;
      
      wherein the method further comprises:
      
      (E) using the second audio capture component to capture a second audio signal representing speech of a user to produce a second captured audio signal; and
      
      wherein (D) comprises dynamically coupling the second audio capture component to the first speech recognition processing component.
  - 33. The method of claim 24, wherein (D) comprises dynamically coupling the first audio capture component to the first result processing component.
  - 34. The method of claim 24, wherein (D) comprises dynamically coupling the first speech recognition processing component to the first result processing component.
  - 35. The method of claim 34:
    - wherein the system further comprises a second speech recognition processing component; and
      
      wherein the method further comprises:
      
      (E) performing automatic speech recognition on the first captured audio signal to produce second speech recognition results; and
      
      wherein (D) comprises dynamically coupling the first audio capture component to the second speech recognition processing component.
  - 36. The method of claim 34, further comprising:
    - (F) providing the first speech recognition results to the first result processing component after dynamically coupling the first speech recognition processing component to the first result processing component.
  - 37. The method of claim 36, wherein (F) is performed by the context sharing component.
  - 38. The method of claim 36, wherein (F) is performed by the first speech recognition processing component.
  - 39. The method of claim 36, wherein (F) is performed by the first result processing component.
  - 40. The method of claim 36, wherein (F) comprises providing the first speech recognition results to the first result processing component in real-time.
  - 41. The method of claim 24, wherein (D) comprises dynamically coupling the first audio capture component to the first speech recognition processing component and dynamically coupling the first speech recognition processing component to the first result processing component.
  - 42. The method of claim 24, wherein (D) comprises dynamically coupling at least two of the first audio capture component, the first speech recognition processing component, and the first result processing component to each other at run-time.

Current Assignee: 3M Health Information Systems (3M Company)
Original Assignee: MModal IP LLC (3M Company)
Inventors: Koll, Detlef; Finke, Michael
Primary Examiner(s): Saint Cyr, Leonard
Application Number: US14/636,774
Publication Number: US 20150179172A1
Time in Patent Office: 280 Days
Field of Search: 704/231, 704/246, 704/247, 704/251, 704/252
US Class Current: 1/1
CPC Class Codes:
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 2015/228   of application context
