Multiple speech recognition engines

US 7,340,395 B2
Filed: 04/23/2004
Issued: 03/04/2008
Est. Priority Date: 04/23/2004
Status: Active Grant

2 Assignments

0 Petitions

Abstract

A system having multiple speech recognition engines, each operable to recognize spoken data, is described. A speech recognition engine manager detects the speech recognition engines, and selects at least one for recognizing spoken input from a user, via a user interface. In this way, a speech recognition engine that is particularly suited to a current environment may be selected. For example, a speech recognition engine that is particularly suited for, or preferred by, the user may be selected, or a speech recognition engine that is particularly suited for a particular type of interface, interface element, or application, may be selected. Multiple ones of the speech recognition engines may be selected and simultaneously maintained in an active state, by maintaining a session associated with each of the engines. Accordingly, users' experience with voice applications may be enhanced, and, in particular, users with physical disabilities may more easily interact with software applications.

61 Citations

17 Claims

1. A system comprising:
- a plurality of available speech recognition engines configured to interpret spoken input, if selected as a chosen speech recognition engine;
  
  a user interface configured to receive the spoken input; and
  
  a speech recognition engine manager configured to:
  
  dynamically build heuristics relating to a quantity of misrecognitions of past spoken input of a particular user over time by the available speech recognition engines,
  
  track user preference information over time relating to a preferred speech recognition engine of the particular user for use in an application; and
  
  select, based on the dynamically built heuristics, the tracked user preference information, and further based on receiving spoken input via the user interface, the chosen speech recognition engine from amongst the available speech recognition engines.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The system of claim 1 wherein the speech recognition engine manager further comprises a speech recognition engine detector operable to detect the plurality of available speech recognition engines.
  - 3. The system of claim 1 wherein the speech recognition engine selector is operable to select the chosen speech recognition engine based on selection information.
  - 4. The system of claim 3 wherein the selection information further comprises the dynamically built heuristics.
  - 5. The system of claim 4 wherein the dynamically built heuristics relate to a type of speech recognition associated with the plurality of available speech recognition engines.
  - 6. The system of claim 3 wherein the selection information includes information related to the user interface with respect to the plurality of available speech recognition engines.
  - 7. The system of claim 1 wherein the speech recognition engine manager uses application program interface (API) libraries to interact with the plurality of available speech recognition engines.
  - 8. The system of claim 1 wherein the speech recognition engine manager further comprises a session manager operable to manage a first session with a first speech recognition engine and a second session with a second speech recognition engine, where the first and second session overlap in time.
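
The overlapping sessions described in claim 8 can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the `Session` and `SessionManager` classes, their methods, and the engine names are all hypothetical, and a session is modeled as a plain record rather than a live connection to a recognizer.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    """Hypothetical record of one open session with one engine."""
    engine_name: str
    opened_at: float = field(default_factory=time.monotonic)
    closed_at: Optional[float] = None

    @property
    def active(self) -> bool:
        return self.closed_at is None


class SessionManager:
    """Keeps one session per engine; several may be active at once."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def open(self, engine_name: str) -> Session:
        # Reuse an already active session instead of opening a duplicate.
        existing = self._sessions.get(engine_name)
        if existing is not None and existing.active:
            return existing
        session = Session(engine_name)
        self._sessions[engine_name] = session
        return session

    def close(self, engine_name: str) -> None:
        session = self._sessions.get(engine_name)
        if session is not None and session.active:
            session.closed_at = time.monotonic()

    def active_engines(self) -> List[str]:
        # Names of all engines whose sessions are currently open.
        return sorted(n for n, s in self._sessions.items() if s.active)
```

Opening sessions for two hypothetical engines without closing the first leaves both active at the same time, which is the overlap in time that the claim recites.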

9. A computer-implemented method comprising:
- determining available speech recognition engines;
  
  dynamically building heuristics relating to a quantity of misrecognitions of past spoken input of a particular user over time by the available speech recognition engines;
  
  tracking user preference information over time relating to a preferred speech recognition engine of the particular user for use in an application;
  
  receiving spoken input via a user interface;
  
  selecting, based on the dynamically built heuristics, the tracked user preference information, and further based on receiving spoken input via the user interface, a chosen recognition engine from amongst the available speech recognition engines; and
  
  interpreting the spoken input using the chosen speech recognition engine.
- Dependent claims (10, 11, 12, 13):
  - 10. The method of claim 9 wherein selecting the chosen speech recognition engine includes accessing selection information associated with the particular user, the user interface, or the available speech recognition engines.
  - 11. The method of claim 9 further comprising:
    - selecting a plurality of the available speech recognition engines; and
      
      maintaining a session for each of the selected plurality of available speech recognition engines.
  - 12. The method of claim 9 comprising forwarding the interpreted spoken input to the user interface via a voice-enabled portal, so that the interpreted spoken input is displayed as text in association with the user interface.
  - 13. The method of claim 9, wherein the chosen speech recognition engine is selected if the spoken input is received from the particular user.
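
The selection steps of claim 9 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the method disclosed in the specification: the `EngineManager` class, its method names, the smoothed error rate, and the fixed `preference_bonus` are all hypothetical choices made here to show one way that misrecognition counts and tracked preferences could be combined.

```python
from collections import defaultdict


class EngineManager:
    """Hypothetical manager that picks an engine per user and application."""

    def __init__(self, engines, preference_bonus=0.1):
        self.engines = list(engines)
        # Assumed weighting: how much a tracked preference offsets errors.
        self.preference_bonus = preference_bonus
        # (user, engine) -> [misrecognitions, total utterances]
        self._stats = defaultdict(lambda: [0, 0])
        # (user, application) -> {engine: times the user chose it}
        self._prefs = defaultdict(lambda: defaultdict(int))

    def record_result(self, user, engine, misrecognized):
        # Build the heuristics over time from past spoken input.
        stats = self._stats[(user, engine)]
        stats[0] += int(misrecognized)
        stats[1] += 1

    def record_preference(self, user, application, engine):
        # Track which engine the user prefers for this application.
        self._prefs[(user, application)][engine] += 1

    def _error_rate(self, user, engine):
        missed, total = self._stats.get((user, engine), (0, 0))
        # Add-one smoothing, so an engine with no history scores 0.5.
        return (missed + 1) / (total + 2)

    def select(self, user, application):
        prefs = self._prefs.get((user, application), {})
        preferred = max(prefs, key=prefs.get) if prefs else None

        def score(engine):
            value = self._error_rate(user, engine)
            if engine == preferred:
                value -= self.preference_bonus
            return value

        # Lowest score wins; ties go to the first engine listed.
        return min(self.engines, key=score)
```

With this weighting, a strong misrecognition history outweighs a stated preference, while a preference breaks the tie for a user with no history. Other weightings are equally consistent with the claim language.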

14. A computer program product, tangibly embodied in a machine readable medium, the computer program product comprising instructions that, when read by a machine, operate to cause data processing apparatus to:
- determine available speech recognition engines;
  
  dynamically build heuristics relating to a quantity of misrecognitions of past spoken input of a particular user over time by the available speech recognition engines;
  
  track user preference information over time relating to a preferred speech recognition engine of the particular user for use in an application;
  
  receive spoken input via a user interface;
  
  select, based on the dynamically built heuristics, the tracked user preference information, and further based on receiving spoken input via the user interface, a chosen speech recognition engine from amongst the available speech recognition engines; and
  
  interpret the spoken input using the chosen speech recognition engine.
- Dependent claims (15, 16, 17):
  - 15. The computer program product of claim 14 wherein selecting the chosen speech recognition engine includes accessing selection information associated with the particular user, the user interface, or the available speech recognition engines.
  - 16. The computer program product of claim 14 further comprising instructions that, when read by a machine, operate to cause data processing apparatus to:
    - select a plurality of the available speech recognition engines; and
      
      maintain a session for each of the selected plurality of available speech recognition engines.
  - 17. The computer program product of claim 14 further comprising instructions that, when read by a machine, operate to cause data processing apparatus to forward the interpreted spoken input to the user interface via a voice-enabled portal, so that the interpreted spoken input is displayed as text in association with the user interface.

Current Assignee: SAP SE
Original Assignee: SAP AG (SAP SE)
Inventors: Gurram, Rama; James, Frances
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US10/830,313
Publication Number: US 20050240404A1
Time in Patent Office: 1,411 Days
Field of Search: 704/231
US Class Current: 704/231
CPC Class Codes: G10L 15/32 Multiple recognisers used i...
