Middleware layer between speech related applications and engines
First Claim
1. A multi-process speech recognition middleware layer of computer-readable instructions embedded on a computer-readable medium, the instructions being configured to, when executed, facilitate communication between a speech recognition (SR) engine and a plurality of speech recognition applications, the middleware layer comprising:
- a first process associated with a first speech recognition application including;
  - a first context object having an application interface to enable application control of a first plurality of attributes of the speech recognition engine, the first context object also including an engine interface; and
  - a first grammar object storing a first grammar used by the first process to support a speech recognition functionality associated with the first speech recognition application; and
- a second process associated with a second speech recognition application that is different than the first speech application, wherein the second process includes;
  - a second context object having an application interface to enable application control of a first plurality of attributes of the speech recognition engine, the second context object also including an engine interface; and
  - a second grammar object storing a second grammar used by the second process to support a speech recognition functionality associated with the second speech recognition application; and
- a server process configured to receive result information provided by the SR engine and provide the result information to the first or second process, to which the result information belongs.
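The arrangement recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: every class and method name (`ContextObject`, `GrammarObject`, `AppProcess`, `ServerProcess`, `deliver`) is hypothetical, the two "processes" are ordinary objects inside one interpreter rather than separate operating-system processes, and the sketch assumes the engine tags each result with the name of the process that owns it.

```python
from dataclasses import dataclass, field


@dataclass
class GrammarObject:
    """Holds the grammar one application uses (here, just a set of phrases)."""
    phrases: set[str]


@dataclass
class ContextObject:
    """Per-application context with an application-facing and an engine-facing side."""
    owner: str
    attributes: dict[str, object] = field(default_factory=dict)
    results: list[str] = field(default_factory=list)

    # Application interface: the application sets engine attributes.
    def set_attribute(self, name: str, value: object) -> None:
        self.attributes[name] = value

    # Engine interface: recognition results are delivered here.
    def on_result(self, text: str) -> None:
        self.results.append(text)


@dataclass
class AppProcess:
    """Stand-in for one application's process: one context plus one grammar."""
    name: str
    context: ContextObject
    grammar: GrammarObject


class ServerProcess:
    """Receives engine results and forwards each to the process it belongs to."""

    def __init__(self) -> None:
        self._processes: dict[str, AppProcess] = {}

    def register(self, process: AppProcess) -> None:
        self._processes[process.name] = process

    def deliver(self, owner: str, text: str) -> None:
        # Assumption: the engine tags each result with the owning process name.
        self._processes[owner].context.on_result(text)


def make_process(name: str, phrases: set[str]) -> AppProcess:
    return AppProcess(name, ContextObject(owner=name), GrammarObject(phrases))


server = ServerProcess()
dictation = make_process("dictation", {"new paragraph"})
commands = make_process("commands", {"open file"})
server.register(dictation)
server.register(commands)

commands.context.set_attribute("language", "en-US")
server.deliver("commands", "open file")
```

In this sketch only the `commands` context receives the result, which mirrors the claim language that the server process provides result information to the process "to which the result information belongs".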
Abstract
The present invention provides an application-independent and engine-independent middleware layer (204) between applications (202) and engines (206, 208). The middleware provides speech-related services to both applications (202) and engines (206, 208), thereby making it far easier for application vendors and engine vendors to bring their technology to consumers.
33 Citations
9 Claims
1. Independent claim, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7
8. A multi-voice speech synthesis middleware layer of computer-readable instructions embedded on a computer-readable medium, the instructions being configured to, when executed, facilitate communication between one or more applications and a plurality of text-to-speech (TTS) engines, the middleware layer comprising:
- at least a first voice object having an application interface configured to receive TTS engine attribute information from the application and to instantiate first and second TTS engines based on the TTS attribute information, to receive a speak request requesting at least one of the TTS engines to speak a message, and to receive priority information associated with each speak request indicative of a precedence each speak request is to take, and wherein the first voice object has an engine interface configured to call a specified one of the first and second TTS engines to synthesize input data.

Dependent claims: 9
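The voice object of claim 8 can likewise be sketched in Python. This is a hedged illustration, not the patent's implementation: `VoiceObject`, `make_engine`, `speak`, and `run` are hypothetical names, the "engines" are plain functions that return a tagged string instead of audio, and the convention that a lower number means higher precedence is an assumption made for the example.

```python
import heapq
import itertools
from typing import Callable

# Hypothetical engine type: a function that turns text into "audio" (here, a string).
Engine = Callable[[str], str]


def make_engine(voice: str) -> Engine:
    """Build a fake TTS engine from attribute information (just a voice name)."""
    return lambda text: f"[{voice}] {text}"


class VoiceObject:
    def __init__(self, voices: list[str]) -> None:
        # Application interface: instantiate one engine per requested voice.
        self._engines = {v: make_engine(v) for v in voices}
        self._queue: list[tuple[int, int, str, str]] = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per priority

    def speak(self, text: str, engine: str, priority: int = 1) -> None:
        """Queue a speak request; a lower number means higher precedence (an assumption)."""
        heapq.heappush(self._queue, (priority, next(self._order), engine, text))

    def run(self) -> list[str]:
        """Engine interface: call the specified engine for each request, by precedence."""
        spoken = []
        while self._queue:
            _, _, engine, text = heapq.heappop(self._queue)
            spoken.append(self._engines[engine](text))
        return spoken


voice = VoiceObject(["female", "male"])
voice.speak("You have new mail.", engine="female", priority=1)
voice.speak("Battery low!", engine="male", priority=0)
output = voice.run()
```

Here the later but higher-precedence "Battery low!" request is synthesized first, and each request is routed to the engine it names.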
Specification