WEB-BASED SPEECH RECOGNITION WITH SCRIPTING AND SEMANTIC OBJECTS
Abstract
The present invention is a system and method for creating and implementing transactional speech applications (SAs) using Web technologies, without reliance on server-side standard or custom services. A transactional speech application may be any application that requires interpretation of speech in conjunction with a speech recognition (SR) system, such as, for example, consumer survey systems. A speech application in accordance with the present invention is represented within a Web page, as an application script that interprets semantic objects according to a context. Any commonly known scripting language can be used to write the application script, such as JavaScript (or ECMAScript), PerlScript, and VBScript. The present invention is “Web-based” to the extent that it implements Web technologies, but it need not include or access the World Wide Web.
18 Claims
1. A speech application system, comprising:
A. a speech recognition (SR) system configured to receive an audio input and generate a set of semantic data representing a plurality of valid interpretations of said audio input;
B. a set of speech application scripts, loaded at the SR system and configured to task said SR system, said set of application scripts defining a context;
C. a semantic data evaluator, configured to receive said set of semantic data and said context and, as a function thereof, to generate a linguistic result corresponding to said audio input, and to return said linguistic result to said set of application scripts; and
D. a set of reusable object oriented interfaces local to the SR system, said interfaces configured to interface said one or more of said set of application scripts with said SR system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
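Elements A and C of claim 1 can be pictured with a small sketch. The Python below is purely illustrative and is not taken from the specification: the `SemanticObject` class, the `evaluate` function, the modeling of a context as a set of expected categories, and the confidence scores are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class SemanticObject:
    """One valid interpretation of an utterance (hypothetical shape)."""
    category: str   # e.g. "month" or "modal"
    value: str      # normalized meaning
    score: float    # recognizer confidence, assumed to lie in [0, 1]


def evaluate(candidates: Sequence[SemanticObject],
             context: Set[str]) -> Optional[SemanticObject]:
    """Pick the single interpretation that fits the script-defined context.

    Candidates whose category the context does not expect are dropped;
    the highest-scoring survivor is returned, or None if none fits.
    """
    in_context = [c for c in candidates if c.category in context]
    if not in_context:
        return None
    return max(in_context, key=lambda c: c.score)


# The utterance "may" has two valid interpretations (assumed scores).
candidates = [
    SemanticObject("modal", "may", 0.60),
    SemanticObject("month", "May", 0.40),
]

# A script asking for a date defines a context that expects a month.
result = evaluate(candidates, context={"month", "day", "year"})
print(result.value if result else "no interpretation")  # prints "May"
```

In this sketch the context outranks the raw confidence score: the lower-scoring "month" reading is returned because it is the only candidate the context expects.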
11. A speech application system comprising:
A. a speech recognition (SR) system hosted on a first computer and configured to receive an audio input from an input device and to generate one or more semantic objects representing a plurality of valid interpretations of said audio input;
B. a Web page loaded on said first computer, from a second computer, said Web page including an application script comprising a set of speech application functionality and configured to interact with said input device via said SR system, wherein said speech application is configured to conduct speech application sessions without accessing said second computer;
C. a set of reusable object oriented interfaces local to the first computer, said interfaces including;
  (1) one or more interface objects configured to facilitate access by said application script to standard services of said first computer; and
  (2) an interface configured to facilitate access to and control of said SR system by said application script; and
D. a semantic object evaluator, configured to generate from said semantic objects, as a function of said context, a single interpretation of said audio input and to return said single interpretation to said application script.
Dependent claims: 12, 13
14. A speech application script included within a Web page, and configured to interact with a SR system hosted on a first computer and configured to receive an audio input and to generate one or more semantic objects representing a plurality of valid interpretations of said audio input, said first computer also including a plurality of interfaces objects and a semantic object evaluator configured to generate from said one or more semantic objects a single interpretation of said audio input as a function of a context, said speech application script comprising:
A. a context definition;
B. a link to said semantic object evaluator;
C. a link to said SR system, via an interface object, from said plurality of interface objects;
D. a set of control functionality comprising;
  (1) a session manager configured to generate user prompts and to determine a next action as a function of said single interpretation;
  (2) a SR system controller, configured to task said SR system; and
  (3) a communication manager, configured to manage interaction with said input device via said SR system, wherein said speech application script is loaded on said first computer from a second computer and said speech application is configured to conduct speech application sessions without accessing said second computer.
Dependent claims: 15, 16
17. A method of performing a speech application session, wherein a SR system is hosted on a first computer and includes a means to receive an audio input, said method comprising:
A. receiving said audio input by said SR system;
B. loading a Web page including an application script on said first computer, said application script including a set of functionality configured to manage a speech application session and control said SR system, without accessing functionality from a second computer;
C. establishing a set of standard interfaces between said SR system and said application script, including establishing a semantic evaluator;
D. in response to tasking by said application script, generating by said SR system one or more semantic objects representing all possible interpretations of said audio input;
E. in response to receiving a context defined by said application script, determining by said semantic evaluator a single semantic interpretation from said one or more semantic objects; and
F. determining a next action by said application script as a function of said single semantic interpretation.
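Steps D through F of claim 17 can be traced in a short, self-contained sketch. Everything here is a stand-in chosen for illustration: text strings take the place of audio input, `recognize` is a stub with a two-word lexicon rather than a real SR system, and the action names (`record_yes`, `record_no`, `reprompt`) are hypothetical. All logic runs locally, mirroring the claim's requirement that no second computer be consulted during the session.

```python
from typing import Dict, List, Optional, Set, Tuple

# (meaning, confidence) pairs stand in for semantic objects.
Interpretation = Tuple[str, float]


def recognize(audio: str) -> List[Interpretation]:
    """Stub SR system (step D): return every interpretation it knows of."""
    lexicon: Dict[str, List[Interpretation]] = {
        "yes": [("affirm", 0.9), ("yes-the-band", 0.1)],
        "no": [("deny", 0.8), ("know", 0.2)],
    }
    return lexicon.get(audio, [])


def evaluate(objs: List[Interpretation], context: Set[str]) -> Optional[str]:
    """Stub semantic evaluator (step E): best meaning the context allows."""
    allowed = [o for o in objs if o[0] in context]
    return max(allowed, key=lambda o: o[1])[0] if allowed else None


def run_session(utterances: List[str]) -> List[str]:
    """Application script logic for a one-question survey (steps D-F)."""
    context = {"affirm", "deny"}          # context defined by the script
    actions = []
    for audio in utterances:
        meaning = evaluate(recognize(audio), context)
        if meaning == "affirm":           # step F: choose the next action
            actions.append("record_yes")
        elif meaning == "deny":
            actions.append("record_no")
        else:
            actions.append("reprompt")
    return actions


print(run_session(["mumble", "no"]))  # prints ['reprompt', 'record_no']
```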
18. A method of configuring a speech application system, wherein a SR system is hosted on a first computer and includes a means to receive an audio input, said method comprising:
A. generating a Web page on a second computer;
B. defining a speech application script including a set of functionality configured to manage a speech application session and control said SR system, without accessing functionality from said second computer;
C. integrating said application script into said Web page;
D. loading said Web page, including said application script, from said second computer to said first computer; and
E. establishing a set of standard interfaces between said application script and said SR system.
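Steps A through D of claim 18 amount to producing a page that carries the script and then handing the page to the first computer. As a rough sketch under stated assumptions, the Python below builds such a page from a template and then extracts the embedded script the way a loader might. The page layout, the JavaScript body, and the names `build_page` and `ScriptExtractor` are hypothetical; only `string.Template` and `html.parser.HTMLParser` from the Python standard library are real APIs. The title is inserted without HTML escaping, which is acceptable only because the example value is a fixed literal.

```python
from html.parser import HTMLParser
from string import Template

# Hypothetical application script: a context plus next-action logic.
APP_SCRIPT = """\
var context = ["affirm", "deny"];
function nextAction(meaning) {
  return meaning === "affirm" ? "record_yes" : "reprompt";
}
"""

PAGE = Template("""\
<html>
  <head><title>$title</title></head>
  <body>
    <script type="text/javascript">
$script
    </script>
  </body>
</html>
""")


def build_page(title: str, script: str) -> str:
    """Steps A-C: generate the page and integrate the script into it."""
    return PAGE.substitute(title=title, script=script)


class ScriptExtractor(HTMLParser):
    """Collects the text of <script> elements from a loaded page (step D)."""

    def __init__(self) -> None:
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.scripts[-1] += data


page = build_page("Survey", APP_SCRIPT)
extractor = ScriptExtractor()
extractor.feed(page)
print(len(extractor.scripts), "script found")  # prints "1 script found"
```

Step E, establishing the interfaces between the script and the SR system, is outside what this sketch covers.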
Specification