Speech interface system and method for control and interaction with applications on a computing system
First Claim
1. A speech processing method, comprising:
determining a set of available instructions;
determining data structures corresponding to the available instructions;
processing a natural language speech input representing at least one instruction with respect to the determined data structures;
determining if the natural language speech input likely represents an instruction;
determining a completeness and an ambiguity of the likely represented instruction with respect to the data structures, and if the likely represented instruction is too ambiguous or incomplete for proper execution, prompting for further speech input to reduce ambiguity or incompleteness;
targeting a likely represented instruction which is sufficiently complete and unambiguous for proper execution to one of a plurality of respective applications;
preserving a system state prior to at least partially executing the sufficiently complete and unambiguous instruction;
executing the sufficiently complete and unambiguous instruction by the one of the plurality of applications; and
restoring the preserved system state after execution of the sufficiently complete and unambiguous instruction.
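The completeness and ambiguity determination recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: `CommandSpec`, `Candidate`, `assess`, and the slot names are hypothetical, and the claim does not prescribe any particular data structure or prompt wording.

```python
from dataclasses import dataclass, field


@dataclass
class CommandSpec:
    """Hypothetical data structure describing one available instruction."""
    name: str
    application: str        # application the instruction would be targeted to
    required_slots: tuple   # parameters needed for proper execution


@dataclass
class Candidate:
    """One possible reading of the speech input against a CommandSpec."""
    spec: CommandSpec
    slots: dict = field(default_factory=dict)

    def missing(self):
        # Required parameters that the speech input did not supply.
        return [s for s in self.spec.required_slots if s not in self.slots]


def assess(candidates):
    """Decide whether to ignore, prompt for more speech, or execute."""
    if not candidates:
        # Input does not likely represent an instruction.
        return ("ignore", None)
    if len(candidates) > 1:
        # Too ambiguous: more than one instruction matches.
        names = " or ".join(c.spec.name for c in candidates)
        return ("prompt", f"Did you mean {names}?")
    only = candidates[0]
    gaps = only.missing()
    if gaps:
        # Incomplete: ask for the missing parameters.
        return ("prompt", f"Please specify: {', '.join(gaps)}")
    # Sufficiently complete and unambiguous; only.spec.application is the target.
    return ("execute", only)
```

In this sketch a prompt is returned whenever more than one candidate survives or a required slot is empty, which mirrors the "prompting for further speech input" limitation. A real recognizer would presumably score candidates statistically rather than simply count them.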
Abstract
A speech processing system exploits statistical modeling and formal logic to receive and process speech input, which may represent data to be received, such as dictation, or commands to be processed by an operating system, application, or process. A command dictionary and dynamic grammars are used in processing speech input to identify, disambiguate, and extract commands. The logical processing scheme ensures that putative commands are complete and unambiguous before processing. Context sensitivity may be employed to differentiate data and commands. A multifaceted graphical user interface may be provided for interaction with a user, to speech-enable interaction with applications and processes that do not necessarily have native support for speech input.
20 Claims
1. A speech processing method, set out in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7.
8. A speech processing method, comprising:
receiving a natural language speech input representing one or more instructions and one or more words;
analyzing the natural language speech for contextual indicia to distinguish between the one or more instructions, instructing a device to take automated action, and the one or more words intended as data;
determining whether a respective instruction is sufficiently complete to permit at least partial execution, or whether additional input is required to permit at least partial execution;
at least partially executing the sufficiently complete respective instruction; and
passing the one or more words intended as data to a data sink,
wherein the one or more instructions are targeted to one of a plurality of respective applications,
further comprising preserving a respective system state prior to at least partially executing the sufficiently complete respective instruction; and
restoring a stored system state after execution of the sufficiently complete respective instruction.
Dependent claims: 9, 10, 11, 12, 13, 14.
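The distinction claim 8 draws between instructions and words intended as data can be sketched as a simple router. This is a rough illustration only: the command table, the escape word `literal`, and the names `Routed` and `route` are assumptions introduced here, and prefix matching stands in for whatever contextual indicia an actual system would analyze.

```python
from dataclasses import dataclass

# Hypothetical command phrases mapped to the application each is targeted to.
COMMAND_TARGETS = {"open file": "editor", "send message": "mail"}

# Hypothetical escape word: forces the rest of the utterance to be treated as data.
ESCAPE_WORD = "literal"


@dataclass
class Routed:
    kind: str      # "command" or "data"
    target: str    # application name, or "data_sink" for dictated words
    payload: str   # text handed to the target


def route(utterance):
    """Classify one utterance as a command or as data and pick its target."""
    text = utterance.strip()
    lowered = text.lower()
    if lowered.startswith(ESCAPE_WORD + " "):
        # Contextual indicium: the speaker explicitly marked the words as data.
        return Routed("data", "data_sink", text[len(ESCAPE_WORD) + 1:])
    for phrase, app in COMMAND_TARGETS.items():
        if lowered == phrase or lowered.startswith(phrase + " "):
            # Utterance begins with a known command phrase.
            return Routed("command", app, text)
    # Anything else is passed through as dictation.
    return Routed("data", "data_sink", text)
```

For example, `route("Open file notes.txt")` is routed to the hypothetical `editor` application, while `route("The meeting went well")` goes to the data sink.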
15. A speech processing apparatus, comprising:
an input port configured to receive a natural language speech input representing one or more instructions and one or more words;
at least one processor, configured to:
analyze the natural language speech for contextual indicia to distinguish between the one or more instructions, instructing a device to take automated action, and the one or more words intended as data;
determine whether a respective instruction is sufficiently complete to permit at least partial execution, or whether additional input is required to permit at least partial execution;
preserve a respective system state prior to at least partial execution of the sufficiently complete respective instruction;
target the sufficiently complete respective instruction to one of a plurality of respective applications;
at least partially execute the sufficiently complete respective instruction by the one of the plurality of respective applications;
pass the one or more words intended as data to a data sink; and
restore the preserved system state after the at least partial execution of the sufficiently complete respective instruction; and
a memory configured to store information selectively based on the at least partially executed respective instruction.
Dependent claims: 16, 17, 18, 19, 20.
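The preserve, execute, and restore sequence recited in the independent claims can be sketched with a Python context manager. This is one possible reading under loud assumptions: system state is modeled as a plain dictionary, and `preserved_state` and `execute_in_app` are hypothetical names. The claims do not say what the state consists of or how it is captured.

```python
import copy
from contextlib import contextmanager


@contextmanager
def preserved_state(state):
    """Snapshot a state dict on entry and restore it on exit."""
    snapshot = copy.deepcopy(state)   # preserve the state prior to execution
    try:
        yield state
    finally:
        state.clear()                 # restore the preserved state afterwards,
        state.update(snapshot)        # even if execution raised an error


def execute_in_app(state, app, instruction, results):
    """Hypothetical executor: switches focus to the target app and runs the instruction."""
    with preserved_state(state) as live:
        live["focused_app"] = app             # execution may disturb the state
        results.append((app, instruction))    # information stored from the execution
```

Usage, with assumed state keys:

```python
state = {"focused_app": "editor", "dictation": True}
results = []
execute_in_app(state, "mail", "send message", results)
# state is back to its original contents; results keeps a record of the execution
```

Keeping `results` outside the snapshot is a design choice of this sketch: it lets the outcome of the instruction persist while the surrounding context, such as focus and dictation mode, returns to what it was.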
Specification