System and process for voice-controlled information retrieval
Abstract
A system and process for voice-controlled information retrieval. A conversation template is executed; the template includes a script of tagged instructions comprising voice prompts and information content. A voice command identifying information content to be retrieved is processed. A remote method invocation requesting the identified information content is sent to an applet process associated with a Web browser. The information content is then retrieved on the Web browser in response to the remote method invocation.
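As a rough illustration of the flow the abstract describes, the sketch below runs a tiny conversation template in Python. The tag names (`dialog`, `prompt`, `response`), the `say` and `content` attributes, the `run_template` function, and the example URLs are all assumptions made for illustration; the abstract does not specify a concrete syntax, and a real speech engine would supply the recognized utterance.

```python
import xml.etree.ElementTree as ET

# Hypothetical template syntax; tag and attribute names are illustrative only.
TEMPLATE = """
<dialog>
  <prompt>Which report would you like?</prompt>
  <response say="sales" content="http://example.com/sales.html"/>
  <response say="inventory" content="http://example.com/inventory.html"/>
</dialog>
"""

def run_template(template_text, heard, speak=print):
    """Play the prompt, then map the recognized utterance to content to fetch.

    `heard` stands in for the output of a speech engine. Returns the
    identifier of the requested content, or None if nothing matched.
    """
    root = ET.fromstring(template_text)
    speak(root.findtext("prompt"))
    for response in root.iter("response"):
        if response.get("say") == heard.strip().lower():
            return response.get("content")
    return None
```

Calling `run_template(TEMPLATE, "sales")` speaks the prompt and returns the identifier that would then be passed to the browser side.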
38 Claims
1. A system for voice-controlled information retrieval using a voice transceiver, comprising:
a voice transceiver including a speech engine, wherein the voice transceiver is operable to execute a conversation template, the conversation template comprising a script of tagged instructions comprising voice prompts and expected user responses; and
a Web browser remote from the voice transceiver, the Web browser operable to obtain information content from a network;
wherein the voice transceiver obtains and processes one or more voice commands identifying information to be retrieved by the Web browser, and wherein the voice transceiver transmits a remote method invocation requesting the identified information content to an applet process associated with the Web browser, wherein the voice transceiver transmits navigation commands for controlling navigation actions of the Web browser, wherein the applet process is configured to invoke navigation commands in the Web browser responsive to the receipt of navigation commands received from the voice transceiver; and
wherein the applet process retrieves the identified information content on the Web browser responsive to the remote method invocation.
2. A system according to claim 1, further comprising:
a parser parsing the conversation template to form a set of tokens; and
the voice transceiver interpreting the set of tokens.
3. A system according to claim 1, wherein the speech engine stores a dynamically compiled speech grammar in the voice transceiver, the dynamically compiled speech grammar comprising a set of voice commands, wherein the speech engine determines a speech event from a voice input device connected to the voice transceiver using the dynamically compiled speech grammar, and wherein the speech engine matches the speech event to one such voice command.
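One way to picture the dynamically compiled speech grammar of claim 3 is a command set that is rebuilt at run time and then used to match each speech event. The sketch below is an assumption-laden simplification: the `SpeechGrammar` class and its methods are hypothetical, and the speech event is already plain text, whereas a real speech engine would work from audio.

```python
class SpeechGrammar:
    """A toy grammar: the set of voice commands that are currently valid."""

    def __init__(self):
        self._commands = {}

    def compile(self, phrases):
        """Rebuild the grammar at run time from the phrases now expected."""
        self._commands = {self._normalize(p): p for p in phrases}

    def match(self, speech_event):
        """Return the voice command matching a speech event, or None."""
        return self._commands.get(self._normalize(speech_event))

    @staticmethod
    def _normalize(text):
        # Ignore case and extra whitespace (illustrative matching rule).
        return " ".join(text.lower().split())
```

Recompiling with a new phrase list replaces the previous command set, so only the commands expected at the current point in the conversation can match.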
4. A system according to claim 3, further comprising:
a parser instantiating each tagged instruction; and
the voice transceiver executing the instantiated tagged instruction.
5. A system according to claim 4, further comprising:
the parser organizing the set of tokens into a hierarchical structure, one such token representing a root of the hierarchical structure.
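The hierarchical token structure with a single root token can be sketched by parsing a tagged script into a tree. The `Token` class, the `parse_template` function, and the use of XML as the tagged format are assumptions for illustration, not details taken from the claims.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

@dataclass
class Token:
    """One parsed instruction tag plus the tokens nested beneath it."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)

def parse_template(template_text):
    """Parse a tagged script into a tree of tokens and return the root token."""
    def build(element):
        token = Token(element.tag, (element.text or "").strip())
        token.children = [build(child) for child in element]
        return token
    return build(ET.fromstring(template_text))
```

The token returned by `parse_template` is the root; every other token is reachable from it through `children`.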
6. A system according to claim 1, wherein the speech engine stores a set of predefined voice commands, and wherein the voice transceiver performs an action responsive to a selection of one such predefined voice command.
7. The system as recited in claim 1 further comprising a telephone in communication with the voice transceiver, the telephone operable to transmit the one or more voice commands identifying information to be retrieved by the Web browser to the voice transceiver.
8. The system as recited in claim 7, wherein the telephone is a mobile telephone.
9. A process for voice-controlled information retrieval using a voice transceiver, comprising:
executing a conversation template, the conversation template comprising a script of tagged instructions comprising voice prompts and expected user responses;
processing a voice command identifying information content to be retrieved;
transmitting navigation commands for controlling navigation actions of the Web browser;
processing received navigation commands in a Web browser responsive to the receipt of navigation commands received from the voice transceiver;
sending a remote method invocation requesting the identified information content to an applet process associated with the Web browser; and
retrieving the identified information content on the Web browser responsive to the remote method invocation.
10. A process according to claim 9, further comprising:
parsing the conversation template to form a set of tokens; and
interpreting the set of tokens.
11. A process according to claim 9, the operation of receiving a voice command further comprising:
storing a dynamically compiled speech grammar in the voice transceiver, the dynamically compiled speech grammar comprising a set of voice commands;
determining a speech event from a voice input device connected to the voice transceiver using the dynamically compiled speech grammar; and
matching the speech event to one such voice command.
12. A process according to claim 11, further comprising:
instantiating each tagged instruction; and
executing the instantiated tagged instruction.
13. A process according to claim 12, further comprising:
organizing the set of tokens into a hierarchical structure, one such token representing a root of the hierarchical structure.
14. A process according to claim 9, further comprising:
storing a set of predefined voice commands; and
performing an action responsive to a selection of one such predefined voice command.
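The two steps of claim 14 (storing predefined voice commands, then performing an action when one is selected) resemble a simple dispatch table. The sketch below assumes hypothetical names (`PredefinedCommands`, `store`, `perform`) and treats a selected command as a plain string.

```python
class PredefinedCommands:
    """Stores built-in voice commands and runs the action bound to each."""

    def __init__(self):
        self._actions = {}

    def store(self, phrase, action):
        """Register a predefined command and the callable it triggers."""
        self._actions[phrase.lower()] = action

    def perform(self, phrase):
        """Run the action for a selected command; return False if unknown."""
        action = self._actions.get(phrase.lower())
        if action is None:
            return False
        action()
        return True
```

A caller might register commands such as "help" or "repeat" once at start-up and call `perform` each time the speech engine reports a selection.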
15. A computer-readable storage medium holding code for voice-controlled information retrieval using a voice transceiver, comprising:
a voice transceiver including a speech engine, wherein the voice transceiver is operable to execute a conversation template, the conversation template comprising a script of tagged instructions comprising voice prompts and expected user responses; and
a Web browser remote from the voice transceiver, the Web browser operable to obtain information content from a network;
wherein the voice transceiver obtains and processes one or more voice commands identifying information to be retrieved by the Web browser, and wherein the voice transceiver transmits a remote method invocation requesting the identified information content to an applet process associated with the Web browser, wherein the voice transceiver is configured to transmit navigation commands for controlling navigation actions of the Web browser, wherein the applet process is configured to invoke navigation commands in the Web browser responsive to the receipt of navigation commands received from the voice transceiver; and
wherein the applet process retrieves the identified information content on the Web browser responsive to the remote method invocation.
16. A system according to claim 15, further comprising:
a parser parsing the conversation template to form a set of tokens; and
the voice transceiver interpreting the set of tokens.
17. A system according to claim 15, wherein the speech engine stores a dynamically compiled speech grammar in the voice transceiver, the dynamically compiled speech grammar comprising a set of voice commands, wherein the speech engine determines a speech event from a voice input device connected to the voice transceiver using the dynamically compiled speech grammar, and wherein the speech engine matches the speech event to one such voice command.
18. A system according to claim 17, further comprising:
a parser instantiating each tagged instruction; and
the voice transceiver executing the instantiated tagged instruction.
19. A system according to claim 18, further comprising:
the parser organizing the set of tokens into a hierarchical structure, one such token representing a root of the hierarchical structure.
20. A system according to claim 15, wherein the speech engine stores a set of predefined voice commands, and wherein the voice transceiver performs an action responsive to a selection of one such predefined voice command.
21. A system for retrieving Web content onto a browser running on a remote client using a voice transceiver, comprising:
a storage device storing a conversation template on a server, the conversation template comprising a script including instruction tags for voice commands and voice prompts;
a voice transceiver receiving the conversation template and including:
a parser parsing the instruction tags from the script to form a set of interrelated tokens and instantiating an object corresponding to each token;
an interpreter interpreting the set of tokens by executing the object instance corresponding to each token; and
a speech engine receiving a voice command on the voice transceiver from a user, wherein the voice transceiver is operable to send a remote invocation identifying Web content to be retrieved, wherein the voice transceiver is operable to send navigation commands to modify the content communicated by the browser; and
a remote client interconnected to the server and the voice transceiver via a network, the remote client including an applet associated with the browser running on the remote client, the applet operable to request Web content from the server responsive to the remote method invocation, and wherein the applet is operable to invoke navigation commands on the browser upon receipt of at least one navigation command.
22. A system according to claim 21, further comprising:
the storage device further comprising storing a document type definition defining a format for the script and acceptable instruction tags; and
the parser further comprising a module parsing the script further comprising validating each instruction tag against the document type definition.
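Validating each instruction tag against a document type definition can be approximated as checking every tag, and its position, against a table of acceptable tags. The sketch below is a simplified stand-in rather than real DTD validation: the `DEFINITION` table, the tag names, and the `validate` function are assumptions, and Python's standard `xml.etree.ElementTree` parser does not itself validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a DTD: each acceptable tag and its allowed children.
DEFINITION = {
    "dialog": {"prompt", "response"},
    "prompt": set(),
    "response": set(),
}

def validate(template_text, definition=DEFINITION):
    """Return a list of problems; an empty list means the script is valid."""
    problems = []

    def check(element):
        if element.tag not in definition:
            problems.append(f"unknown tag <{element.tag}>")
            return
        for child in element:
            if child.tag not in definition[element.tag]:
                problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
            else:
                check(child)

    check(ET.fromstring(template_text))
    return problems
```

A parser built this way would refuse to tokenize a script for which `validate` returns a non-empty list.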
23. A system according to claim 22, wherein each object instance includes an accessor method, the interpreter further comprises:
a module determining those tokens related to each such token by performing the accessor method associated with the token; and
a module interpreting the set of related tokens.
24. A system according to claim 23, wherein at least one such token comprises a branch instruction token, the interpreter further comprises:
a module interrupting the operation of executing the related tokens upon the execution of the branch instruction token; and
a module determining those tokens related to the branch instruction token by performing the accessor method associated with the branch instruction token.
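Claims 23 and 24 describe an interpreter that finds related tokens through an accessor method and lets a branch instruction token interrupt the current sequence. The sketch below is one possible reading under stated assumptions: `Token`, `BranchToken`, `related`, and `interpret` are hypothetical names, and a branch here abandons all remaining tokens after its own related tokens have run.

```python
class Token:
    """An instantiated instruction; `related()` is its accessor method."""

    def __init__(self, name, related=()):
        self.name = name
        self._related = list(related)

    def related(self):
        return self._related

    def execute(self, log):
        log.append(self.name)
        return False  # False: carry on with the next token in sequence

class BranchToken(Token):
    def execute(self, log):
        log.append(self.name)
        return True   # True: abandon the tokens that would have followed

def interpret(token, log):
    """Execute a token, then the tokens its accessor reports as related.

    Returns True if a branch occurred at or beneath this token.
    """
    branched = token.execute(log)
    for child in token.related():
        if interpret(child, log):
            return True  # a branch ran, so skip the remaining siblings
    return branched
```

With a script of `greet`, a branch to `say_goodbye`, and `ask_more`, the interpreter runs `greet`, the branch, and `say_goodbye`, and never reaches `ask_more`.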
25. A system according to claim 21, wherein the parser further comprises:
a module building a parse tree of the set of tokens, each such token representing a leaf in the parse tree and corresponding to an instruction tag in the script in the received conversation template.
26. A system according to claim 25, wherein the interpreter further comprises:
a module performing a depth first traversal of the parse tree following execution of an object instance corresponding to a non-terminal leaf in the parse tree.
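The depth-first traversal of the parse tree can be shown with a few lines of Python. The nested `(tag, children)` representation, the tag names, and the `depth_first` function are assumptions chosen to keep the sketch short.

```python
def depth_first(node, visit):
    """Visit a node, then descend into each child before moving to siblings."""
    tag, children = node
    visit(tag)
    for child in children:
        depth_first(child, visit)

# Parse tree as nested (tag, children) pairs; the tags are made up.
TREE = ("dialog", [
    ("menu", [("prompt", []), ("choice", [])]),
    ("goodbye", []),
])
```

Collecting the visited tags with `depth_first(TREE, order.append)` yields `dialog`, `menu`, `prompt`, `choice`, `goodbye`: the whole `menu` subtree is finished before `goodbye` is reached.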
27. A process for retrieving Web content onto a browser running on a remote client using a voice transceiver, the remote client and the voice transceiver both interconnected to a server via a network, comprising:
storing a conversation template on the server, the conversation template comprising a script including instruction tags for voice commands and voice prompts;
receiving the conversation template on the voice transceiver;
parsing the instruction tags from the script to form a set of interrelated tokens and instantiating an object corresponding to each token;
interpreting the set of tokens by executing the object instance corresponding to each token;
receiving a voice command on the voice transceiver from a user;
if said voice command contains data indicative of a request for Web content, sending a remote method invocation identifying the Web content to an applet associated with the browser running on the remote client, requesting the Web content from the server responsive to the remote method invocation, and receiving the Web content on the browser; and
if said voice command contains data indicative of a navigation command, sending the navigation command to the applet associated with the browser to request the browser to modify the content communicated by the browser.
28. A process according to claim 27, further comprising:
providing a document type definition defining a format for the script and acceptable instruction tags; and
the operation of parsing the script further comprising validating each instruction tag against the document type definition.
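The two conditional steps of claim 27 (a content request versus a navigation command) can be sketched as a small router in front of an applet stand-in. Everything named here is hypothetical: `BrowserApplet`, `show_document`, `navigate`, `handle_voice_command`, and the `back`/`forward` vocabulary are illustrative, and a plain method call replaces what would be a remote method invocation over a network.

```python
class BrowserApplet:
    """In-process stand-in for the applet that sits beside the browser."""

    def __init__(self):
        self.history = []   # pages shown so far
        self.position = -1  # index of the page currently displayed

    def show_document(self, url):
        """Target of the 'remote method invocation': display the content."""
        del self.history[self.position + 1:]  # drop any forward history
        self.history.append(url)
        self.position += 1

    def navigate(self, command):
        """Apply a navigation command to the browser."""
        if command == "back" and self.position > 0:
            self.position -= 1
        elif command == "forward" and self.position < len(self.history) - 1:
            self.position += 1

    @property
    def current(self):
        return self.history[self.position] if self.history else None

NAVIGATION = {"back", "forward"}

def handle_voice_command(command, content_index, applet):
    """Route a recognized command to a content request or a navigation action."""
    word = command.strip().lower()
    if word in content_index:
        applet.show_document(content_index[word])
    elif word in NAVIGATION:
        applet.navigate(word)
```

Here `content_index` maps spoken words to content identifiers; saying "sales", then "news", then "back" would leave the stand-in browser on the sales page.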
29. A process according to claim 28, wherein each object instance includes an accessor method, the operation of interpreting the set of tokens further comprising:
determining those tokens related to each such token by performing the accessor method associated with the token; and
interpreting the set of related tokens.
30. A process according to claim 29, wherein at least one such token comprises a branch instruction token, the operation of executing the related tokens further comprising:
interrupting the operation of executing the related tokens upon the execution of the branch instruction token; and
determining those tokens related to the branch instruction token by performing the accessor method associated with the branch instruction token.
31. A process according to claim 27, wherein the operation of parsing the script further comprises:
building a parse tree of the set of tokens, each such token representing a leaf in the parse tree and corresponding to an instruction tag in the script in the received conversation template.
32. A process according to claim 31, wherein the operation of interpreting the parse tree further comprises:
performing a depth first traversal of the parse tree following execution of an object instance corresponding to a non-terminal leaf in the parse tree.
33. A computer-readable storage medium holding code for retrieving Web content onto a browser running on a remote client using a voice transceiver, the remote client and the voice transceiver both interconnected to a server via a network, comprising:
storing a conversation template on the server, the conversation template comprising a script including instruction tags for voice commands and voice prompts;
receiving the conversation template on the voice transceiver;
parsing the instruction tags from the script to form a set of interrelated tokens and instantiating an object corresponding to each token;
interpreting the set of tokens by executing the object instance corresponding to each token;
receiving a voice command on the voice transceiver from a user;
if said voice command contains data indicative of a request for Web content, sending a remote method invocation identifying the Web content to an applet associated with the browser running on the remote client, requesting the Web content from the server responsive to the remote method invocation, and receiving the Web content on the browser; and
if said voice command contains data indicative of a navigation command, sending the navigation command to the applet associated with the browser to request the browser to modify the content communicated by the browser.
34. A storage medium according to claim 33, further comprising:
providing a document type definition defining a format for the script and acceptable instruction tags; and
the operation of parsing the script further comprising validating each instruction tag against the document type definition.
35. A storage medium according to claim 34, wherein each object instance includes an accessor method, the operation of interpreting the set of tokens further comprising:
determining those tokens related to each such token by performing the accessor method associated with the token; and
interpreting the set of related tokens.
36. A storage medium according to claim 35, wherein at least one such token comprises a branch instruction token, the operation of executing the related tokens further comprising:
interrupting the operation of executing the related tokens upon the execution of the branch instruction token; and
determining those tokens related to the branch instruction token by performing the accessor method associated with the branch instruction token.
37. A storage medium according to claim 33, wherein the operation of parsing the script further comprises:
building a parse tree of the set of tokens, each such token representing a leaf in the parse tree and corresponding to an instruction tag in the script in the received conversation template.
38. A storage medium according to claim 37, wherein the operation of interpreting the parse tree further comprises:
performing a depth first traversal of the parse tree following execution of an object instance corresponding to a non-terminal leaf in the parse tree.
Specification