Voice recognition of commands extracted from user interface screen devices
First Claim
1. A method comprising:
using a computing system having at least one processor to perform a process, the process comprising;
creating a voice command mapping in response to loading a user interface page at the computing system by;
identifying a markup language description of the user interface page loaded at the computing system, the identification of the markup language description occurring after loading the user interface page at the computing system; and
generating the voice command mapping for the user interface page loaded at the computing system, wherein the voice command mapping maps a recognized word or phrase associated with at least one operation to one or more voice commands by parsing the markup language description identified from the user interface page to identify at least one user interface object specified by the markup language description configured to perform at least one operation responsive to a keyboard, mouse, or pointing device, the parsing of the markup language description being performed after loading the user interface page at the computing system, and the parsing does not create a modified version of the user interface page, wherein the voice command mapping uses a hash map data structure to store a relationship between at least one respective word or phrase to the at least one operation;
processing an utterance in response to receiving the utterance at the computing system, the computing system displaying the user interface page, by;
converting the utterance into a text representation of the utterance;
determining a plurality of matches between the text representation of the utterance and multiple matching voice commands based on the voice command mapping for the user interface page loaded at the computing system; and
performing a confirmation of a single matching voice command from among the plurality of matches.
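The utterance-processing steps recited above can be illustrated with a minimal sketch. It assumes the utterance has already been converted to text by some speech recognizer (not shown) and that a mapping already exists. The names `find_matches` and `confirm`, the word-containment matching rule, and the operation strings are hypothetical choices made for illustration; the claim does not specify them.

```python
def find_matches(text, mapping):
    """Return every voice command phrase that contains all words of the recognized text."""
    words = text.lower().split()
    if not words:
        return []
    return sorted(p for p in mapping if all(w in p.split() for w in words))


def confirm(matches, choose):
    """Narrow a plurality of matches to a single command via a caller-supplied callback."""
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]
    choice = choose(matches)
    return choice if choice in matches else None


# Hash map (a Python dict) relating a word or phrase to an operation.
# The phrases and operation identifiers below are invented for this example.
mapping = {
    "save": "click:save-btn",
    "save as draft": "click:draft-btn",
    "cancel": "click:cancel-btn",
}

matches = find_matches("save", mapping)                    # more than one phrase matches
selected = confirm(matches, lambda options: options[0])   # stand-in for asking the user
operation = mapping[selected]
```

In a real interface the `choose` callback would present the candidate commands to the user and return the one the user confirms; the lambda here simply picks the first candidate so the example runs unattended.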
Abstract
A method, system, and computer program product for human interface design. Embodiments proceed upon receiving a markup language description of user interface pages (e.g., HTML pages), then, without modifying the user interface page, parsing the markup language description to identify user interface objects configured to perform an operation responsive to a keyboard or mouse or pointing device. One or more mapping techniques serve to relate the parsed-out operation(s) to one or more voice commands. In some embodiments, the parser recognizes interface objects in forms such as a button, a textbox, a checkbox, or an option menu, and the voice commands correspond to an aspect that is displayed when rendering the interface object (e.g., a button label, a menu option, etc.). After receiving a user utterance, the utterance is converted into a text representation which in turn is mapped to voice commands that were parsed from the user interface page.
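As a rough sketch of the parsing step the abstract describes, the following uses Python's standard `html.parser` to read a page's markup without modifying it and to collect the displayed labels of buttons, inputs, and menu options into a dictionary (a hash map). The class name `CommandExtractor`, the function `build_voice_command_mapping`, the set of element types handled, and the sample page are all assumptions made for illustration, not details taken from the patent.

```python
from html.parser import HTMLParser


class CommandExtractor(HTMLParser):
    """Collects displayed labels of interactive elements; the markup is only read."""

    def __init__(self):
        super().__init__()
        self.mapping = {}   # spoken phrase -> (element kind, identifier)
        self._open = None   # (tag, identifier) of the element whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("button", "option"):
            self._open = (tag, attrs.get("id") or attrs.get("value"))
            self._text = []
        elif tag == "input" and attrs.get("type") in ("button", "submit", "checkbox"):
            label = attrs.get("value") or attrs.get("name")
            if label:
                self.mapping[label.strip().lower()] = ("input", attrs.get("id"))

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            # Normalize whitespace and case so the label can be compared to recognized text.
            label = " ".join("".join(self._text).split()).lower()
            if label:
                self.mapping[label] = self._open
            self._open = None


def build_voice_command_mapping(markup):
    """Parse the markup string and return a phrase -> element hash map."""
    parser = CommandExtractor()
    parser.feed(markup)
    parser.close()
    return parser.mapping


# Hypothetical page used only to exercise the sketch.
page = """
<form>
  <button id="save-btn">Save</button>
  <input type="submit" id="go" value="Search">
  <select id="country"><option value="ca">Canada</option></select>
</form>
"""
mapping = build_voice_command_mapping(page)
```

Each key is the label a user would see when the element is rendered, which is the aspect the abstract says a voice command corresponds to; the value records which element to act on when that phrase is recognized.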
27 Citations
20 Claims
1. A method comprising:
using a computing system having at least one processor to perform a process, the process comprising; creating a voice command mapping in response to loading a user interface page at the computing system by; identifying a markup language description of the user interface page loaded at the computing system, the identification of the markup language description occurring after loading the user interface page at the computing system; and generating the voice command mapping for the user interface page loaded at the computing system, wherein the voice command mapping maps a recognized word or phrase associated with at least one operation to one or more voice commands by parsing the markup language description identified from the user interface page to identify at least one user interface object specified by the markup language description configured to perform at least one operation responsive to a keyboard, mouse, or pointing device, the parsing of the markup language description being performed after loading the user interface page at the computing system, and the parsing does not create a modified version of the user interface page, wherein the voice command mapping uses a hash map data structure to store a relationship between at least one respective word or phrase to the at least one operation; processing an utterance in response to receiving the utterance at the computing system, the computing system displaying the user interface page, by; converting the utterance into a text representation of the utterance; determining a plurality of matches between the text representation of the utterance and multiple matching voice commands based on the voice command mapping for the user interface page loaded at the computing system; and performing a confirmation of a single matching voice command from among the plurality of matches. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. A computer program product embodied in a non-transitory computer readable medium, the non-transitory computer readable medium having stored thereon a sequence of instructions which, when executed by a processor causes the processor to execute a process, the process comprising:
creating a voice command mapping in response to loading a user interface page at a computing system by; identifying a markup language description of the user interface page loaded at the computing system, the identification of the markup language description occurring after loading the user interface page at the computing system; and generating the voice command mapping for the user interface page loaded at the computing system, wherein the voice command mapping maps a recognized word or phrase associated with at least one operation to one or more voice commands by parsing the markup language description identified from the user interface page to identify at least one user interface object specified by the markup language description configured to perform at least one operation responsive to a keyboard, mouse, or pointing device, the parsing of the markup language description being performed after loading the user interface page at the computing system, and the parsing does not create a modified version of the user interface page, wherein the voice command mapping uses a hash map data structure to store a relationship between at least one respective word or phrase to the at least one operation; processing an utterance in response to receiving the utterance at the computing system, the computing system displaying the user interface page, by; converting the utterance into a text representation of the utterance; determining a plurality of matches between the text representation of the utterance and multiple matching voice commands based on the voice command mapping for the user interface page loaded at the computing system; and performing a confirmation of a single matching voice command from among the plurality of matches. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17)
18. A computer system comprising:
a parser module for processing data, wherein the parser module is stored in memory, the parser module to identify a markup language description of a user interface page loaded at the computer system, identification of the markup language description occurring after loading the user interface page at the computer system, and the parser module to generate a voice command mapping for the user interface page loaded at the computer system in response to loading the user interface page at the computer system, wherein the voice command mapping maps a recognized word or phrase associated with at least one operation to one or more voice commands by parsing the markup language description identified from the user interface page to identify at least one user interface object specified by the markup language description configured to perform at least one operation responsive to a keyboard, mouse, or pointing device, the parsing of the markup language description being performed after loading the user interface page at the computer system, and the parsing does not create a modified version of the user interface page, wherein the voice command mapping uses a hash map data structure to store a relationship between at least one respective word or phrase to the at least one operation; a receiving module for processing data, wherein the receiving module is stored in memory, the receiving module to process an utterance in response to receiving an utterance at the computer system displaying the user interface page and to convert the utterance into a text representation of the utterance wherein the utterance is used to determine a plurality of matches between the text representation of the utterance and multiple matching voice commands based on the voice command mapping for the user interface page loaded at the computer system; and a confirmation module for processing data, wherein the receiving module is stored in memory, the confirmation module to perform a confirmation of a single matching voice command from among the plurality of matches. - View Dependent Claims (19, 20)
Specification