Methods and apparatus for voice-enabling a web application
Abstract
Methods and apparatus for voice-enabling a web application, wherein the web application includes one or more web pages rendered by a web browser on a computer. At least one information source external to the web application is queried to determine whether information describing a set of one or more supported voice interactions for the web application is available, and in response to determining that the information is available, the information is retrieved from the at least one information source. Voice input for the web application is then enabled based on the retrieved information.
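The lookup described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `VoiceInteractionInfo` record, the idea that an information source is a callable keyed by an application identifier, and the `mail.example` identifier do not come from the patent text, which does not specify a data format or transport.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class VoiceInteractionInfo:
    """Hypothetical record describing the voice interactions an application supports."""
    app_id: str
    commands: List[str] = field(default_factory=list)


# Assumption: an information source maps an application id to its info,
# or returns None when no information is available for that application.
InformationSource = Callable[[str], Optional[VoiceInteractionInfo]]


def make_dict_source(table: Dict[str, List[str]]) -> InformationSource:
    """Build an in-memory source; a real source could be a file or a remote service."""
    def lookup(app_id: str) -> Optional[VoiceInteractionInfo]:
        commands = table.get(app_id)
        if commands is None:
            return None
        return VoiceInteractionInfo(app_id, list(commands))
    return lookup


def enable_voice_input(app_id: str,
                       sources: List[InformationSource]) -> Optional[VoiceInteractionInfo]:
    """Query each source in turn and return the first available description.

    Returning None means no source had information, so voice input stays disabled.
    """
    for source in sources:
        info = source(app_id)
        if info is not None:
            return info
    return None


local = make_dict_source({})
remote = make_dict_source({"mail.example": ["compose", "send"]})
result = enable_voice_input("mail.example", [local, remote])
print(result.commands if result else "voice input not enabled")
```

Querying the sources in order lets a local cache be consulted before a remote one, which is one possible design and not something the abstract requires.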
20 Claims
1. A method of enabling voice interaction for at least one capability of a web application, wherein the web application includes a plurality of web pages rendered by a web browser, the method comprising:
executing, with at least one computer processor, an agent for the web application, wherein the agent is configured to determine an identity of the web application;
determining, by the agent, whether the web application is in a first context or a second context by using Document Object Model (DOM) events in the web browser to identify at least one marker on a web page of the web application that identifies the web application as being in the first context or the second context, wherein the first context corresponds to a first state of the web application in which a first set of user interface elements is displayed on a first web page of the plurality of web pages of the web application and the second context corresponds to a second state of the web application in which a second set of user interface elements is displayed on a second web page of the plurality of web pages of the web application;
receiving first voice input;
enabling, when it is determined that the web application is in the first context, voice interaction for the at least one capability of the web application, wherein the at least one capability is not exposed by the web browser;
recognizing, by a voice application, one or more voice commands in the received first voice input when the voice interaction for the at least one capability of the web application is enabled, wherein the one or more voice commands are associated with the first context; and
performing at least one first action based, at least in part, on the one or more recognized voice commands.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
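The context-gating steps of claim 1 can be sketched as follows. This is a plain Python stand-in for browser-side logic, under several assumptions that are not in the claim: a DOM event is modeled as a call to `on_dom_event` with the element ids currently on the page, a marker is modeled as an element id, and speech recognition is assumed to have already produced a text transcript. The class name `VoiceAgent` and all ids and phrases are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


class VoiceAgent:
    """Hypothetical agent that tracks the application's context and gates commands on it."""

    def __init__(self,
                 markers: Dict[str, str],
                 commands: Dict[str, Dict[str, Callable[[], str]]]) -> None:
        self.markers = markers      # marker element id -> context name
        self.commands = commands    # context name -> {spoken phrase -> action}
        self.context: Optional[str] = None

    def on_dom_event(self, element_ids: Iterable[str]) -> None:
        """Simulated DOM event handler: pick the context whose marker is on the page."""
        present = set(element_ids)
        self.context = next(
            (ctx for marker, ctx in self.markers.items() if marker in present),
            None,
        )

    def handle_voice_input(self, transcript: str) -> Optional[str]:
        """Run the action for a phrase only if it belongs to the current context."""
        active = self.commands.get(self.context, {})
        action = active.get(transcript.strip().lower())
        return action() if action else None


agent = VoiceAgent(
    markers={"inbox-list": "inbox", "compose-form": "compose"},
    commands={
        "inbox": {"new message": lambda: "opened compose form"},
        "compose": {"send": lambda: "message sent"},
    },
)
agent.on_dom_event(["header", "inbox-list"])
print(agent.handle_voice_input("New message"))  # opened compose form
print(agent.handle_voice_input("send"))         # None: "send" belongs to the other context
```

The sketch does not model the claim's requirement that the capability is one not exposed by the web browser; the actions here are placeholders that only return a string.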
9. A non-transitory computer-readable storage medium encoded with a plurality of instructions that, when executed by a computer, perform a method of enabling voice interaction for at least one capability of a web application, wherein the web application includes a plurality of web pages rendered by a web browser, the method comprising:
executing an agent for the web application, wherein the agent is configured to determine an identity of the web application;
determining, by the agent, whether the web application is in a first context or a second context by using Document Object Model (DOM) events in the web browser to identify at least one marker on a web page of the web application that identifies the web application as being in the first context or the second context, wherein the first context corresponds to a first state of the web application in which a first set of user interface elements is displayed on a first web page of the plurality of web pages of the web application and the second context corresponds to a second state of the web application in which a second set of user interface elements is displayed on a second web page of the plurality of web pages of the web application;
receiving first voice input;
enabling, when it is determined that the web application is in the first context, voice interaction for the at least one capability of the web application, wherein the at least one capability is not exposed by the web browser;
recognizing, by a voice application, one or more voice commands in the received first voice input when the voice interaction for the at least one capability is enabled, wherein the one or more voice commands are associated with the first context; and
performing at least one first action based, at least in part, on the one or more recognized voice commands.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A computer for enabling voice interaction for at least one capability of a web application, wherein the web application includes a plurality of web pages rendered by a web browser, the computer comprising:
a voice interface configured to receive first voice input; and
at least one processor programmed to:
execute an agent for the web application, wherein the agent is configured to determine an identity of the web application;
determine, by the agent, whether the web application is in the first context or the second context by using Document Object Model (DOM) events in the web browser to identify at least one marker on a web page of the web application that identifies the web application as being in the first context or the second context, wherein the first context corresponds to a first state of the web application in which a first set of user interface elements is displayed on a first web page of the plurality of web pages of the web application and the second context corresponds to a second state of the web application in which a second set of user interface elements is displayed on a second web page of the plurality of web pages of the web application;
enable, when it is determined that the web application is in the first context, voice interaction for the at least one capability of the web application, wherein the at least one capability is not exposed by the web browser;
recognize, by a voice application, one or more voice commands in the received first voice input when the voice interaction for the at least one capability is enabled, wherein the one or more voice commands are associated with the first context; and
perform at least one first action based, at least in part, on the one or more recognized voice commands.
Dependent claims: 17, 18, 19, 20
Specification