Methods and apparatus for voice-enabling a web application
First Claim
1. A method of determining a collective set of supported voice interactions for a plurality of frames of a web page displayed in a window of a web browser, wherein each of the plurality of frames includes content for a different web application, wherein the content for each of the plurality of frames is displayed simultaneously in the window of the web browser, wherein the plurality of frames includes a first frame and a second frame, wherein the first frame displays content for a first web application rendered by the web browser and the second frame displays content for a second web application rendered by the web browser, wherein the first web application is different from the second web application, the method comprising:
identifying a first data structure that includes information identifying a plurality of contexts of the first web application and supported voice interactions for the first web application in each of the plurality of contexts of the first web application;
determining a first current context of the first web application, wherein determining the first current context comprises analyzing whether a particular marker is present in the content displayed in the first frame;
determining based, at least in part, on the first current context of the first web application and the information included in the first data structure, a first set of supported voice interactions available for the first frame;
identifying a second data structure that includes information identifying a plurality of contexts of the second web application and supported voice interactions for the second web application in each of the plurality of contexts of the second web application;
determining based, at least in part, on a second current context of the second web application and the information included in the second data structure, a second set of supported voice interactions available for the second frame;
determining the collective set of supported voice interactions based on the first set of supported voice interactions and the second set of voice interactions; and
instructing an external speech engine to recognize voice input corresponding to the collective set of voice interactions.
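The claimed method amounts to: per frame, look up a context-to-interactions table (the claim's "data structure"), detect the current context by checking for a marker in the frame's content, take the union of the per-frame interaction sets, and hand that collective set to a speech engine. A minimal sketch, assuming hypothetical names throughout (`determine_context`, `collective_voice_interactions`, the frame dictionary layout) — none of these identifiers come from the patent:

```python
# Illustrative sketch of the claimed method; all names and the frame
# layout are assumptions, not taken from the patent.

def determine_context(frame_content: str, markers: dict) -> str:
    """Determine a web application's current context by analyzing
    whether a particular marker is present in the frame's content."""
    for context, marker in markers.items():
        if marker in frame_content:
            return context
    return "default"

def collective_voice_interactions(frames: list) -> set:
    """Union the per-frame supported voice interactions.

    Each frame is a dict with:
      'content' - rendered content of the frame
      'markers' - context name -> marker string to look for
      'table'   - context name -> set of supported voice interactions
                  (the claim's per-application "data structure")
    """
    collective = set()
    for frame in frames:
        context = determine_context(frame["content"], frame["markers"])
        collective |= frame["table"].get(context, set())
    return collective

# Example: two frames hosting two different web applications.
frames = [
    {
        "content": "<div id='player-controls'>...</div>",
        "markers": {"playback": "player-controls"},
        "table": {"playback": {"play", "pause", "stop"},
                  "default": {"open"}},
    },
    {
        "content": "<form class='search-box'>...</form>",
        "markers": {"search": "search-box"},
        "table": {"search": {"search for", "clear"},
                  "default": set()},
    },
]

grammar = collective_voice_interactions(frames)
# An external speech engine would then be instructed to recognize
# only this collective set, e.g. speech_engine.set_grammar(grammar).
```

The union step is what makes the set "collective": a single grammar covers both simultaneously displayed applications, so the speech engine need not know which frame a spoken command targets.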
Abstract
Methods and apparatus for voice-enabling a web application, wherein the web application includes one or more web pages rendered by a web browser on a computer. At least one information source external to the web application is queried to determine whether information describing a set of one or more supported voice interactions for the web application is available, and in response to determining that the information is available, the information is retrieved from the at least one information source. Voice input for the web application is then enabled based on the retrieved information.
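The abstract's flow can be sketched as: query an information source external to the web application, and enable voice input only when interaction information comes back. A hedged illustration; the registry, URL, and function names below are invented for the example and are not part of the patent:

```python
# Illustrative sketch of the abstract's flow; the registry contents,
# URL, and function signatures are assumptions, not from the patent.

SUPPORTED_INTERACTIONS_REGISTRY = {
    # URL of a web application -> its supported voice interactions
    "https://example.com/mail": {"compose", "reply", "delete"},
}

def query_external_source(app_url: str):
    """Query an information source external to the web application for
    its supported voice interactions; return None if unavailable."""
    return SUPPORTED_INTERACTIONS_REGISTRY.get(app_url)

def enable_voice_input(app_url: str) -> set:
    """Enable voice input only when interaction info is available."""
    info = query_external_source(app_url)
    if info is None:
        return set()   # no info retrieved: voice input stays disabled
    return set(info)   # grammar the browser would activate
```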
Claims (20)
1. (Independent claim; full text reproduced above under "First Claim".) Dependent claims: 2, 3, 4, 5, 6, 7.
8. A non-transitory computer-readable storage medium encoded with a plurality of instructions that, when executed by a computer, perform a method of determining a collective set of supported voice interactions for a plurality of frames of a web page displayed in a window of a web browser, wherein each of the plurality of frames includes content for a different web application, wherein the content for each of the plurality of frames is displayed simultaneously in the window of the web browser, wherein the plurality of frames includes a first frame and a second frame, wherein the first frame displays content for a first web application rendered by the web browser and the second frame displays content for a second web application rendered by the web browser, wherein the first web application is different from the second web application, the method comprising:
identifying a first data structure that includes information identifying a plurality of contexts of the first web application and supported voice interactions for the first web application in each of the plurality of contexts of the first web application;
determining a first current context of the first web application, wherein determining the first current context comprises analyzing whether a particular marker is present in the content displayed in the first frame;
determining based, at least in part, on the first current context of the first web application and the information included in the first data structure, a first set of supported voice interactions available for the first frame;
identifying a second data structure that includes information identifying a plurality of contexts of the second web application and supported voice interactions for the second web application in each of the plurality of contexts of the second web application;
determining based, at least in part, on a second current context of the second web application and the information included in the second data structure, a second set of supported voice interactions available for the second frame;
determining the collective set of supported voice interactions based on the first set of supported voice interactions and the second set of voice interactions; and
instructing an external speech engine to recognize voice input corresponding to the collective set of voice interactions.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A computer for determining a collective set of supported voice interactions for a plurality of frames of a web page displayed in a window of a web browser, wherein each of the plurality of frames includes content for a different web application, wherein the content for each of the plurality of frames is displayed simultaneously in the window of the web browser, wherein the plurality of frames includes a first frame and a second frame, wherein the first frame displays content for a first web application rendered by the web browser and the second frame displays content for a second web application rendered by the web browser, wherein the first web application is different from the second web application, the computer comprising:
at least one processor programmed to:
identify a first data structure that includes information identifying a plurality of contexts of the first web application and supported voice interactions for the first web application in each of the plurality of contexts of the first web application;
determine a first current context of the first web application, wherein determining the first current context comprises analyzing whether a particular marker is present in the content displayed in the first frame;
determine based, at least in part, on the first current context of the first web application and the information included in the first data structure, a first set of supported voice interactions available for the first frame;
identify a second data structure that includes information identifying a plurality of contexts of the second web application and supported voice interactions for the second web application in each of the plurality of contexts of the second web application;
determine based, at least in part, on a second current context of the second web application and the information included in the second data structure, a second set of supported voice interactions available for the second frame;
determine the collective set of supported voice interactions based on the first set of supported voice interactions and the second set of voice interactions; and
instruct an external speech engine to recognize voice input corresponding to the collective set of voice interactions.
Dependent claims: 16, 17, 18, 19, 20.
Specification