Methods and apparatus for voiced-enabling a web application

US 9,400,633 B2
Filed: 08/02/2012
Issued: 07/26/2016
Est. Priority Date: 08/02/2012
Status: Active Grant

Abstract

Methods and apparatus for voice-enabling a web application, wherein the web application includes one or more web pages rendered by a web browser on a computer. At least one information source external to the web application is queried to determine whether information describing a set of one or more supported voice interactions for the web application is available, and in response to determining that the information is available, the information is retrieved from the at least one information source. Voice input for the web application is then enabled based on the retrieved information.
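
The lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `VoiceInteraction` record, the `InfoSource` callable type, the in-memory `REGISTRY`, and the example URL are all hypothetical stand-ins for whatever external information source a real system would query.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class VoiceInteraction:
    """One supported voice interaction: a spoken phrase and the capability it invokes."""
    phrase: str
    capability: str


# Assumption: an information source maps a web application's URL to its
# supported interactions, or returns None when it has no information.
InfoSource = Callable[[str], Optional[List[VoiceInteraction]]]

# Hypothetical in-memory registry standing in for a source external to the web application.
REGISTRY: Dict[str, List[VoiceInteraction]] = {
    "https://mail.example.com": [
        VoiceInteraction("compose message", "compose"),
        VoiceInteraction("check inbox", "refresh_inbox"),
    ],
}


def registry_source(app_url: str) -> Optional[List[VoiceInteraction]]:
    """Look the application up in the registry; None means 'not available'."""
    return REGISTRY.get(app_url)


def enable_voice_input(app_url: str, sources: Iterable[InfoSource]) -> Set[str]:
    """Query each source in turn and return the phrases to enable.

    The first source that has information wins; an empty set means no
    source described any voice interactions for this application.
    """
    for source in sources:
        interactions = source(app_url)
        if interactions is not None:
            return {interaction.phrase for interaction in interactions}
    return set()
```

Returning an empty set when nothing is available keeps the "query, then retrieve only if available" order of the abstract explicit without raising an error for applications that have no voice description.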

11 Claims

1. A method of enabling voice interaction for invoking at least one capability of a web application including at least one web page rendered by a web browser, the method comprising:
- detecting, by an agent associated with the web browser, a first document object model event;
  
  analyzing, in response to detecting the first document object model event, a document object model of the at least one web page to identify one or more items in the document object model at a first point in time;
  
  determining based, at least in part, on the identified one or more items, that the at least one web page comprises the at least one capability at the first point in time;
  
  enabling voice input to invoke the at least one capability of the web application in response to the identifying that the at least one web page comprises the at least one capability at the first point in time, wherein enabling voice input comprises updating at least one grammar associated with a speech engine based, at least in part, on the one or more items identified in the document object model at the first point in time;
  
  detecting, by the agent, a second document object model event indicating that a context of the web application has changed since the first point in time;
  
  analyzing, in response to detecting the second document object model event, the document object model of the at least one web page to identify at least one new item in the document object model at a second point in time; and
  
  updating the at least one grammar based, at least in part, on the at least one new item identified in the document object model at the second point in time.

2. The method of claim 1, wherein enabling voice input comprises instructing a voice application to recognize voice commands corresponding to the at least one capability.

3. The method of claim 2, further comprising:
- receiving information from the voice application indicating that a user has spoken a voice command corresponding to the at least one capability; and
  
  performing at least one action in response to receiving the information from the voice application, wherein the at least one action is specified in a data structure accessible to an agent executing in the web browser.

4. The method of claim 1, wherein determining that the at least one web page comprises the at least one capability comprises determining whether the at least one web page includes a particular user interface element.
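
The event-driven flow of claim 1 can be illustrated with a small simulation. This is a sketch under assumptions, not the claimed implementation: `Node` is a simplified stand-in for a DOM element rather than a browser API, the grammar is modeled as a plain set of phrases, and treating buttons and links as the "particular user interface element" of claim 4 is an illustrative choice. A DOM event is modeled as a direct call to `on_dom_event`.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set


@dataclass
class Node:
    """Simplified stand-in for a DOM element (hypothetical model)."""
    tag: str
    label: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def identify_items(root: Node) -> Set[str]:
    """Analyze the tree and collect the labels of labeled buttons and links."""
    return {n.label.lower() for n in walk(root) if n.tag in ("button", "a") and n.label}


class Agent:
    """Keeps a speech grammar in step with the items present in the page."""

    def __init__(self, root: Node) -> None:
        self.root = root
        self.grammar: Set[str] = set()

    def on_dom_event(self) -> Set[str]:
        """Re-analyze the page, add newly found items to the grammar, and return them."""
        items = identify_items(self.root)
        new_items = items - self.grammar
        self.grammar |= new_items
        return new_items


# Usage: the first event enables the initial items; a later event picks up new ones.
page = Node("body", children=[Node("button", "Compose")])
agent = Agent(page)
agent.on_dom_event()                      # first point in time
page.children.append(Node("a", "Inbox"))  # the application's context changes
agent.on_dom_event()                      # second point in time
```

The sketch only adds phrases, mirroring the claim's "at least one new item" wording; pruning phrases for elements that have disappeared is left out.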

5. A non-transitory computer-readable storage medium encoded with a plurality of instructions that, when executed by a computer, performs a method of enabling voice interaction for invoking at least one capability of a web application including at least one web page rendered by a web browser, the method comprising:
- detecting, by an agent associated with the web browser, a first document object model event;
  
  analyzing, in response to detecting the first document object model event, a document object model of the at least one web page to identify one or more items in the document object model at a first point in time;
  
  determining based, at least in part, on the identified one or more items, that the at least one web page comprises the at least one capability at the first point in time;
  
  enabling voice input to invoke the at least one capability of the web application in response to the identifying that the at least one web page comprises the at least one capability at the first point in time, wherein enabling voice input comprises updating at least one grammar associated with a speech engine based, at least in part, on the one or more items identified in the document object model at the first point in time;
  
  detecting, by the agent, a second document object model event indicating that a context of the web application has changed since the first point in time;
  
  analyzing, in response to detecting the second document object model event, the document object model of the at least one web page to identify at least one new item in the document object model at a second point in time; and
  
  updating the at least one grammar based, at least in part, on the at least one new item identified in the document object model at the second point in time.

6. The computer-readable storage medium of claim 5, wherein enabling voice input comprises instructing a voice application to recognize voice commands corresponding to the at least one capability.

7. The computer-readable storage medium of claim 6, wherein the method further comprises:
- receiving information from the voice application indicating that a user has spoken a voice command corresponding to the at least one capability; and
  
  performing at least one action in response to receiving the information from the voice application, wherein the at least one action is specified in a data structure accessible to an agent executing in the web browser.

8. The computer-readable storage medium of claim 5, wherein determining that the at least one web page comprises the at least one capability comprises determining whether the at least one web page includes a particular user interface element.
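
The dispatch step recited in claim 7 (and mirrored in claims 3 and 11) might look something like the following. This is a hedged sketch: the `ACTIONS` table, the `on_voice_command` callback, and the `log` list are hypothetical names, and the claims do not say what form the "data structure accessible to an agent" takes. A dictionary from command phrase to callable is simply one plausible shape.

```python
from typing import Callable, Dict, List

# Records performed actions so the effect of the sketch is observable.
log: List[str] = []

# Hypothetical data structure accessible to the agent: command phrase -> action.
ACTIONS: Dict[str, Callable[[], None]] = {
    "compose": lambda: log.append("clicked Compose button"),
    "inbox": lambda: log.append("followed Inbox link"),
}


def on_voice_command(phrase: str) -> bool:
    """Handle a report from the voice application that a command was spoken.

    Performs the action mapped to the phrase and returns True, or returns
    False without side effects when the phrase has no entry in ACTIONS.
    """
    action = ACTIONS.get(phrase.lower())
    if action is None:
        return False
    action()
    return True
```

Keeping the phrase-to-action mapping in data rather than in branching code means the agent can swap in a different table per web application without changing the dispatch logic.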

9. A computer, comprising:
- at least one processor programmed to:
  
  detect a first document object model event;
  
  analyze, in response to detecting the first document object model event, a document object model of the at least one web page to identify one or more items in the document object model at a first point in time;
  
  determine based, at least in part, on the identified one or more items, that the at least one web page comprises the at least one capability at the first point in time;
  
  enable voice input to invoke the at least one capability of the web application in response to the identifying that the at least one web page comprises the at least one capability at the first point in time, wherein enabling voice input comprises updating at least one grammar associated with a speech engine based, at least in part, on the one or more items identified in the document object model at the first point in time;
  
  detect a second document object model event indicating that a context of the web application has changed since the first point in time;
  
  analyze, in response to detecting the second document object model event, the document object model of the at least one web page to identify at least one new item in the document object model at a second point in time; and
  
  update the at least one grammar based, at least in part, on the at least one new item identified in the document object model at the second point in time.

10. The computer of claim 9, wherein enabling voice input comprises instructing a voice application to recognize voice commands corresponding to the at least one capability.

11. The computer of claim 10, wherein the at least one processor is further programmed to:
- receive information from the voice application indicating that a user has spoken a voice command corresponding to the at least one capability; and
  
  perform at least one action in response to receiving the information from the voice application, wherein the at least one action is specified in a data structure accessible to an agent executing in the web browser.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Reich, David E.; Hardy, Christopher
Primary Examiner(s): Baderman, Scott
Assistant Examiner(s): Velez-Lopez, Mario M
Application Number: US13/565,256
Publication Number: US 20140040722A1
Time in Patent Office: 1,454 Days
Field of Search: 715/234, 704/275
US Class Current: 1/1
CPC Class Codes:
G06F 17/00   Digital computing or data p...
G06F 3/167   Audio in a user interface, ...
G10L 15/22   Procedures used during a sp...
G10L 2015/228   of application context
H04M 3/4938   comprising a voice browser ...
