Synchronizing visual and speech events in a multimodal application
Abstract
Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving speech from a user; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.
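The sequence in the abstract — receive speech, determine a semantic interpretation, call a global application update handler, execute an additional function, then update the visual element, voice form, and state table and restart the voice form — can be sketched as ordinary event-handling code. The class and method names below are hypothetical stand-ins chosen to mirror the abstract's terms; this is an illustrative sketch, not the patented implementation.

```python
# Illustrative sketch of the abstract's flow. All names are hypothetical,
# chosen only to mirror the abstract's vocabulary.

class MultimodalApp:
    def __init__(self):
        self.visual_element = None    # e.g. text shown in the visual page
        self.voice_form_active = False
        self.state_table = {}         # tracks which form fields are filled

    def semantic_interpretation(self, speech):
        # Stand-in for a speech engine returning a semantic result.
        return speech.strip().lower()

    def identify_additional_function(self, interpretation):
        # "identifying ... an additional processing function in dependence
        # upon the semantic interpretation" — here a trivial stand-in.
        return lambda value: self.state_table.__setitem__("last", value)

    def global_application_update_handler(self, interpretation):
        # Identify and execute the additional processing function ...
        additional_fn = self.identify_additional_function(interpretation)
        additional_fn(interpretation)
        # ... then, per "typical embodiments": update the visual element,
        # update the voice form / state table, and restart the voice form.
        self.visual_element = interpretation
        self.update_voice_form(interpretation)
        self.restart_voice_form()

    def update_voice_form(self, interpretation):
        self.state_table[interpretation] = "filled"

    def restart_voice_form(self):
        self.voice_form_active = True

    def on_speech(self, speech):
        # receiving speech from a user; determining a semantic
        # interpretation; calling the global application update handler
        interpretation = self.semantic_interpretation(speech)
        self.global_application_update_handler(interpretation)


app = MultimodalApp()
app.on_speech("  Chicago ")
# Visual element and voice-form state table now agree on "chicago".
```

Routing every update through one global handler, rather than per-field handlers inside the voice form, is what keeps the visual and speech sides from drifting apart: each recognized utterance flows through a single synchronization point.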
17 Claims
1. A method for synchronizing visual and speech events in a multimodal application, the method comprising:

- calling a voice form of the multimodal application, wherein the multimodal application is run using at least one computer processor, wherein the multimodal application provides a multimodal web page to a client device over a network;
- receiving speech from a user;
- determining a semantic interpretation of at least a portion of the speech using the voice form;
- calling a global application update handler of the multimodal application and exiting the voice form;
- identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation, wherein the additional processing function is independent of the voice form; and
- executing the additional processing function to synchronize visual and speech events in the multimodal application,

wherein determining a semantic interpretation of at least a portion of the speech comprises determining a plurality of semantic interpretations of the at least a portion of the speech, and wherein identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation comprises identifying, by the global application update handler, an additional processing function for each of the plurality of semantic interpretations.

View Dependent Claims (2, 3, 4, 5, 6)
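Claim 1's final clauses require the global application update handler to identify one additional processing function for each of a plurality of semantic interpretations, with each function independent of the voice form that produced the speech result. A minimal way to model that identification step is a dispatch table keyed by interpretation; the registry, slot names, and functions below are hypothetical illustrations, not part of the claim.

```python
# Hypothetical sketch: the handler identifies an additional processing
# function for EACH semantic interpretation. The functions live outside
# any voice form, i.e. they are independent of it.

events = []  # records which function ran, with which interpreted value

def update_city_field(value):
    events.append(("city", value))

def update_state_field(value):
    events.append(("state", value))

# Registry mapping semantic interpretations to processing functions.
ADDITIONAL_FUNCTIONS = {
    "city": update_city_field,
    "state": update_state_field,
}

def global_application_update_handler(interpretations):
    """Identify and execute an additional processing function for each
    semantic interpretation (the plural case in claim 1)."""
    for slot, value in interpretations:
        fn = ADDITIONAL_FUNCTIONS.get(slot)
        if fn is not None:
            fn(value)

# One utterance (say, "Chicago, Illinois") yielding two interpretations:
global_application_update_handler([("city", "Chicago"),
                                   ("state", "Illinois")])
```

Because the lookup happens per interpretation, a single utterance that fills several semantic slots triggers several independent updates, which is how one spoken phrase can synchronize multiple visual elements at once.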
7. A system for synchronizing visual and speech events in a multimodal application, the system comprising:
- at least one computer processor;
- at least one computer memory operatively coupled to the computer processor; and
- computer program instructions disposed within the computer memory that, when executed, cause the at least one computer processor to:
- call a voice form of the multimodal application, wherein the multimodal application provides a multimodal web page to a client device over a network;
- receive speech from a user;
- determine a plurality of semantic interpretations of at least a portion of the speech using the voice form;
- call a global application update handler of the multimodal application and exit the voice form;
- identify, by the global application update handler, an additional processing function in dependence upon the semantic interpretation for each of the plurality of semantic interpretations, wherein the additional processing function is independent of the voice form; and
- execute the additional processing function to synchronize visual and speech events in the multimodal application.

View Dependent Claims (8, 9, 10, 11, 12)
13. A non-transitory computer-readable storage medium comprising instructions that, when executed on at least one processor in a computer, perform a method of synchronizing visual and speech events in a multimodal application, the method comprising:
- calling a voice form of the multimodal application, wherein the multimodal application provides a multimodal web page to a client device over a network;
- receiving speech from a user;
- determining a semantic interpretation of at least a portion of the speech using the voice form;
- calling a global application update handler of the multimodal application and exiting the voice form;
- identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation, wherein the additional processing function is independent of the voice form; and
- executing the additional processing function to synchronize visual and speech events in the multimodal application,

wherein determining a semantic interpretation of at least a portion of the speech comprises determining a plurality of semantic interpretations of the at least a portion of the speech; and wherein identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation comprises identifying, by the global application update handler, an additional processing function for each of the plurality of semantic interpretations.

View Dependent Claims (14, 15, 16, 17)
Specification