Dynamic switching between local and remote speech rendering

US 8,024,194 B2
Filed: 12/08/2004
Issued: 09/20/2011
Est. Priority Date: 12/08/2004
Status: Expired due to Fees

Abstract

A multimodal browser for rendering a multimodal document on an end system defining a host can include a visual browser component for rendering visual content, if any, of the multimodal document, and a voice browser component for rendering voice-based content, if any, of the multimodal document. The voice browser component can determine which of a plurality of speech processing configurations is used by the host in rendering the voice-based content. The determination can be based upon the resources of the host running the application. The determination also can be based upon a processing instruction contained in the application.
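The selection described in the abstract can be illustrated with a minimal sketch. Everything named here (`Mode`, `SpeechConfig`, `HostResources`, `choose_config`) is hypothetical and does not come from the patent. The sketch also assumes that a processing instruction, when present, takes precedence over host resources, and that a local engine is preferred whenever the host has one; the abstract states neither rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    LOCAL = "local"
    REMOTE = "remote"


@dataclass(frozen=True)
class SpeechConfig:
    """One speech processing configuration: where TTS and ASR run."""
    tts: Mode
    asr: Mode


@dataclass(frozen=True)
class HostResources:
    # Hypothetical flags; the abstract only refers to "the resources of the host".
    has_local_tts: bool
    has_local_asr: bool


def choose_config(resources: HostResources,
                  instruction: Optional[SpeechConfig] = None) -> SpeechConfig:
    """Pick a configuration from a processing instruction or host resources."""
    # Assumption: an instruction in the application overrides resource-based choice.
    if instruction is not None:
        return instruction
    # Assumption: prefer a local engine when present, otherwise fall back to remote.
    return SpeechConfig(
        tts=Mode.LOCAL if resources.has_local_tts else Mode.REMOTE,
        asr=Mode.LOCAL if resources.has_local_asr else Mode.REMOTE,
    )
```

For a host with a local TTS engine but no local recognizer, `choose_config(HostResources(True, False))` yields local TTS with remote ASR.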

56 Citations

12 Claims

1. A computer-implemented method of running an application on a computing end device defining a host, the application being distributed from a server to the host, the method comprising:
- obtaining the application from the server;
  
  executing a multi-modal browser on the host to run the application, the application employing both text-to-speech (TTS) processing and automatic speech recognition (ASR) processing;
  
  analyzing, by the host, an instruction in the application that instructs the host to perform the TTS processing and/or ASR processing locally or remotely;
  
  determining, by the host, whether the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, wherein:
  
  if the instruction instructs the host to perform the TTS processing and the ASR processing locally, determining, by the host, whether the host supports performing the TTS processing and the ASR processing locally, and
  
  if the instruction instructs the host to perform the TTS processing locally and the ASR processing remotely, determining, by the host, whether the host supports performing the TTS processing locally and the ASR processing remotely;
  
  if the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, executing the application using the TTS processing and/or the ASR processing in accordance with the instruction; and
  
  generating an error indication if the host is not capable of performing the TTS processing and/or ASR processing in accordance with the instruction.
  - 2. The method of claim 1, wherein the application comprises a Web-based application and the server comprises a Web server.
  - 3. The method of claim 2, wherein the Web-based application comprises at least one multimodal Web page.
  - 4. The method of claim 3, wherein the at least one multimodal Web page is an XML document that specifies how a user interacts with the host and the application using a graphical user interface (GUI) and/or speech.

5. A non-transitory computer readable storage medium encoded with a plurality of instructions that, when executed on at least one processor, perform a method of running an application on an end device defining a host, the application being distributed from a server to the host, the method comprising:
- obtaining the application from the server;
  
  executing a multi-modal browser on the host to run the application, the application employing both text-to-speech (TTS) processing and automatic speech recognition (ASR) processing;
  
  analyzing, by the host, an instruction in the application that instructs the host to perform the TTS processing and/or ASR processing locally or remotely;
  
  determining, by the host, whether the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, wherein:
  
  if the instruction instructs the host to perform the TTS processing and the ASR processing locally, determining, by the host, whether the host supports performing the TTS processing and the ASR processing locally, and
  
  if the instruction instructs the host to perform the TTS processing locally and the ASR processing remotely, determining, by the host, whether the host supports performing the TTS processing locally and the ASR processing remotely;
  
  if the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, executing the application using the TTS processing and/or the ASR processing in accordance with the instruction; and
  
  generating an error indication if the host is not capable of performing the TTS processing and/or ASR processing in accordance with the instruction.
  - 6. The non-transitory computer readable storage medium of claim 5, wherein the application comprises a Web-based application and the server comprises a Web server.
  - 7. The non-transitory computer readable storage medium of claim 6, wherein the Web-based application comprises at least one multimodal Web page.
  - 8. The non-transitory computer readable storage medium of claim 7, wherein the at least one multimodal Web page is an XML document that specifies how a user interacts with the host and the application using a graphical user interface (GUI) and/or speech.

9. A host device for running an application that is distributed from a server to the host over at least one network, comprising:
- communication means for communicating over the at least one network, the communication means capable of receiving the application over the at least one network; and
  
  at least one computer coupled to the communication means, the at least one computer programmed to:
  
  execute a multi-modal browser on the host to run the application, the application employing both text-to-speech (TTS) processing and automatic speech recognition (ASR) processing;
  
  analyze an instruction in the application that instructs the host to perform the TTS processing and/or ASR processing locally or remotely;
  
  determine whether the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, wherein:
  
  if the instruction instructs the host to perform the TTS processing and the ASR processing locally, the host determines whether the host supports performing the TTS processing and the ASR processing locally, and
  
  if the instruction instructs the host to perform the TTS processing locally and the ASR processing remotely, the host determines whether the host supports performing the TTS processing locally and the ASR processing remotely;
  
  if the host is capable of performing the TTS processing and/or ASR processing in accordance with the instruction, execute the application using the TTS processing and/or the ASR processing in accordance with the instruction; and
  
  generate an error indication if the host is not capable of performing the TTS processing and/or ASR processing in accordance with the instruction.
  - 10. The host device of claim 9, wherein the application comprises a Web-based application and the server comprises a Web server.
  - 11. The host device of claim 10, wherein the Web-based application comprises at least one multimodal Web page.
  - 12. The host device of claim 11, wherein the at least one multimodal Web page is an XML document that specifies how a user interacts with the host and the application using a graphical user interface (GUI) and/or speech.
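The steps shared by independent claims 1, 5, and 9 (analyze the instruction, check whether the host can comply, then either execute or generate an error indication) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `Location`, `Instruction`, `HostCapabilities`, `UnsupportedSpeechConfiguration`, and `run_application` are hypothetical names, and modeling the error indication as a raised exception is one possible choice among many.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    LOCAL = "local"
    REMOTE = "remote"


@dataclass(frozen=True)
class Instruction:
    """Hypothetical parsed form of the instruction carried in the application."""
    tts: Location
    asr: Location


@dataclass(frozen=True)
class HostCapabilities:
    # Hypothetical capability flags for the host device.
    local_tts: bool
    local_asr: bool
    remote_tts: bool
    remote_asr: bool

    def supports(self, instruction: Instruction) -> bool:
        """Return True only if the host supports both requested placements."""
        tts_ok = self.local_tts if instruction.tts is Location.LOCAL else self.remote_tts
        asr_ok = self.local_asr if instruction.asr is Location.LOCAL else self.remote_asr
        return tts_ok and asr_ok


class UnsupportedSpeechConfiguration(Exception):
    """Assumed form of the 'error indication' named in the claims."""


def run_application(instruction: Instruction, host: HostCapabilities) -> str:
    # Determine whether the host can perform TTS/ASR as instructed.
    if not host.supports(instruction):
        # Generate an error indication rather than silently switching modes.
        raise UnsupportedSpeechConfiguration(
            f"host cannot run TTS {instruction.tts.value} / ASR {instruction.asr.value}"
        )
    # Placeholder for executing the application with the instructed configuration.
    return f"TTS={instruction.tts.value}, ASR={instruction.asr.value}"
```

A host with a local TTS engine and network access to a recognizer, but no local ASR, accepts an instruction for local TTS with remote ASR and rejects one that requires both to run locally.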

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Cross, Charles W. Jr., McCobb, Gerald M., Jaramillo, David
Primary Examiner(s): Wozniak; James S.
Assistant Examiner(s): He; Jialong
Application Number: US11/007,830
Publication Number: US 20060122836A1
Time in Patent Office: 2,477 Days
Field of Search: 704/270.1, 704/270, 704/275, 704/231, 704/258, 704/260, 379/88.17, 709/310, 710/1, 710/36
US Class Current: 704/270
CPC Class Codes:
G10L 15/30   Distributed recognition, e....
H04M 1/72445   for supporting Internet bro...
H04M 2250/74   with voice recognition mean...
H04M 3/493   Interactive information ser...