Web enabled recognition architecture

US 7,506,022 B2
Filed: 09/20/2001
Issued: 03/17/2009
Est. Priority Date: 05/04/2001
Status: Expired due to Fees

Abstract

A server/client system for processing data includes a network having a web server with information accessible remotely. A client device includes a microphone and a rendering component such as a speaker or display. The client device is configured to obtain the information from the web server and record input data associated with fields contained in the information. The client device is adapted to send the input data to a remote location with an indication of a grammar to use for recognition. A recognition server receives the input data and the indication of the grammar. The recognition server returns data indicative of what was recognized to at least one of the client and the web server.
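
The data flow in the abstract can be illustrated with a minimal Python sketch. Nothing here comes from the patent itself: the class and function names, the flat phrase-list grammar, and the use of text in place of recorded audio are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RecognitionRequest:
    """What a client sends: recorded input plus an indication of the grammar."""
    field_name: str      # field the input is associated with
    input_data: str      # stand-in for recorded audio (plain text, for illustration)
    grammar: List[str]   # hypothetical grammar: a flat list of allowed phrases


@dataclass
class RecognitionResult:
    field_name: str
    recognized: Optional[str]  # None means nothing in the grammar matched


def recognize(request: RecognitionRequest) -> RecognitionResult:
    """Toy recognition server: match the input against the supplied grammar."""
    heard = request.input_data.strip().lower()
    for phrase in request.grammar:
        if phrase.lower() == heard:
            return RecognitionResult(request.field_name, phrase)
    return RecognitionResult(request.field_name, None)


def fill_field(form: Dict[str, str], result: RecognitionResult) -> Dict[str, str]:
    """Client or web server side: store the recognized value in the form."""
    if result.recognized is not None:
        form[result.field_name] = result.recognized
    return form


if __name__ == "__main__":
    request = RecognitionRequest("card_type", "Visa", ["Visa", "MasterCard", "Amex"])
    print(fill_field({}, recognize(request)))
```

The recognition result is returned as plain data so that either the client or the web server could consume it, matching the "at least one of the client and the web server" wording.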

24 Claims

1. A server/client system for processing data, the system comprising:
- a network comprising;
  
  a web server having information accessible remotely;
  
  a recognition server;
  
  a first client device adapted to receive information from the web server and having a visual interface browser to access information from the web server and a rendering device to visually indicate fields to be entered, the first client device configured to record input speech data associated with each of the fields upon an indication by a user of the first client device of which field subsequent input is intended for, and wherein the first client device is adapted to send the input speech data to the recognition server remote from the first client device;
  
  a second client device, remote from the first client device, having a microphone and a speaker and adapted to receive information from the web server, the second client device configured to record input speech data associated with each of a set of fields in response to prompts given to a user of the second client device, and wherein the second client device is adapted to send the input speech data to the same recognition server as used by the first client device, the recognition server being remote from the second client device, wherein the second client device comprises a telephone and a voice browser capable of rendering the information from the web server audibly; and
  
  wherein the recognition server is configured to receive the input speech data from both of the client devices separately, process the input speech data from each client device using an associated grammar, and return data indicative of what was recognized to at least one of the client device providing the input speech data and the web server; and
  
  wherein the recognition server is configured to receive data indicative of a prompt for the user to be used when the recognition results are indicative of no recognition of the input speech data from one of the client devices, convert the data indicative of the prompt to audible speech data when the recognition results are indicative of no recognition of the input speech data from said one of the client devices, and send the audible speech data to said one of the client devices over the wide area network.
- Dependent claims (2 to 14):
  - 2. The system of claim 1 wherein each client device is adapted to normalize the input speech data prior to sending the input speech data to the recognition server.
  - 3. The system of claim 1 wherein the information received from the web server and provided to each of the client devices is a markup language.
  - 4. The system of claim 3 wherein the markup language received by the client devices comprises one or several markup portions and one or several script portions.
  - 5. The system of claim 4 wherein the markup language comprises one of HTML, XHTML, cHTML, XML and WML.
  - 6. The system of claim 4 wherein the markup language includes an indication associating a grammar with a field, the indication having the same form from each of the client devices.
  - 7. The system of claim 6 wherein the recognition server receives the input speech data and the indication of the grammar.
  - 8. The system of claim 7 wherein the grammar is stored on each of the client devices and transferred to the recognition server with the input speech data.
  - 9. The system of claim 1 wherein each of the client devices is adapted to normalize the input speech data prior to sending the input speech data to the recognition server.
  - 10. The system of claim 1 wherein the web server includes a server side plug-in module for dynamically generating markup language for each of the client devices.
  - 11. The system of claim 10 wherein the server side plug-in module dynamically generates markup language as a function of the type of client device.
  - 12. The system of claim 11 wherein the server side plug-in module detects the type of client device.
  - 13. The system of claim 10 wherein the web server includes a plurality of dialog modules accessible by the server side plug-in module, each dialog module pertaining to obtaining data using speech recognition, the server side plug-in module generating the markup language as a function of a dialog module.
  - 14. The system of claim 1 wherein the web server and the recognition server are located on a single machine.
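
Claims 10 to 12 describe a server side plug-in module that detects the type of client device and generates markup accordingly. One way this might look is sketched below in Python. The detection rule, the element names (`prompt`, `input`), and the `grammar` attribute are hypothetical choices for illustration, not the markup defined in the patent.

```python
from typing import Dict, Tuple
from xml.sax.saxutils import escape, quoteattr

VISUAL = "visual"
VOICE = "voice"


def detect_client_type(user_agent: str) -> str:
    """Hypothetical rule: agents naming a voice browser are treated as telephony."""
    return VOICE if "voicebrowser" in user_agent.lower() else VISUAL


def generate_markup(fields: Dict[str, Tuple[str, str]], client_type: str) -> str:
    """Build per-client markup from one shared field description.

    `fields` maps a field name to (prompt text, grammar reference). The
    grammar attribute has the same form for both client types.
    """
    parts = []
    for name, (prompt, grammar_ref) in fields.items():
        if client_type == VOICE:
            # Voice browser: the field is rendered as an audible prompt.
            parts.append(
                f"<prompt field={quoteattr(name)} grammar={quoteattr(grammar_ref)}>"
                f"{escape(prompt)}</prompt>"
            )
        else:
            # Visual browser: the field is rendered as an input the user selects.
            parts.append(
                f"<input name={quoteattr(name)} grammar={quoteattr(grammar_ref)} "
                f"title={quoteattr(prompt)}/>"
            )
    return "<form>" + "".join(parts) + "</form>"


if __name__ == "__main__":
    fields = {"city": ("Which city?", "grammars/city.xml")}
    for agent in ("Mozilla/5.0", "ExampleVoiceBrowser/1.0"):
        print(generate_markup(fields, detect_client_type(agent)))
```

Keeping the grammar association identical across both outputs mirrors claim 6, where the indication has the same form for each client device.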

15. A server/client system for processing data, the system comprising:
- a network comprising;
  
  a web server having information accessible remotely;
  
  a recognition server;
  
  a first client device receiving information from the web server and having a visual interface browser to access information from the web server and a rendering device to visually indicate fields to be entered, the first client device recording input data associated with each of the fields upon an indication by a user of the first client device of which field subsequent input is intended for, and wherein the first client device sends the input data to the recognition server;
  
  a second client device, remote from the first client device, comprising a telephone and having a browser, a microphone and an audible rendering component, the second client device configured obtaining the information from the web server, the information having corresponding fields, the second client device further recording input data associated with each of the fields, and wherein the second client device is adapted to send the input data to the recognition server; and
  
  wherein the recognition server is remote from and operatively connected to the web server, the first client device and the second client device via a wide area network, the same recognition server receiving the input data from both the first client device and the second client device separately and returning data indicative of what was recognized based on an associated grammar to at least one of the client devices providing the input data and the web server; and
  
  wherein the recognition server receives data indicative of a prompt for the user to be used when the recognition results are indicative of no recognition of the input from one of the client devices, converts the data indicative of the prompt to audible speech data when the recognition results are indicative of no recognition of the input from said one of the client devices, and sends the audible speech data to said one of the client devices over the wide area network.
- Dependent claims (16 to 19):
  - 16. The system of claim 15 wherein the information received from the web server and provided to each of the client devices is a markup language.
  - 17. The system of claim 16 wherein the markup language comprises one of HTML, XHTML, cHTML, XML and WML.
  - 18. The system of claim 15 wherein the grammar is stored on at least one of the client devices and transferred to the recognition server with the input data.
  - 19. The system of claim 15 wherein the grammar is stored on the recognition server and wherein the indication of the grammar includes a reference to the grammar for the recognition server.

20. A method for processing voice recognition in a client/server system comprising:
- transmitting information from a web server a markup language page having extensions configured to obtain input data from a user a first client device and a user of a second client device, wherein the first client device and the second client device are remote from each other and communicate with the web server over a wide area network, the first client device having a visual interface browser to access information from the web server and a visual rendering device, and the second client device comprising a telephone and a voice browser to access information from the web server;
  
  rendering the markup language page on each of the client devices;
  
  obtaining input data as a function of input from each of the users of the corresponding client devices;
  
  transmitting the input data and an indication of an associated grammar over the wide area network to a single recognition server remote from each of the client devices, the recognition server being connected to the wide area network;
  
  processing the input data with the associated grammar using the single recognition server;
  
  transmitting a recognition result from the single recognition server indicative of what was inputted from each client device over the wide area network to at least one of the corresponding client device providing the input and the web server;
  
  receiving over the wide area network and at the single recognition server data indicative of a prompt for the user to be used when the recognition results are indicative of no recognition of the input from one of the client devices;
  
  converting at the single recognition server the data indicative of the prompt to audible speech data when the recognition results are indicative of no recognition of the input from said one of the client devices; and
  
  sending the audible speech data to said one of the client devices over the wide area network.
- Dependent claims (21 to 24):
  - 21. The method of claim 20 wherein rendering the markup language includes rendering audible prompts.
  - 22. The method of claim 20 wherein the markup language comprises one of HTML, XHTML, cHTML, XML and WML.
  - 23. The method of claim 20 wherein transmitting the indication of the grammar comprises transmitting the grammar.
  - 24. The method of claim 20 wherein transmitting the indication of the grammar comprises transmitting a reference to the recognition server as to where the grammar is located.
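
The server-side steps of claim 20, together with the two forms of grammar indication in claims 23 and 24, can be sketched roughly as follows. This is an illustrative Python outline under stated assumptions: the grammar store, the phrase-list grammar format, the text stand-in for input audio, and the `synthesize` stub are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# Grammars the recognition server already holds, keyed by reference (hypothetical).
SERVER_GRAMMARS: Dict[str, List[str]] = {"grammars/yes_no": ["yes", "no"]}


@dataclass
class Reply:
    recognized: Optional[str] = None  # recognition result, if any
    audio: Optional[bytes] = None     # audible prompt, sent on no recognition


def resolve_grammar(indication: Union[str, List[str]]) -> List[str]:
    """The indication is either a reference to a grammar or the grammar itself."""
    if isinstance(indication, str):
        return SERVER_GRAMMARS[indication]
    return indication


def synthesize(text: str) -> bytes:
    """Stand-in for text-to-speech; a real server would return audio samples."""
    return b"AUDIO:" + text.encode("utf-8")


def process(input_data: str,
            indication: Union[str, List[str]],
            no_reco_prompt: str) -> Reply:
    """Recognize against the grammar; on no recognition, return the spoken prompt."""
    grammar = [phrase.lower() for phrase in resolve_grammar(indication)]
    heard = input_data.strip().lower()
    if heard in grammar:
        return Reply(recognized=heard)
    return Reply(audio=synthesize(no_reco_prompt))


if __name__ == "__main__":
    print(process("Yes", "grammars/yes_no", "Please say yes or no."))
    print(process("maybe", ["yes", "no"], "Please say yes or no."))
```

The prompt text travels with the request so the server can convert it to audio only when recognition fails, which is the ordering the last three steps of claim 20 describe.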

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Hon, Hsiao-Wuen; Wang, Kuansan
Primary Examiner(s): Barqadle, Yasin M
Application Number: US09/960,232
Publication Number: US 20030009517A1
Time in Patent Office: 2,735 Days
Field of Search: 709/201-203, 709/206, 709/217-219, 709/227-230, 704/270-278, 704/251, 704/27.1
US Class Current: 709/203
CPC Class Codes

G06F 1/1626   with a single-body enclosur...

G06F 1/1698   the I/O peripheral being a ...

G06F 3/16   Sound input; Sound output s...

G06F 3/167   Audio in a user interface, ...

G06F 40/117   Tagging; Marking up details...

G10L 15/30   Distributed recognition, e....

H04M 1/271   controlled by voice recogni...

H04M 1/72445   for supporting Internet bro...

H04M 2207/40   terminals with audio html b...

H04M 2250/74   with voice recognition means

H04M 3/493   Interactive information ser...

H04M 3/4936   Speech interaction details ...
