Speech recognition and speaker verification using distributed speech processing
Abstract
Processing a speech utterance by communicating between a local computer and a remote computer using a hyper text communication session. The local computer sends a recording of a speech utterance to the remote computer in the session, and receives a result from the remote computer, the result based on a processing of the recording at the remote computer.
47 Claims
1. A method for processing a speech utterance in which a local computer accesses instructions from computer storage and executes the instructions to perform steps of:
recording a speech utterance from a user using the local computer;
communicating between the local computer and a remote computer using a hyper text communication session, including:
sending the recording of the speech utterance from the local computer to the remote computer in the session; and
receiving a result from the remote computer, the result based on a processing of the recording at the remote computer including analyzing the speech utterance in the recording using a speech recognition application at the remote computer;
wherein the computer-implemented method further comprises using the local computer to receive a script that includes a universal resource locator of an application program that is run by the remote computer to process the recording, the script includes an instruction that instructs the local computer to perform a task based on the result received from the remote computer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

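For illustration only, the client-side flow recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the `Script` class, the `send` callable, the `audio/wav` media type, and the example URL are all hypothetical names introduced here. The transport is injected so the sketch runs without a network.

```python
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class Script:
    """Hypothetical model of the script the local computer receives."""
    app_url: str                     # URL of the application run by the remote computer
    on_result: Callable[[str], str]  # instruction: task to perform with the result


def build_request(script: Script, recording: bytes) -> urllib.request.Request:
    """Wrap the recording in an HTTP POST addressed to the URL named in the script."""
    return urllib.request.Request(
        script.app_url,
        data=recording,
        headers={"Content-Type": "audio/wav"},  # assumed media type
        method="POST",
    )


def process_utterance(script: Script, recording: bytes,
                      send: Callable[[urllib.request.Request], str]) -> str:
    """Send the recording, then perform the script's task on the returned result."""
    result = send(build_request(script, recording))
    return script.on_result(result)
```

In a real client, `send` might wrap `urllib.request.urlopen` and decode the response body; keeping it as a parameter separates the claimed steps (send recording, receive result, act on result) from any particular transport.
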
16. A computer-implemented method in which a computer accesses instructions from computer storage to execute a web browser process comprising steps of:
receiving a dialog file at the web browser;
controlling a speech dialog using the received dialog file;
receiving a speech utterance from a user as part of the speech dialog;
encoding the speech utterance to generate an encoded speech utterance;
sending a request from the web browser to a web server according to Hypertext Transfer Protocol, the request containing the encoded speech utterance; and
receiving a response from the web server, the response containing a result based on a processing of the encoded speech utterance including analyzing the encoded speech utterance using a speech recognition application at the web server,
wherein the computer-implemented method further comprises receiving a script at the web browser that includes a universal resource locator associated with the speech recognition application, the script includes an instruction that instructs the web browser to perform a task based on the result received from the web server.
Dependent claims: 17, 18, 19, 20, 21, 22, 23

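The encoding step in claim 16 is not tied to a particular format in the claim text. As one possible reading, the sketch below assumes 16-bit little-endian PCM samples, base64 text encoding, and a form field named `utterance`; all three are assumptions made for illustration, and the decoder is included only to show that the request body round-trips.

```python
import base64
import struct
from urllib.parse import parse_qs, urlencode


def encode_utterance(samples: list[int]) -> str:
    """Pack 16-bit samples little-endian and base64-encode them (assumed format)."""
    pcm = struct.pack(f"<{len(samples)}h", *samples)
    return base64.b64encode(pcm).decode("ascii")


def build_form_body(encoded: str, field: str = "utterance") -> bytes:
    """Place the encoded utterance in a URL-encoded HTTP request body."""
    return urlencode({field: encoded}).encode("ascii")


def decode_form_body(body: bytes, field: str = "utterance") -> list[int]:
    """Server-side inverse: recover the samples from the request body."""
    encoded = parse_qs(body.decode("ascii"))[field][0]
    pcm = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(pcm) // 2}h", pcm))
```
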
24. A computer-implemented method in which a server accesses instructions from computer storage and executes the instructions to perform steps of:
sending a dialog file from the server to a client, the dialog file containing statements for processing by the client to control a speech dialog;
receiving at the server a request from the client in response to the client processing one of the statements, the request containing an encoded speech utterance and being sent from the client to the server according to Hypertext Transfer Protocol;
processing the encoded speech utterance by using the server including analyzing the encoded speech utterance using a speech recognition application; and
sending a response from the server to the client, the response containing a result based on the processing of the encoded speech utterance;
wherein the computer-implemented method further comprises sending a script to the client that includes a universal resource locator associated with the speech recognition application, the script includes an instruction that instructs the client to perform a task based on the result received from the server.
Dependent claims: 25, 26, 27, 28, 29, 30

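The server-side steps of claim 24 can be sketched the same way. The following is a hedged illustration, not the claimed implementation: the JSON response shape, the `on_result` table, and the `goto ...` task strings are hypothetical, and `recognize` stands in for whatever speech recognition application the server actually runs.

```python
import json
from typing import Callable


def make_script(recognizer_url: str) -> dict:
    """Build the script sent to the client: a URL plus instructions keyed on the result."""
    return {
        "url": recognizer_url,
        "on_result": {"yes": "goto confirm", "no": "goto cancel"},  # assumed tasks
    }


def handle_request(body: bytes, recognize: Callable[[bytes], str]) -> bytes:
    """Process the encoded utterance and build the response body for the client."""
    result = recognize(body)
    return json.dumps({"result": result}).encode("utf-8")


def client_task(script: dict, response_body: bytes) -> str:
    """Client side: pick the task the script associates with the received result."""
    result = json.loads(response_body)["result"]
    return script["on_result"].get(result, "goto retry")
```
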
31. A method comprising:
receiving a speech utterance from a user at a speech browser;
encoding the speech utterance to generate an encoded speech utterance at the speech browser;
sending a request from the speech browser through a network to a server in a hyper text communication session, the request containing the encoded speech utterance and an identifier to a speech recognition application at the server used to process the encoded speech utterance by performing speech recognition on the speech utterance and obtaining recognition results based on the speech recognition; and
receiving at the speech browser a response from the server that contains the recognition result based on the processing of the encoded speech utterance at the server.
Dependent claims: 32, 33

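Claim 31 adds an identifier that tells the server which speech recognition application should process the utterance. One way to picture this, assuming the identifier travels as a query parameter named `app` (an assumption, as are the registry and the stub recognizers below), is a lookup table on the server:

```python
from typing import Callable
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical registry of speech recognition applications hosted by the server.
# The lambdas are stubs; real entries would run actual recognizers.
APPLICATIONS: dict[str, Callable[[bytes], str]] = {
    "yesno": lambda audio: "yes" if audio else "no",
    "length": lambda audio: str(len(audio)),
}


def request_url(base: str, app_id: str) -> str:
    """Client side: attach the application identifier to the request URL."""
    return f"{base}?{urlencode({'app': app_id})}"


def dispatch(url: str, encoded_utterance: bytes) -> str:
    """Server side: read the identifier and run the matching application."""
    app_id = parse_qs(urlsplit(url).query)["app"][0]
    return APPLICATIONS[app_id](encoded_utterance)
```
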
34. An apparatus comprising:
means for receiving a speech utterance from a user and converting the speech utterance into a recording at a local computer;
means for communicating between the local computer and a remote computer using a hyper text communication session;
means for sending the recording of the speech utterance from the local computer to the remote computer in the session;
means for receiving at the local computer a result from the remote computer, the result based on a processing of the recording at the remote computer, wherein the processing of the recording at the remote computer includes analyzing the speech utterance using a speech recognition application at the remote computer; and
means for using the local computer to receive a script that includes a universal resource locator of the speech recognition application that is run by the remote computer to process the recording, the script includes an instruction that instructs the local computer to perform a task based on the result received from the remote computer.
Dependent claims: 35, 36, 37, 38, 39

40. Computer-readable media comprising software for causing a computer system to perform functions comprising:
recording a speech utterance from a user using a local computer;
communicating between the local computer and a remote computer using a hyper text communication session, including sending the recording of a speech utterance from the local computer to the remote computer in the session;
receiving a result from the remote computer, the result based on a processing of the recording at the remote computer, wherein the processing includes analyzing the speech utterance in the recording using a speech recognition application at the remote computer; and
using the local computer to receive a script that includes a universal resource locator associated with the speech recognition application that is run by the remote computer to process the recording, the script includes an instruction that instructs the local computer to perform a task based on the result received from the remote computer.

41. Computer-readable media comprising software for causing a computer system to perform functions comprising:
receiving a dialog file at a web browser;
controlling a speech dialog using the received dialog file;
receiving a speech utterance from a user as part of the speech dialog;
encoding the speech utterance to generate an encoded speech utterance;
sending a request from the web browser to a web server, the request containing the encoded speech utterance;
receiving a response from the web server, the response containing a result based on a processing of the encoded speech utterance including analyzing the encoded speech utterance using a speech recognition application at the web server; and
using the web browser to receive a script that includes a universal resource locator associated with the speech recognition application, the script includes an instruction that instructs the web browser to perform a task based on the result received from the web server.

42. Computer-readable media comprising software for causing a computer system to perform functions comprising:
sending a dialog file from a server to a client, the dialog file containing statements for processing by the client to control a speech dialog;
receiving at a server a request from the client in response to the client processing one of the statements, the request containing an encoded speech utterance;
processing the encoded speech utterance by using the server including analyzing the encoded speech utterance using a speech recognition application at the server;
sending a response from the server to the client, the response containing a result based on processing of the encoded speech utterance; and
sending a script from the server to the client, the script includes a universal resource locator associated with the speech recognition application and an instruction that instructs the client to perform a task based on the result received from the server.

43. Computer-readable media comprising software for causing a computer system to perform functions comprising:
receiving a speech utterance from a user at a speech browser;
encoding the speech utterance to generate an encoded speech utterance at the speech browser;
sending a request from the speech browser through a network to a server in a hyper text communication session, the request containing the encoded speech utterance and an identifier to an application at the server used to process the speech utterance by performing speech recognition on the speech utterance and obtaining recognition results based on the speech recognition; and
receiving a response at the speech browser from the server that contains the recognition result based on the processing of the encoded speech utterance.

44. An apparatus comprising:
an input port to receive a speech utterance from a user as part of a speech dialog; and
a web browser to receive a dialog file and control the speech dialog using the received dialog file, the web browser being configured to encode the speech utterance to generate an encoded speech utterance, to send a request containing the encoded speech utterance to a web server, and to receive a response from the web server, where the response contains a speech recognition result based on a speech recognition processing of the encoded speech utterance at the web server;
wherein the web browser receives a script that includes a universal resource locator associated with the web server, the script includes an instruction that instructs the web browser to perform a task based on the speech recognition result received from the web server.

45. A server computer comprising:
a storage to store a dialog file containing statements for processing by a client to control a speech dialog;
an input/output port to send the dialog file to the client and to receive a request using a hyper text communication session from the client in response to the client processing one of the statements, the request containing an encoded speech utterance; and
a speech recognition application to process the encoded speech utterance and to send a response containing a result based on the speech recognition processing of the encoded speech utterance to the client;
wherein the server computer sends a script to the client, the script includes a universal resource locator associated with the server computer and an instruction that instructs the client to perform a task based on the result.

46. A voice-enabled device comprising:
an input/output interface to receive a speech utterance from a user;
a voice-enabled application at a speech browser configured to encode the speech utterance to generate an encoded speech utterance and send a request from the speech browser through a network to a server in a hyper text communication session, the request containing the encoded speech utterance and an identifier to a speech recognition application at the server used to process the speech utterance, the voice-enabled application further configured to receive a response from the server that contains a speech recognition result based on a processing of the encoded speech utterance at the server and to perform a function at the speech browser based on the speech recognition result;
wherein the voice-enabled application receives a script that includes a universal resource locator associated with the speech recognition application, the script includes an instruction that instructs the voice-enabled application to perform a task based on the speech recognition result received from the server.

47. A telephone call center comprising:
a call manager to receive a speech utterance of a user transmitted through a telephone network, the call manager configured to determine a telephone number dialed by the user to connect the user to the telephone call center, the call manager further configured to determine a universal resource locator (URL) based on the telephone number; and
a client computer to run a speech browser application that performs the functions of:
retrieving a script based on the URL provided by the call manager,
encoding the speech utterance into an encoded speech utterance,
sending a request through a network to a server in a hyper text communication session, the request containing the encoded speech utterance and an identifier to a speech recognition application at the server used to process the speech utterance;
receiving a response from the server that contains a recognition result based on a speech processing of the encoded speech utterance; and
using the client computer to receive a script that includes a universal resource locator associated with the speech recognition application, the script includes an instruction that instructs the client computer to perform a task based on the recognition result.

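The call manager in claim 47 determines a URL from the telephone number the user dialed. A minimal sketch of that mapping, assuming a simple in-memory table, is shown below. The telephone numbers, URLs, and function names are hypothetical, and stripping non-digit characters before the lookup is an added convenience rather than something the claim requires.

```python
import re
from typing import Optional

# Hypothetical table: dialed number (digits only) -> URL of the script to retrieve.
URL_BY_DIALED_NUMBER = {
    "18005550100": "http://example.com/dialogs/banking",
    "18005550199": "http://example.com/dialogs/weather",
}


def normalize(number: str) -> str:
    """Strip everything except digits so differently formatted numbers match."""
    return re.sub(r"\D", "", number)


def url_for_call(dialed_number: str) -> Optional[str]:
    """Return the script URL for the dialed number, or None if it is unknown."""
    return URL_BY_DIALED_NUMBER.get(normalize(dialed_number))
```

The speech browser on the client computer would then retrieve the script from the returned URL before starting the dialog.
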
Specification