Method and apparatus for improving the utility of speech recognition
Abstract
A method and apparatus for improving the utility of speech recognition is described. The method involves capturing a spoken word, passing the spoken word to a speech recognition algorithm, receiving at least one text representation of the spoken word from the speech recognition algorithm, and passing a text representation of the spoken word to a display telephone to permit the user to select the correct representation of the voice response. The apparatus is an access server which communicates with a display telephone, a speech recognition algorithm which responds to queries from the access server and one or more databases which likewise respond to queries from the access server. The method and apparatus are particularly useful in automating such functions as telephone directory services using display telephones. The advantage is the ability to completely automate directory services for owners of display telephones and to significantly broaden the applications for speech recognition as a tool in information retrieval and transaction processing.
26 Claims
1. A method of improving the utility of speech recognition of words spoken by a speaker, comprising:
a) capturing in electronic form using a telephone voice terminal connected to a telephone network a word spoken by the speaker, the word being captured at an access server which is accessed by the speaker using a connection over a voice grade telephone line;
b) passing the word to a speech recognition algorithm in the telephone network;
c) receiving from the speech recognition algorithm at least one representation of the word;
d) displaying for the speaker as text the at least one representation of the word to permit the speaker to select a correct representation of the word from among the at least one representation; and
e) repeating the steps of a)-d) in an event that none of the representations of the word are verified as correct, or enabling the speaker to communicate the at least one word to the access server in another way.
Dependent claims: 2, 3, 4, 5, 6
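Steps a) through e) of claim 1 amount to a capture, recognize, display, and confirm loop with a fallback. The following is a minimal sketch of that control flow only, not the patented implementation. The function name `confirm_word`, the four callables, and the `max_attempts` limit are all hypothetical; the claim does not say how many times the steps are repeated before another entry method is offered.

```python
from typing import Callable, List, Optional


def confirm_word(
    capture: Callable[[], bytes],
    recognize: Callable[[bytes], List[str]],
    choose: Callable[[List[str]], Optional[str]],
    fallback: Callable[[], str],
    max_attempts: int = 2,  # assumption: the claim sets no retry limit
) -> str:
    """Return a word the speaker has confirmed as correct."""
    for _ in range(max_attempts):
        audio = capture()              # step a): capture the spoken word
        candidates = recognize(audio)  # steps b) and c): get text candidates
        selected = choose(candidates)  # step d): speaker picks one, or None
        if selected is not None:
            return selected
    # step e), second branch: let the speaker enter the word another way
    return fallback()


# Illustrative stubs standing in for the telephone, recognizer, and display.
attempts = iter([b"audio-1", b"audio-2"])
results = {b"audio-1": ["Smyth", "Smit"], b"audio-2": ["Smith", "Smyth"]}
word = confirm_word(
    capture=lambda: next(attempts),
    recognize=lambda audio: results[audio],
    choose=lambda cands: "Smith" if "Smith" in cands else None,
    fallback=lambda: "SMITH",
)
```

In this run the first attempt offers no acceptable candidate, so the loop repeats and the speaker selects "Smith" on the second attempt. Passing the four roles in as callables keeps the sketch independent of any particular telephone or recognizer interface.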
7. A method of automating telephone directory services for a telephone user having a display telephone, comprising the steps of:
a) prompting the user for names used as indicia to locate an entity in the directory;
b) accepting from the user a spoken name for each index;
c) passing each spoken name to a speech recognition algorithm and accepting from the speech recognition algorithm at least one representation of the spoken name;
d) displaying as text on the display telephone the at least one representation of the spoken name to permit the user to select a correct representation of the spoken name; and
e) assembling a query to the directory after a correct representation of each index has been selected in order to retrieve a record for the entity from the directory.
Dependent claims: 8, 9, 11, 12
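Step e) of claim 7 assembles the directory query only once every index has a confirmed value. The sketch below illustrates that gating under stated assumptions: the index names (`surname`, `city`), the record layout, the sample entries, and the exact-match lookup are all invented for illustration and do not come from the patent.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def assemble_query(
    indices: List[str],
    confirm: Callable[[str], Optional[str]],
) -> Optional[Record]:
    """Build a query only if every index has a confirmed value."""
    query: Record = {}
    for index in indices:
        value = confirm(index)  # steps a) to d) for this index
        if value is None:
            return None         # an index is still unconfirmed: no query yet
        query[index] = value
    return query


def find_records(directory: List[Record], query: Record) -> List[Record]:
    """Return records whose fields match every confirmed index exactly."""
    return [
        rec for rec in directory
        if all(rec.get(key) == value for key, value in query.items())
    ]


# Hypothetical directory contents and confirmed selections.
directory = [
    {"surname": "Smith", "city": "Ottawa", "number": "555-0100"},
    {"surname": "Smith", "city": "Toronto", "number": "555-0101"},
]
confirmed = {"surname": "Smith", "city": "Ottawa"}
query = assemble_query(["surname", "city"], confirmed.get)
matches = find_records(directory, query)
```

Here `confirmed.get` stands in for the prompt, recognize, and select exchange with the user, returning `None` for any index that has no confirmed value.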
10. A method of automating telephone directory services for a telephone user having a display telephone as claimed in claim 8 wherein the other way of entering the index comprises enabling the user to manually spell the spoken name using the dial pad of the display telephone.
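Claim 10 does not state how key presses on the dial pad are turned into letters. One common convention is multi-tap entry on the standard telephone letter layout, where pressing a key repeatedly cycles through its letters. The sketch below assumes that convention, and also assumes that the presses for each letter arrive as space-separated groups; `decode_multitap` is a hypothetical helper, not something named in the patent.

```python
# Standard telephone dial pad letter assignment (keys 2 through 9).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def decode_multitap(presses: str) -> str:
    """Decode space-separated groups of repeated key presses into letters."""
    letters = []
    for group in presses.split():
        key = group[0]
        # Each group must repeat a single key that carries letters.
        if key not in KEYPAD or set(group) != {key}:
            raise ValueError(f"invalid key group: {group!r}")
        options = KEYPAD[key]
        # One press selects the first letter; extra presses cycle onward.
        letters.append(options[(len(group) - 1) % len(options)])
    return "".join(letters)


spelled = decode_multitap("7777 6 444 8 44")
```

With this input, four presses of 7 give S, one press of 6 gives M, three presses of 4 give I, one press of 8 gives T, and two presses of 4 give H.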
13. Apparatus for improving the utility of speech recognition of words spoken by a speaker, comprising a server in a network enabled to receive voice and data signals over a voice grade connection in a switched telephone network, the server being programmed to prompt the speaker for spoken words which are received from the voice grade connection as voice signals and to pass the spoken words to a speech recognition algorithm which returns representations of the spoken words to the server;
the server being further enabled to pass the representations of the spoken words to a voice terminal with a display surface which displays the representations for the speaker to permit the speaker to select a correct representation of the spoken words to thus improve the utility of the speech recognition of the words.
Dependent claims: 14, 15, 16
17. A method of improving the utility of speech recognition of words spoken by a speaker, comprising:
a) capturing an electronic signal, using an Analog Display Services Interface (ADSI) telephone, representative of a word spoken by the speaker;
b) sending the electronic signal through the Public Switched Telephone Network (PSTN) to a speech recognition algorithm;
c) receiving via the PSTN from the speech recognition algorithm at least one representation of the word;
d) displaying on a display surface of the ADSI telephone the at least one representation of the word for the speaker, to permit the speaker to select a correct representation of the word from among the at least one representation; and
e) repeating steps a)-c) in an event that none of the representations of the word are verified as correct, or enabling the speaker to communicate the at least one word using a key pad of the ADSI telephone.
Dependent claims: 18, 19, 20
21. Apparatus for improving the utility of speech recognition of words spoken by a speaker, comprising in combination:
a server in a network adapted to receive voice and data signals over a voice grade connection in a switched telephone network, the server being programmed to prompt the speaker for spoken words which are received via the voice grade connection as voice signals, and to pass the voice signals to a speech recognition algorithm that returns representations of the spoken word to the server;
the server being further adapted to send the representations over the voice grade connection to an Analog Display Services Interface (ADSI) telephone, which displays the representation for the speaker to permit the speaker to select a correct representation of the spoken word to improve the utility of the speech recognition of the spoken words.
Dependent claims: 22, 23
24. A method of automating information retrieval from a database for a telephone user having an Analog Display Services Interface (ADSI) telephone, comprising the steps of:
a) prompting the user for spoken words used as indicia to locate information of interest in the database;
b) accepting at least one of the spoken words from the user;
c) passing an electronic representation of each spoken word to a speech recognition algorithm and accepting from the speech recognition algorithm at least one representation of the spoken word;
d) displaying as text on the ADSI telephone the at least one representation of the spoken word to permit the user to select a correct representation of the spoken word; and
e) assembling a query to the database after a correct representation of each spoken word has been selected by the user, in order to retrieve the information from the database.
Dependent claims: 25, 26
Specification