Method for recognizing alphanumeric strings spoken over a telephone network
Abstract
The present invention describes a method for recognizing alphanumeric strings spoken over a telephone network wherein individual character recognition need not be uniformly high in order to achieve high string recognition accuracy. Preferably, the method uses a processing system having a digital processor, an interface to the telephone network, and a database for storing a predetermined set of reference alphanumeric strings. In operation, the system prompts the caller to speak each character of a string, beginning with a first character and ending with a last character. Each character is then recognized using a speaker-independent voice recognition algorithm. The method calculates recognition distances between each spoken input character and the corresponding letter or digit in the same position within each reference alphanumeric string. After each character is spoken, captured and analyzed, each reference string distance is incremented and the process is continued, accumulating distances for each reference string, until the last character is spoken. The reference string with the lowest cumulative distance is then declared to be the recognized string.
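The accumulation described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the function name `pick_reference`, the reference strings, and every distance value are invented for the example, and a real system would obtain the per-character distances from its speaker-independent recognizer.

```python
from typing import Dict, List


def pick_reference(references: List[str],
                   char_distances: List[Dict[str, float]]) -> str:
    """Return the reference string with the lowest summed distance.

    char_distances[i] maps each candidate character to a hypothetical
    distance between that character and the i-th spoken character
    (lower means acoustically closer).
    """
    totals = {ref: 0.0 for ref in references}
    for position, distances in enumerate(char_distances):
        for ref in references:
            # Compare against the character in the same position.
            totals[ref] += distances[ref[position]]
    return min(totals, key=totals.get)


# Invented toy data: the caller says "B", "4", "T".
references = ["B4T", "D9X", "E4P"]
char_distances = [
    {"B": 0.40, "D": 0.35, "E": 0.50},  # "D" scores slightly better than "B"
    {"4": 0.10, "9": 0.90},
    {"T": 0.30, "X": 0.95, "P": 0.45},
]
best = pick_reference(references, char_distances)
```

In this toy data the first character on its own would be misread as "D", yet the summed distances still favor "B4T", which is the sense in which individual character recognition need not be uniformly high.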
10 Claims
1. A method, using a processing system, for recognizing character strings spoken by a caller over a telephone network, the processing system including a digital processor, means for interfacing to the telephone network and storage means for storing a predetermined set of reference character strings each having at least two characters, comprising the steps of:
(a) initializing a cumulative recognition distance for each of the reference character strings to zero;
(b) prompting the caller to speak a character in a character string to be recognized;
(c) capturing and analyzing the spoken character;
(d) calculating a measure of acoustical dissimilarity between the spoken character and a corresponding character of each of the reference character strings to generate a recognition distance for each of the reference character strings;
(e) incrementing the cumulative recognition distance for each of the reference character strings by the recognition distance generated in step (d);
(f) repeating steps (b)-(e) for each successive character in the character string to be recognized and a corresponding character of each of the reference character strings;
(g) determining which of the reference character strings has a lowest cumulative recognition distance; and
(h) declaring the reference character string with the lowest cumulative recognition distance to be the character string spoken by the caller.
Dependent claims: 2, 3, 4, 5, 6, 9
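Steps (a) through (h) of claim 1 map naturally onto a small stateful object that is updated once per spoken character. The sketch below is one possible reading under stated assumptions: the class `CumulativeMatcher`, its method names, and the `fake_distance` scorer are hypothetical, and the prompting and audio capture of steps (b) and (c) are left outside the class.

```python
from typing import Callable, Dict, List


class CumulativeMatcher:
    """Hypothetical helper that tracks one running distance per reference."""

    def __init__(self, references: List[str]) -> None:
        self.references = references
        # Step (a): every cumulative distance starts at zero.
        self.cumulative: Dict[str, float] = {ref: 0.0 for ref in references}
        self.position = 0

    def add_spoken_character(self, distance_to: Callable[[str], float]) -> None:
        """Steps (d)-(e): score the latest utterance against each reference.

        distance_to(c) is assumed to return the dissimilarity between the
        captured utterance and the candidate character c.
        """
        for ref in self.references:
            self.cumulative[ref] += distance_to(ref[self.position])
        self.position += 1

    def result(self) -> str:
        """Steps (g)-(h): declare the reference with the lowest total."""
        return min(self.cumulative, key=self.cumulative.get)


def fake_distance(spoken: str) -> Callable[[str], float]:
    # Stand-in scorer (an assumption): 0.0 for a match, 1.0 otherwise.
    return lambda candidate: 0.0 if candidate == spoken else 1.0


matcher = CumulativeMatcher(["A1", "A2", "B2"])
for spoken in "A2":  # step (f): one pass per character of the string
    matcher.add_spoken_character(fake_distance(spoken))
declared = matcher.result()
```

Keeping the running totals in the object means nothing has to be decided until the last character has been scored, which matches the order of steps (f) and (g).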
7. A method, using a processing system, for recognizing character strings spoken by a caller over a telephone network, the processing system including a digital processor, means for interfacing to the telephone network and storage means for storing a predetermined set of reference character strings each having at least two characters, comprising the steps of:
(a) initializing a combined recognition value for each of the reference character strings to zero;
(b) prompting the caller to speak a character in a character string to be recognized;
(c) capturing and analyzing the spoken character;
(d) calculating a measure of acoustical similarity between the spoken character and a corresponding character of each of the reference character strings to generate a recognition value for each of the reference character strings;
(e) incrementing the combined recognition value for each of the reference character strings by the recognition value generated in step (d);
(f) repeating steps (b)-(e) for each successive character in the character string to be recognized and a corresponding character of each of the reference character strings;
(g) determining which of the reference character strings has a highest combined recognition value; and
(h) declaring the reference character string with the highest combined recognition value to be the character string spoken by the caller.
Dependent claims: 10
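Claim 7 mirrors claim 1 but accumulates a similarity measure and selects the highest total rather than the lowest. A minimal sketch of that variant follows; the function name `pick_by_similarity` and all similarity values are invented for illustration and do not come from the patent.

```python
from typing import Dict, List


def pick_by_similarity(references: List[str],
                       char_similarities: List[Dict[str, float]]) -> str:
    """Return the reference string with the highest summed similarity.

    char_similarities[i] maps each candidate character to a hypothetical
    similarity with the i-th spoken character (higher means closer).
    """
    combined = {ref: 0.0 for ref in references}  # step (a)
    for position, similarities in enumerate(char_similarities):
        for ref in references:
            combined[ref] += similarities[ref[position]]  # steps (d)-(e)
    return max(combined, key=combined.get)  # steps (g)-(h)


# Invented toy data for a two-character string.
references = ["7K", "7J", "1A"]
char_similarities = [
    {"7": 0.9, "1": 0.2},
    {"K": 0.55, "J": 0.6, "A": 0.5},
]
winner = pick_by_similarity(references, char_similarities)
```

If each similarity were defined as one minus a distance, then for reference strings of equal length the highest summed similarity and the lowest summed distance would pick the same string; that equivalence is an observation about the arithmetic, not something the claim itself states.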
8. A method, using a processing system, for recognizing alphanumeric strings spoken by a caller over a telephone network, the processing system including a digital processor, means for interfacing to the telephone network and storage means for storing a predetermined set of reference alphanumeric strings each having at least two characters, comprising the steps of:
(a) initializing a cumulative recognition distance for each of the reference alphanumeric strings to zero;
(b) prompting the caller to speak a first alphanumeric character in an alphanumeric string to be recognized;
(c) capturing and analyzing the spoken first alphanumeric character;
(d) calculating a measure of acoustical dissimilarity between the spoken first alphanumeric character and a first alphanumeric character of each of the reference alphanumeric strings to generate a recognition distance for each of the reference alphanumeric strings;
(e) incrementing the cumulative recognition distance for each of the reference alphanumeric strings by the recognition distance generated in step (d);
(f) prompting the caller to speak a second alphanumeric character in the alphanumeric string to be recognized;
(g) capturing and analyzing the spoken second alphanumeric character;
(h) calculating a measure of acoustical dissimilarity between the spoken second alphanumeric character and a second alphanumeric character of each of the reference alphanumeric strings to generate a recognition distance for each of the reference alphanumeric strings;
(i) incrementing the cumulative recognition distance for each of the reference alphanumeric strings by the recognition distance generated in step (h);
(j) determining which of the reference alphanumeric strings has a lowest cumulative recognition distance; and
(k) declaring the reference alphanumeric string with the lowest cumulative recognition distance to be the alphanumeric string spoken by the caller.
Specification