Method for continuous recognition of alphanumeric strings spoken over a telephone network
Abstract
The present invention describes a method for recognizing alphanumeric strings spoken over a telephone network wherein individual character recognition need not be uniformly high in order to achieve high string recognition accuracy. Preferably, the method uses a processing system having a digital processor, an interface to the telephone network, and a database for storing a predetermined set of reference alphanumeric strings. In operation, the system prompts the caller to speak the characters of a string, and characters are recognized using a speaker-independent voice recognition algorithm. The method calculates recognition distances between each spoken input character and the corresponding letter or digit in the same position within each reference alphanumeric string. After each character is spoken, captured and analyzed, each reference string distance is incremented and the process is continued, accumulating distances for each reference string, until the last character is spoken. The reference string with the lowest cumulative distance is then declared to be the recognized string.
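The accumulation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `recognize_string`, the reference strings, and the per-character distance values are all hypothetical, and the distances are assumed to come from some speaker-independent recognizer that is not shown here.

```python
def recognize_string(references, position_distances):
    """Return (best_reference, cumulative_distances).

    references: equal-length reference strings (hypothetical sample data).
    position_distances: one dict per spoken position, mapping a candidate
        character to its recognition distance (assumed to be supplied by
        an external recognizer; lower means a closer acoustic match).
    """
    # Every reference string starts with a cumulative distance of zero.
    cumulative = {ref: 0.0 for ref in references}

    # After each spoken character, add the distance for the character that
    # each reference string holds in the same position.
    for position, distances in enumerate(position_distances):
        for ref in references:
            cumulative[ref] += distances[ref[position]]

    # The reference string with the lowest total is declared recognized.
    best = min(cumulative, key=cumulative.get)
    return best, cumulative


references = ["B7D", "P4T", "E7G"]
position_distances = [
    {"B": 0.40, "P": 0.35, "E": 0.50},  # first character: "P" scores best
    {"7": 0.10, "4": 0.80},
    {"D": 0.30, "T": 0.45, "G": 0.40},
]

best, cumulative = recognize_string(references, position_distances)

# Picking the closest character at each position independently.
per_character_guess = "".join(min(d, key=d.get) for d in position_distances)
```

In this made-up data the first character is individually misrecognized (the per-character guess is "P7D", which is not a reference string), yet the cumulative distance still selects "B7D". That is the sense in which individual character recognition need not be uniformly high.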
8 Claims
1. A method, using a processing system, for recognizing character strings spoken by a caller over a telephone network, the processing system including a digital processor, means for interfacing to the telephone network and storage means for storing a predetermined set of reference character strings each having at least two characters, comprising the steps of:
(a) initializing a cumulative recognition distance for each of the reference character strings to zero;
(b) prompting the caller to speak characters in a character string to be recognized, the character string to be recognized having at least first and second characters;
(c) analyzing the character string spoken by the caller to locate a boundary between the first and second characters of the spoken character string;
(d) calculating a measure of acoustical dissimilarity between the spoken first character and the first character of each of the reference character strings to generate a recognition distance for each of the reference character strings;
(e) incrementing the cumulative recognition distance for each of the reference character strings by the recognition distance generated in step (d);
(f) calculating a measure of acoustical dissimilarity between the spoken second character and the second character of each of the reference character strings to generate a recognition distance for each of the reference character strings;
(g) incrementing the cumulative recognition distance for each of the reference character strings by the recognition distance generated in step (f);
(h) determining which of the reference character strings has a lowest cumulative recognition distance; and
(i) declaring the reference character string with the lowest cumulative recognition distance to be the character string spoken by the caller.
Dependent claims: 2, 3, 4
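Steps (a) and (d) through (i) of claim 1 can be sketched as follows. The claim does not say how acoustical dissimilarity is measured, so this example assumes, purely for illustration, that each spoken character has already been reduced to a small feature vector and that Euclidean distance to a stored per-character template serves as the dissimilarity measure. The names `recognize`, `templates`, and `spoken_segments`, and all numeric values, are hypothetical. Prompting (step b) and boundary location (step c) are not modeled; the segments are taken as given.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors.

    An assumed stand-in for the claim's unspecified dissimilarity measure.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(references, templates, spoken_segments, dissimilarity=euclidean):
    # Step (a): initialize each cumulative recognition distance to zero.
    cumulative = {ref: 0.0 for ref in references}

    # Steps (d)-(g): for each spoken character, compare it with the
    # character in the same position of every reference string and add
    # the resulting recognition distance to that string's running total.
    for position, segment in enumerate(spoken_segments):
        for ref in references:
            distance = dissimilarity(segment, templates[ref[position]])
            cumulative[ref] += distance

    # Steps (h)-(i): the lowest cumulative distance identifies the string.
    return min(cumulative, key=cumulative.get)


# Hypothetical two-dimensional templates, one per character.
templates = {
    "A": (1.0, 0.0),
    "J": (0.8, 0.3),
    "1": (0.0, 1.0),
    "9": (0.2, 0.9),
}
references = ["A1", "J9", "A9"]

# Stand-ins for the two character segments that step (c) would produce.
spoken_segments = [(0.9, 0.1), (0.1, 1.0)]

result = recognize(references, templates, spoken_segments)
```

Passing the dissimilarity function as a parameter keeps the accumulation logic independent of whichever acoustic measure a real system would use.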
5. A method, using a processing system, for recognizing character strings spoken by a caller over a telephone network, the processing system including a digital processor, means for interfacing to the telephone network and storage means for storing a predetermined set of reference character strings each having at least two characters, comprising the steps of:
(a) initializing a combined recognition value for each of the reference character strings to zero;
(b) prompting the caller to speak characters in a character string to be recognized, the character string to be recognized having at least first and second characters;
(c) analyzing the character string spoken by the caller to locate a boundary between the first and second characters of the spoken character string;
(d) calculating a measure of acoustical similarity between the spoken first character and the first character of each of the reference character strings to generate a recognition value for each of the reference character strings;
(e) incrementing the combined recognition value for each of the reference character strings by the recognition value generated in step (d);
(f) calculating a measure of acoustical similarity between the spoken second character and the second character of each of the reference character strings to generate a recognition value for each of the reference character strings;
(g) incrementing the combined recognition value for each of the reference character strings by the recognition value generated in step (f);
(h) determining which of the reference character strings has a highest combined recognition value; and
(i) declaring the reference character string with the highest combined recognition value to be the character string spoken by the caller.
Dependent claims: 6, 7, 8
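Claim 5 mirrors claim 1 but accumulates a similarity measure and selects the highest total instead of the lowest. The sketch below shows that variant as a small stateful object that is updated once per spoken character. The class name `CombinedValueRecognizer`, its methods, and the similarity scores are hypothetical; the scores are assumed to come from an external recognizer, with higher values meaning a closer acoustic match.

```python
class CombinedValueRecognizer:
    """Accumulates per-character similarity scores for each reference string."""

    def __init__(self, references):
        self.references = list(references)
        # Step (a): every combined recognition value starts at zero.
        self.combined = {ref: 0.0 for ref in self.references}
        self.position = 0  # index of the next character to be scored

    def add_character(self, similarities):
        """Steps (d)-(g): add the score of the character that each
        reference string holds at the current position."""
        for ref in self.references:
            self.combined[ref] += similarities[ref[self.position]]
        self.position += 1

    def best(self):
        """Steps (h)-(i): the highest combined value identifies the string."""
        return max(self.combined, key=self.combined.get)


recognizer = CombinedValueRecognizer(["M3", "N3", "M8"])
recognizer.add_character({"M": 0.55, "N": 0.60})  # "N" looks slightly better
recognizer.add_character({"3": 0.40, "8": 0.90})
winner = recognizer.best()
```

With these made-up scores the individually best characters spell "N8", which is not a reference string, while the combined values select "M8". Only the sense of the comparison differs from the distance form: `max` here, `min` there.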
Specification