Method and system for dynamic speech recognition using free-phone scoring
Abstract
A system performing a speech recognition process requests and receives a claim of identity. The system accesses a customer database and generates a subword spelling of a stored text string where the text string includes predetermined information. The system accesses a subword model database to construct a subword model comprising a series of phonemes. The system requests and receives a speech utterance. A free-phone model of the speech utterance is generated independently of the information stored in the customer database. The system generates a free-phone score. The system generates a word score. The system determines whether the speech utterance matches the stored text string based on the free-phone score and the word score.
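To make the two scores in the abstract concrete, the following is a minimal sketch and not the patented implementation. It assumes each audio frame has already been reduced to a table of per-phoneme log-likelihoods; the function names, the `Frame` type, and the absence of transition penalties are all assumptions made for illustration. The free-phone score lets any phoneme explain any frame and uses no customer data, while the word score forces the frames to follow the subword spelling in order.

```python
from typing import Dict, List

# Hypothetical representation: one dict per audio frame, mapping a phoneme
# label to the log-likelihood of that frame under the phoneme's model.
Frame = Dict[str, float]

NEG_INF = float("-inf")


def free_phone_score(frames: List[Frame]) -> float:
    """Best unconstrained score: any phoneme may explain any frame.

    Assumes a phoneme loop with no transition penalties, so the best path
    simply takes the most likely phoneme in every frame. Only the utterance
    is consulted, never the stored unit of speech.
    """
    return sum(max(frame.values()) for frame in frames)


def word_score(frames: List[Frame], spelling: List[str]) -> float:
    """Best score when frames must follow the subword spelling in order.

    Each phoneme of the spelling covers one or more consecutive frames
    (a simple left-to-right alignment found by dynamic programming).
    """
    n, m = len(frames), len(spelling)
    if m == 0 or n < m:
        return NEG_INF
    # best[j]: best score so far with the current frame assigned to spelling[j]
    best = [NEG_INF] * m
    best[0] = frames[0].get(spelling[0], NEG_INF)
    for t in range(1, n):
        new = [NEG_INF] * m
        for j in range(m):
            stay = best[j]                                # remain in phoneme j
            advance = best[j - 1] if j > 0 else NEG_INF   # move on from j-1
            prev = max(stay, advance)
            if prev > NEG_INF:
                new[j] = prev + frames[t].get(spelling[j], NEG_INF)
        best = new
    return best[m - 1]


# Toy utterance of three frames (values are made up for illustration).
frames = [
    {"k": -1.0, "ae": -5.0, "t": -6.0},
    {"k": -4.0, "ae": -1.5, "t": -5.0},
    {"k": -6.0, "ae": -4.0, "t": -0.5},
]
matching = word_score(frames, ["k", "ae", "t"])
mismatched = word_score(frames, ["t", "ae", "k"])
unconstrained = free_phone_score(frames)
```

Under these assumptions the word score can never exceed the free-phone score, and the gap between them grows as the utterance departs from the stored spelling, which is the property the comparison in the claims relies on.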
10 Claims
1. A method for recognizing a speech utterance as a predetermined unit of speech, the method comprising:
requesting and receiving a claim of identity;
accessing a customer database in response to receiving said claim of identity;
wherein a predetermined unit of speech is stored in said customer database;
generating a subword spelling of said predetermined unit of speech;
accessing a subword model database that stores a plurality of phonemes, each phoneme representing a sound, to construct a subword model comprising a series of phonemes;
requesting and receiving a speech utterance;
generating a free-phone model of the speech utterance without accessing said customer database;
calculating a free-phone score, said free-phone score representing a likelihood that said free-phone model accurately represents the speech utterance;
calculating a word score, said word score representing a likelihood said subword spelling accurately represents said speech utterance;
determining, based upon said free-phone score and said word score, whether the speech utterance matches the predetermined unit of speech by:
calculating a confidence score based upon said word score and said free-phone score by (i) comparing said word score to said free-phone score, and (ii) if said word score is not better than said free-phone score, comparing said word score to a fraction of said free-phone score.
comparing said confidence score to a threshold confidence score.
3. The method of claim 2 wherein said step of calculating a confidence score includes calculating a weighted average of said word score and said free-phone score.
4. The method of claim 2 wherein said step of determining includes determining a difference between said word score and said free-phone score.
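The dependent claims above name three ingredients: a weighted average of the two scores, a difference between them, and a comparison against a threshold confidence score. The sketch below shows one way these could be written. It is only an illustration: the function names, the default weight of 0.5, and the convention that a larger confidence is better are assumptions, and the claims do not fix any particular weight or threshold.

```python
def weighted_average_confidence(
    word_score: float, free_phone_score: float, word_weight: float = 0.5
) -> float:
    """Confidence as a weighted average of the two scores.

    word_weight is a hypothetical tuning parameter in [0, 1].
    """
    if not 0.0 <= word_weight <= 1.0:
        raise ValueError("word_weight must lie in [0, 1]")
    return word_weight * word_score + (1.0 - word_weight) * free_phone_score


def score_difference(word_score: float, free_phone_score: float) -> float:
    """Difference between the word score and the free-phone score.

    With log-likelihood scores this is a log-likelihood ratio: values near
    zero mean the constrained spelling explains the audio almost as well as
    the unconstrained phoneme sequence.
    """
    return word_score - free_phone_score


def passes_threshold(confidence: float, threshold: float) -> bool:
    """Compare a confidence score to a threshold (assumes higher is better)."""
    return confidence >= threshold


# Illustrative values only.
good = score_difference(word_score=-3.0, free_phone_score=-3.0)
bad = score_difference(word_score=-13.5, free_phone_score=-3.0)
accepted = passes_threshold(good, threshold=-2.0)
rejected = passes_threshold(bad, threshold=-2.0)
```

In practice the weight and the threshold would be tuned on held-out utterances; the values here are placeholders.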
5. An article of manufacture comprising:
a computer readable medium having computer readable program code means embodied thereon, said computer readable program code means comprising means for generating a free-phone score based upon an utterance, and means for determining, based upon said free-phone score, whether said utterance matches a unit of speech;
wherein said computer readable program code means further includes means for generating a word score based upon said unit of speech and wherein said means for determining determines whether said utterance matches said unit of speech based upon said free-phone score and said word score.
6. A device comprising:
a memory device having a representation of a unit of speech stored therein;
a processor coupled to said memory device, said processor configured to generate a free-phone score based upon a speech utterance and further configured to determine whether said speech utterance is said unit of speech based upon said free-phone score;
an input device for receiving said speech utterance, said input device coupled to said processor;
wherein said processor further generates a word score based upon said unit of speech and determines whether said speech utterance is said unit of speech based upon said free-phone score and said word score;
wherein said processor is configured to compare said word score to said free-phone score, and determine whether said word score is better than said free-phone score;
wherein said processor is further configured to compare said word score to a fraction of said free-phone score.
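The two comparisons recited in claim 6 (and in the final steps of claims 1 and 10) can be sketched as follows. This is a reading under stated assumptions, not the claimed implementation: scores are taken to be positive values where higher is better, the function name `accept_utterance` and the fraction 0.8 are hypothetical, and treating a successful comparison as an acceptance is an interpretation, since the claims do not state what follows from each comparison.

```python
def accept_utterance(
    word_score: float, free_phone_score: float, fraction: float = 0.8
) -> bool:
    """Two-stage comparison of a word score against a free-phone score.

    Assumes positive, higher-is-better scores. `fraction` is a hypothetical
    tolerance in (0, 1]: how far below the free-phone score the word score
    may fall and still count as a match.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    # (i) Compare the word score directly to the free-phone score.
    if word_score > free_phone_score:
        return True
    # (ii) Word score is not better: compare it to a fraction of the
    # free-phone score instead.
    return word_score >= fraction * free_phone_score


clearly_better = accept_utterance(0.9, 0.8)   # passes stage (i)
close_enough = accept_utterance(0.7, 0.8)     # passes stage (ii)
too_low = accept_utterance(0.5, 0.8)          # fails both stages
```

If the scores were negative log-likelihoods instead, multiplying by a fraction would move the bar in the opposite direction, so the second comparison would need to be restated for that scale.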
10. A speech recognition system for recognizing a speech utterance as a predetermined unit of speech, the system comprising:
means for requesting and receiving a claim of identity;
means for accessing a customer database in response to receiving said claim of identity;
wherein a predetermined unit of speech is stored in said customer database;
means for generating a subword spelling of said predetermined unit of speech;
means for accessing a subword model database that stores a plurality of phonemes, each phoneme representing a sound, to construct a subword model comprising a series of phonemes;
means for requesting and receiving a speech utterance;
means for generating a free-phone model of said speech utterance without accessing said customer database;
means for calculating a free-phone score, said free-phone score representing a likelihood that said free-phone model accurately represents said speech utterance;
means for calculating a word score, said word score representing a likelihood said subword spelling accurately represents said speech utterance;
means for determining, based upon said free-phone score and said word score, whether said speech utterance matches said predetermined unit of speech by:
calculating a confidence score based upon said word score and said free-phone score by (i) comparing said word score to said free-phone score, and (ii) if said word score is not better than said free-phone score, comparing said word score to a fraction of said free-phone score.
Specification