Method, system, and apparatus for speech recognition
Abstract
The present invention can be used to improve speech recognition accuracy, particularly for characters, words, and the like that can have more than one reading. A speaker tends to keep the same reading throughout a conversation. For example, a person who pronounces “7” as “shichi” tends to pronounce it “shichi” consistently for the rest of the conversation. Exploiting this tendency, recognition from the second utterance onward is performed after lowering the recognition probability of any reading that the person did not use in the first response of the conversation. Where the system repeats a recognition result back by speech synthesis, it uses the reading already recognized from the speaker. For example, when the speaker pronounces “7” as “shichi”, the system also says “shichi” when repeating it.
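The adaptation described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the lexicon contents, the `penalty` factor of 0.1, the renormalization step, and the helper names `lower_unused_readings` and `reading_for_echo` are all hypothetical.

```python
from typing import Dict

# Hypothetical lexicon: word -> {reading: prediction probability}.
LEXICON: Dict[str, Dict[str, float]] = {
    "7": {"shichi": 0.5, "nana": 0.5},
    "4": {"shi": 0.5, "yon": 0.5},
}


def lower_unused_readings(
    readings: Dict[str, float], used: str, penalty: float = 0.1
) -> Dict[str, float]:
    """Return a copy in which every reading other than `used` is scaled
    by `penalty`, then renormalized so the probabilities sum to 1.
    (Both the scaling rule and the renormalization are assumptions.)"""
    scaled = {r: (p if r == used else p * penalty) for r, p in readings.items()}
    total = sum(scaled.values())
    return {r: p / total for r, p in scaled.items()}


def reading_for_echo(readings: Dict[str, float]) -> str:
    """Pick the reading the system would use when repeating the word back:
    here, simply the most probable one after adaptation."""
    return max(readings, key=readings.get)


# Per-conversation copy, so the base lexicon is untouched for other speakers.
session = {word: dict(readings) for word, readings in LEXICON.items()}

# The speaker's first response used "shichi" for "7".
session["7"] = lower_unused_readings(session["7"], used="shichi")
```

After the update, `session["7"]` favors "shichi" over "nana", and `reading_for_echo(session["7"])` selects "shichi" for the spoken confirmation. Keeping the adjusted probabilities in a per-conversation copy is one way to limit the effect to a single dialog.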
36 Claims
1. A speech recognition system comprising:
correspondence information, said correspondence information storing a correspondence between recognized words and a plurality of speech element arrays expressing pronunciation of said recognized words;
said speech recognition system recognizing a recognizable word from a received user spoken utterance by comparing a speech element array generated from said user spoken utterance with said plurality of speech element arrays in said correspondence information;
wherein, in a dialog of a single person occurring within a certain period of time, said generated speech element array corresponds to one of said plurality of speech element arrays, a pronunciation prediction probability corresponding to said one of said plurality of speech element arrays is lowered, said pronunciation prediction probability being different from said generated speech element array.
15. A speech recognition method for use within a dialog of a single person, said dialog occurring in a certain period of time, said method comprising:
receiving a first user spoken utterance and generating a first speech element array from said first user spoken utterance;
searching correspondence information, said correspondence information associating recognizable words with a plurality of speech element arrays expressing pronunciation of said recognizable words;
generating a first recognized word by comparing said first speech element array and said plurality of speech element arrays in said correspondence information;
lowering a pronunciation prediction probability of one of said plurality of speech element arrays which differs from said first speech element array, wherein said one of said plurality of speech element arrays is made to correspond to said first speech element array;
receiving a second user spoken utterance and generating a second speech element array from said second user spoken utterance;
searching said correspondence information comprising said lowered pronunciation prediction probability; and
generating a second recognized word by comparing said second speech element array and said plurality of speech element arrays in said correspondence information.
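The sequence of steps recited in claim 15 can be illustrated with a toy recognizer. The sketch below is an assumption-laden example rather than the claimed method itself: the phoneme arrays, the unnormalized starting weight of 1.0 per reading, the lowering `factor` of 0.5, and the use of `difflib.SequenceMatcher` as the comparison measure are all choices made for illustration only.

```python
from difflib import SequenceMatcher


def make_lexicon():
    """Hypothetical correspondence information:
    word -> {phoneme array: prediction weight (unnormalized)}."""
    return {
        "7": {("sh", "i", "ch", "i"): 1.0, ("n", "a", "n", "a"): 1.0},
        "1": {("i", "ch", "i"): 1.0},
    }


def recognize(lexicon, observed):
    """Return the (word, reading) pair that maximizes weight * similarity."""
    candidates = (
        (word, reading, weight * SequenceMatcher(None, observed, reading).ratio())
        for word, readings in lexicon.items()
        for reading, weight in readings.items()
    )
    word, reading, _score = max(candidates, key=lambda item: item[2])
    return word, reading


def lower_other_readings(lexicon, word, used_reading, factor=0.5):
    """Lower the weight of every reading of `word` except the one just used."""
    for reading in lexicon[word]:
        if reading != used_reading:
            lexicon[word][reading] *= factor


lexicon = make_lexicon()

# First utterance: the speaker says "nana", recognized as the word "7".
first_word, first_reading = recognize(lexicon, ("n", "a", "n", "a"))
lower_other_readings(lexicon, first_word, first_reading)

# Second utterance: a noisy array lying between "shichi" and "ichi".
noisy = ("sh", "i", "ch")
second_word, _ = recognize(lexicon, noisy)

# For comparison: the same noisy array against an unadapted lexicon.
baseline_word, _ = recognize(make_lexicon(), noisy)
```

With these toy values, the unadapted lexicon resolves the noisy second utterance to "7" (via "shichi"), while the adapted lexicon resolves it to "1" ("ichi"), because the speaker has already shown a preference for "nana" when saying "7".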
29. A machine readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
receiving a first user spoken utterance and generating a first speech element array from said first user spoken utterance;
searching correspondence information, said correspondence information comprising a correspondence between recognizable words and a plurality of speech element arrays expressing pronunciation of said recognizable words;
generating a recognized word by comparing said first speech element array and said plurality of speech element arrays in said correspondence information; and
lowering a pronunciation prediction probability of one of said plurality of speech element arrays which differs from said first speech element array, wherein said one of said plurality of speech element arrays is made to correspond to said first speech element array.
30. A machine readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
receiving a first user spoken utterance and generating a first speech element array from said first user spoken utterance;
searching correspondence information, said correspondence information associating recognizable words with a plurality of speech element arrays expressing pronunciation of said recognizable words;
generating a first recognized word by comparing said first speech element array and said plurality of speech element arrays in said correspondence information;
lowering a pronunciation prediction probability of one of said plurality of speech element arrays which differs from said first speech element array, wherein said one of said plurality of speech element arrays is made to correspond to said first speech element array;
receiving a second user spoken utterance and generating a second speech element array from said second user spoken utterance;
searching said correspondence information comprising said lowered pronunciation prediction probability; and
generating a second recognized word by comparing said second speech element array and said plurality of speech element arrays in said correspondence information.
Specification