Speaker independent speech recognition process
First Claim
1. A speaker independent speech recognition method comprising:
analyzing an input analog speech signal;
dividing the analyzed speech signal into phonetic units;
comparing said phonetic units of the analyzed speech signal with a plurality of reference templates as stored in a phoneme dictionary, wherein each reference template is representative of at least a portion of a phoneme and is prepared in a training phase by dividing an acoustical space representing phonetic units spoken during training into domains, each of the domains of the acoustical space representing a plurality of phonetic units;
providing phonetic distribution tables associated with each of said reference templates stored in said phoneme dictionary as frequency tables, the probability of a particular phonetic unit being included in a domain being defined according to said frequency tables;
comparing a sequence of phonetic units of the analyzed speech signal with a plurality of words stored in a word lexicon in a phonetic form in accordance with said frequency tables; and
recognizing a particular word of the speech to be recognized as corresponding to a word stored in said word lexicon and having the maximum probability of its constituent phonetic units according to said frequency tables.
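The frequency tables recited in the claim can be illustrated with a minimal sketch. It assumes that each training observation has already been assigned to a domain of the acoustical space and carries a phonetic label, and that the probability of a phonetic unit within a domain is estimated as a simple relative frequency. The function name `build_frequency_tables` and the toy data are hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict


def build_frequency_tables(training_pairs):
    """Build one phonetic distribution table per domain.

    training_pairs: iterable of (domain_index, phonetic_unit) pairs observed
    during training. Assumption: relative frequency is used as the
    probability estimate; the patent text does not spell out the estimator.
    """
    counts = defaultdict(Counter)
    for domain, unit in training_pairs:
        counts[domain][unit] += 1

    tables = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        # Normalize the raw counts so each table sums to 1.
        tables[domain] = {unit: n / total for unit, n in counter.items()}
    return tables


# Toy training data: domain 0 mostly holds "a", domain 1 mostly holds "s".
pairs = [(0, "a"), (0, "a"), (0, "e"), (1, "s"), (1, "s"), (1, "s"), (1, "z")]
tables = build_frequency_tables(pairs)
```

In this sketch each domain maps to several phonetic units, which mirrors the claim language that each domain represents a plurality of phonetic units.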
Abstract
According to this process, a speech signal is analyzed in a vector quantizer (1), in which acoustic parameters are calculated for each time interval of predetermined duration and compared, by means of a distance calculation, with each spectral reference template contained in a reference template dictionary (2). The sequence obtained at the output of the vector quantizer (1) is then compared with each of the words stored in phonetic form in a word lexicon (5), using the phonetic distribution tables (3) associated with each template. The word to be recognized is identified as the word stored in the lexicon whose constituent phonetic units have the maximum probability according to the phonetic distribution tables.
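The recognition flow summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the distance is taken to be squared Euclidean, each frame is assumed to correspond to exactly one phonetic unit of the word (no time alignment), and unseen units receive a small floor probability. All function names, the `floor` parameter, and the toy data are hypothetical.

```python
import math


def nearest_template(frame, templates):
    """Vector quantization step: index of the template closest to the frame.

    Assumption: squared Euclidean distance; the abstract only says
    "a distance calculation".
    """
    return min(
        range(len(templates)),
        key=lambda i: sum((f - t) ** 2 for f, t in zip(frame, templates[i])),
    )


def word_log_score(template_seq, units, tables, floor=1e-6):
    """Log-probability of a word's phonetic units given the template sequence.

    Assumption: one frame per phonetic unit, so lengths must match.
    """
    if len(template_seq) != len(units):
        return float("-inf")
    return sum(
        math.log(tables[k].get(u, floor)) for k, u in zip(template_seq, units)
    )


def recognize(frames, templates, tables, lexicon):
    """Return the lexicon word with the maximum probability."""
    seq = [nearest_template(f, templates) for f in frames]
    return max(lexicon, key=lambda w: word_log_score(seq, lexicon[w], tables))


# Toy setup: two templates, one phonetic distribution table per template.
templates = [(0.0, 0.0), (1.0, 1.0)]
tables = {0: {"a": 0.7, "e": 0.3}, 1: {"s": 0.8, "z": 0.2}}
lexicon = {"as": ["a", "s"], "ez": ["e", "z"], "sa": ["s", "a"]}
frames = [(0.1, -0.1), (0.9, 1.2)]

best = recognize(frames, templates, tables, lexicon)
```

Summing log-probabilities rather than multiplying raw probabilities is a design choice made here to avoid numerical underflow on longer words; it selects the same word as the maximum of the product.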
57 Citations
10 Claims
1. A speaker independent speech recognition method, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
Specification