Speech recognition system
First Claim
1. A method of recognizing speech wherein a reference feature vector system is partitioned in memory into a reference first portion of feature vectors, which has a constant time duration independent of a speaker, and a reference second portion of feature vectors, which has a time duration dependent on a speaker, and said reference feature vector system is compared to unknown speech having an unknown first portion of feature vectors and an unknown second portion of feature vectors, comprising the steps of:
(a) locating a first portion of feature vectors in said reference feature vector system,
(b) locating unwarped candidate first portions in said unknown speech by shifting said reference first portion through said unknown speech and comparing said reference first portion with said unknown speech,
(c) matching said reference first portion with one of said candidate first portions in said unknown speech; and
(d) matching said reference second portion with said unknown second portion by linearly designating each feature vector of said unknown second portion to a feature vector in said reference second portion.
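Steps (b) and (c) can be illustrated with a minimal sketch, offered under stated assumptions rather than as the patented implementation. The claim does not specify a distance measure or a selection rule, so the Euclidean frame distance, the mean-distance score, and the name `locate_first_portion` are all hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance between two equal-length sequences


def locate_first_portion(reference_first, unknown):
    """Slide the reference first portion through the unknown speech
    without time warping and return (best_offset, best_score).

    Each argument is a list of feature vectors (tuples of floats).
    Assumption: the score is the mean Euclidean frame distance; the
    claim itself does not name a distance measure.
    """
    n = len(reference_first)
    if n == 0 or n > len(unknown):
        raise ValueError("reference first portion must be non-empty and fit in the unknown speech")

    best_offset, best_score = None, float("inf")
    # Step (b): every offset yields one unwarped candidate first portion.
    for offset in range(len(unknown) - n + 1):
        candidate = unknown[offset:offset + n]
        score = sum(dist(r, u) for r, u in zip(reference_first, candidate)) / n
        # Step (c): keep the candidate that matches the reference best.
        if score < best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


if __name__ == "__main__":
    unknown = [(0.0, 0.0), (5.0, 5.0), (1.0, 2.0), (3.0, 4.0), (9.0, 9.0)]
    reference_first = [(1.0, 2.0), (3.0, 4.0)]
    print(locate_first_portion(reference_first, unknown))
```

Because the reference first portion keeps its original length at every offset, the search costs one fixed-length comparison per shift, which is the simplification the claim relies on.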
Abstract
Speech recognition with time warping is simplified by identifying the portion of a word whose time duration is the same for all speakers. When an unknown utterance is compared with a reference utterance, the time duration of the unknown utterance is aligned with that of the reference utterance in two steps. According to the invention, the element vectors of a speech signal are classified into a first portion and a second portion. The first portion consists of a consonant and the co-articulation that couples two sounds; the second portion is a vowel. The length of the first portion is almost independent of the speaker, whereas the length of the second portion depends on the speaker. Therefore, the invention first matches the first portion of the unknown speech with that of the reference speech directly, without changing its time length. Next, the sample elements in the second portion of the unknown speech are linearly matched with those of the reference speech. Thus, excellent recognition is obtained using a simple calculation.
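The linear matching of the second (vowel) portion can be sketched as follows. This is only an illustration under assumptions: the abstract says the matching is linear but gives no index formula, so the proportional rule `i = floor(j * m / n)`, the Euclidean distance, and the name `linear_match` are hypothetical.

```python
from math import dist  # Euclidean distance between two equal-length sequences


def linear_match(reference_second, unknown_second):
    """Assign every unknown frame to a reference frame by linear
    time scaling and return (mapping, mean_distance).

    Assumption: unknown frame j maps to reference frame
    floor(j * m / n), where m and n are the portion lengths.
    """
    m, n = len(reference_second), len(unknown_second)
    if m == 0 or n == 0:
        raise ValueError("both second portions must be non-empty")

    mapping = []
    total = 0.0
    for j in range(n):
        i = (j * m) // n  # proportional position in the reference portion
        mapping.append(i)
        total += dist(unknown_second[j], reference_second[i])
    return mapping, total / n


if __name__ == "__main__":
    reference = [(0.0,), (1.0,), (2.0,)]
    # A slower speaker: the same vowel lasts twice as many frames.
    unknown = [(0.0,), (0.0,), (1.0,), (1.0,), (2.0,), (2.0,)]
    print(linear_match(reference, unknown))
```

A single pass over the unknown frames suffices here, in contrast to dynamic-programming time warping, which evaluates many alternative alignments.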
14 Citations
3 Claims
1. Independent claim, set out in full under "First Claim" above. Dependent claims: 2, 3.
Specification