Speaker recognizer in which a significant part of a preselected one of input and reference patterns is pattern matched to a time normalized part of the other
Abstract
Speaker recognition is decided by a similarity measure (D) calculated from comparing selected feature vectors among an input speech signal sequence of feature vectors (A) and a selected sequence (B) of reference vectors selected from a plurality of pre-stored reference sequences. Prior to comparison of the input and reference vector sequences, the two sequences are time normalized to align corresponding feature vectors. A significant sound specifying signal (V) including a time sequence of elementary signals is generated in synchronism with one of the input and reference sequences and indicates which feature vectors in that one of the input and reference sequences are considered to represent significant sound. The similarity measure (D) is then calculated in accordance with the comparison of those feature vectors in the one sequence which are indicated by the significant sound specifying signal as representing significant sound and the corresponding feature vectors of the other sequence.
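The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes dynamic time warping as the time normalization, Euclidean distance between feature vectors, and a mean over the selected frame pairs as the measure D (a distance, so lower means more similar). The names `dtw_path` and `masked_measure` are hypothetical, and V is taken to be synchronized with the input sequence A.

```python
import math  # math.dist requires Python 3.8+


def dtw_path(A, B):
    """Align A and B by dynamic time warping; return (i, j) index pairs."""
    n, m = len(A), len(B)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative distance aligning A[:i] with B[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(A[i - 1], B[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def masked_measure(A, B, V):
    """Mean distance over aligned pairs whose A frame is flagged in V.

    Assumption: V[i] is truthy when frame i of A represents significant sound.
    """
    selected = [math.dist(A[i], B[j]) for i, j in dtw_path(A, B) if V[i]]
    if not selected:
        raise ValueError("V marks no frame as significant")
    return sum(selected) / len(selected)


# Toy one-dimensional feature vectors; the last frame of A is not significant.
A = [[0.0], [1.0], [5.0]]
B = [[0.0], [1.0], [1.0], [9.0]]
D_masked = masked_measure(A, B, [True, True, False])
D_unmasked = masked_measure(A, B, [True, True, True])
```

In this toy case the mismatched final frames contribute to `D_unmasked` but not to `D_masked`, which is the effect the significant sound specifying signal is described as having.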
4 Claims
1. A speaker recognizing system comprising:
- input time sequence producing means responsive to an input speech sound, spoken by a speaker to be recognized and comprising a significant sound of a predetermined nature informative of said speaker, for producing an input time sequence of feature vectors representative of said input speech sound;
- significant sound specifying means responsive to said input speech sound for producing a sound nature signal which comprises a significant sound signal specifying said significant sound;
- specific time sequence producing means for producing a specific time sequence of feature vectors representative of a specific speech sound spoken by a specific speaker, said specific speech sound comprising a significant sound informative of said specific speaker;
- time normalizing means for time normalizing said input time sequence and said specific time sequences relative to each other to derive first and second normalized time sequences of feature vectors from said input time sequence and said specific time sequence, respectively;
- similarity measure calculating means responsive to said sound nature signal and said first and said second normalized time sequences for calculating a similarity measure between those feature vectors of said normalized time sequences of feature vectors which are selected from said first and said second normalized time sequences in compliance with said significant sound signal, respectively, said similarity measure calculating means producing a similarity measure signal representative of the calculated similarity measure; and
- means responsive to said similarity measure signal for recognizing whether or not the speaker to be recognized is said specific speaker.
Dependent claim: 2
3. A speaker recognizing system comprising:
- specific time sequence producing means for producing a specific time sequence of feature vectors representative of a specific speech sound spoken by a specific speaker, said specific speech sound comprising a significant sound of a predetermined nature informative of said specific speaker;
- significant sound specifying means for producing a sound nature signal which comprises a significant sound signal specifying said significant sound;
- input time sequence producing means responsive to an input speech sound spoken by a speaker to be recognized and comprising a significant sound informative of the speaker to be recognized for producing an input time sequence of feature vectors representative of said input speech sound;
- time normalizing means for time normalizing said input and said specific time sequences relative to each other to derive first and second normalized time sequences of feature vectors from said input and said specific time sequences, respectively, to produce said first and said second normalized time sequences;
- similarity measure calculating means responsive to said sound nature signal and said first and said second normalized time sequences for calculating a similarity measure between those feature vectors of said first and second normalized time sequences of feature vectors which are selected from said first and said second normalized time sequences in compliance with said significant sound signal, respectively, said similarity measure calculating means producing a similarity measure signal representative of the calculated similarity measure; and
- means responsive to said similarity measure signal for recognizing whether or not the speaker to be recognized is said specific speaker.
Dependent claim: 4
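Claims 1 and 3 differ mainly in where the significant sound signal comes from: in claim 1 the specifying means is responsive to the input speech sound, while claim 3 places no such restriction on it. A rough sketch of that distinction, followed by the final recognizing step, is given here. Everything in it is an assumption for illustration: the alignment `path` is taken as already produced by the time normalizing step, the `mask_on` parameter, the mean-distance measure, and the fixed `threshold` are hypothetical and are not taken from the claims.

```python
import math  # math.dist requires Python 3.8+


def selected_measure(A, B, path, V, mask_on="input"):
    """Mean distance over aligned (i, j) pairs selected by the mask V.

    mask_on="input":     V is synchronized with A (one reading of claim 1).
    mask_on="reference": V is synchronized with B (one reading of claim 3).
    """
    if mask_on not in ("input", "reference"):
        raise ValueError("mask_on must be 'input' or 'reference'")
    total, count = 0.0, 0
    for i, j in path:
        k = i if mask_on == "input" else j
        if V[k]:
            total += math.dist(A[i], B[j])
            count += 1
    if count == 0:
        raise ValueError("mask selects no aligned frame pairs")
    return total / count


def is_specific_speaker(measure, threshold):
    """Accept when the distance-like measure does not exceed the threshold."""
    return measure <= threshold


# Toy data: A is the input sequence, B the stored sequence, path the alignment.
A = [[0.0], [1.0], [5.0]]
B = [[0.0], [1.0], [1.0], [9.0]]
path = [(0, 0), (1, 1), (1, 2), (2, 3)]

d_input = selected_measure(A, B, path, [True, True, False], mask_on="input")
d_ref = selected_measure(A, B, path, [False, False, False, True], mask_on="reference")
accepted = is_specific_speaker(d_input, threshold=0.5)
rejected = is_specific_speaker(d_ref, threshold=0.5)
```

Because the measure here is a distance, acceptance uses `<=`; a system that produced a true similarity score would flip the comparison.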
Specification