Speaker identification and verification system
First Claim
1. A method for speaker recognition comprising the steps of:
windowing a speech segment into a plurality of speech frames;
determining linear prediction coefficients from a linear predictive polynomial for each said frame of speech;
determining a first cepstral coefficient from said linear prediction coefficients in which first cepstrum information comprises said first cepstral coefficient;
applying an all pole filter to said linear prediction polynomial;
determining a plurality of roots of said linear prediction polynomial from the poles of said all pole filter, each said root including a residue component;
selecting one of said frames having a predetermined number of said roots within a unit circle of the z-plane in which said selected frames form said predetermined components of said first cepstrum information;
applying weightings to predetermined components from said first cepstrum information for producing an adaptive component weighting cepstrum to attenuate broad bandwidth components in said speech signal, by determining a finite impulse response filter for emphasizing the speech formants of said speech signal and attenuating said residue components comprising the steps of determining a finite impulse response filter for emphasizing the speech formants of said speech signal and attenuating said residue components, determining adaptive component weighting coefficients from said finite impulse response filter, determining a second cepstral coefficient from said adaptive component weighting coefficients, and subtracting said second cepstral coefficient from said first cepstral coefficient for forming said adaptive component weighting cepstrum; and
recognizing said adaptive component weighting cepstrum by calculating similarity of said adaptive component weighting cepstrum and a plurality of speech patterns which were produced by a plurality of speaking persons in advance.
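The cepstral steps of the claim above can be illustrated with a minimal sketch. It rests on assumptions that the claim text does not state: the linear prediction polynomial is written A(z) = 1 + a_1 z^-1 + ... + a_P z^-P, the finite impulse response filter N(z) is taken to be the numerator that results when every residue in the partial fraction expansion of 1/A(z) is set to one, and the cepstra are computed with the standard all-pole recursion. Under those assumptions the normalized coefficients of N(z) work out to b_k = ((P - k)/P) a_k. The function names are hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of 1/A(z), where A(z) = 1 + sum_k a[k-1] z^-k."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term is ignored)
    for n in range(1, n_ceps + 1):
        a_n = a[n - 1] if n <= p else 0.0
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += k * c[k] * a[n - k - 1]
        c[n] = -a_n - acc / n
    return c[1:]


def acw_coefficients(a):
    """Assumed adaptive component weighting coefficients b_1..b_{P-1}.

    Setting all residues to one gives N(z) = P * (1 + sum_k b_k z^-k)
    with b_k = ((P - k) / P) * a_k.
    """
    p = len(a)
    return [(p - k) / p * a[k - 1] for k in range(1, p)]


def acw_cepstrum(a, n_ceps):
    """First cepstrum (from A) minus second cepstrum (from N)."""
    first = lpc_to_cepstrum(a, n_ceps)
    second = lpc_to_cepstrum(acw_coefficients(a), n_ceps)
    return [x - y for x, y in zip(first, second)]


# Example: poles at 0.5 and 0.8, so A(z) = 1 - 1.3 z^-1 + 0.4 z^-2.
example = acw_cepstrum([-1.3, 0.4], 2)
```

For an all-pole model with poles z_i, the first cepstrum equals the sum of z_i^n divided by n, which gives a convenient check on the recursion.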
Abstract
The present invention relates to a speaker recognition method and system which applies adaptive component weighting to each frame of speech for attenuating non-vocal tract components and normalizing speech components. A linear predictive all pole model is used to select frames for an adaptively weighted cepstrum. Frames with a predetermined number of resonances are selected for cepstrum analysis. An adaptively weighted cepstrum is determined from a new transfer function. A normalized cepstrum is determined having improved characteristics for speech components. From the improved speech components, improved speaker recognition over a channel is obtained.
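The frame selection described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the roots of the linear prediction polynomial are found with a plain Durand-Kerner iteration (the source does not say how roots are computed), a root counts as a resonance simply when its magnitude is below one, and `required` is a hypothetical name for the predetermined number.

```python
def poly_roots(coeffs, iters=200):
    """Durand-Kerner roots of a monic polynomial (descending powers of z)."""
    n = len(coeffs) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(n)]  # customary starting points
    for _ in range(iters):
        updated = []
        for i, r in enumerate(roots):
            value = 0j
            for c in coeffs:  # Horner evaluation of the polynomial at r
                value = value * r + c
            denom = 1 + 0j
            for j, s in enumerate(roots):
                if j != i:
                    denom *= r - s
            updated.append(r - value / denom)
        roots = updated
    return roots


def count_roots_inside(a):
    """Number of roots of z^P + a_1 z^(P-1) + ... + a_P with |z| < 1."""
    return sum(1 for r in poly_roots([1.0] + list(a)) if abs(r) < 1.0)


def select_frames(lpc_per_frame, required):
    """Indices of frames whose LP polynomial has `required` roots inside."""
    return [i for i, a in enumerate(lpc_per_frame)
            if count_roots_inside(a) == required]


# Frame 0 has roots 0.5 and 0.8; frame 1 has roots 0.5 and 1.25.
selected = select_frames([[-1.3, 0.4], [-1.75, 0.625]], required=2)
```

A production system would more likely call a library root finder; the hand-written iteration only keeps the sketch free of dependencies.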
11 Claims
2. A system for speaker recognition comprising:
means for converting a speech signal into a plurality of frames of digital speech;
speech parameter extracting means for converting said digital speech into first cepstrum information, said speech parameter extracting means comprising an all pole linear predictive (LPC) filter means, for determining a plurality of roots of said LPC filter, each said root including a residue component, and means for selecting ones of said frames having a predetermined number of said roots within a unit circle of the z-plane wherein said selected frames form said predetermined components of said first cepstrum information;
speech parameter enhancing means for applying adaptive weightings to said first cepstrum information for producing an adaptive component weighting cepstrum to attenuate broad bandwidth components in said speech signal, said speech parameter enhancing means comprising, a finite impulse response filter for emphasizing the speech formants of said speech signal and attenuating said residue components, means for computing adaptive component weighting coefficients from said finite impulse response filter, means for computing a second cepstral coefficient from said adaptive component weighting coefficients, and means for subtracting said second cepstral coefficient from said first cepstral coefficient for forming said adaptive component weighting cepstrum; and
evaluation means for determining a similarity of said adaptive component weighting cepstrum with a plurality of speech samples which were produced by a plurality of speaking persons in advance.
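The front end of such a system (framing the signal and obtaining linear prediction coefficients per frame) could look roughly like the sketch below. The claims do not name a window shape or an estimation method, so the Hamming window, the autocorrelation method, and the Levinson-Durbin recursion are all assumptions made for illustration, as are the function and parameter names.

```python
import math


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames and apply a Hamming window (assumed)."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[x[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(x) - frame_len + 1, hop)]


def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for a_1..a_P of A(z) = 1 + sum_k a_k z^-k from r[0..P].

    Assumes r[0] > 0, i.e. the frame is not silent.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


# Autocorrelation of a first-order process x[n] = 0.5 x[n-1] + e[n].
lpc = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The returned coefficients use the same sign convention as A(z) = 1 + a_1 z^-1 + ..., so they can be passed directly to a root finder or a cepstral recursion that assumes that form.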
3. A method for speaker recognition comprising the steps of:
windowing a speech segment into a plurality of speech frames;
determining linear prediction coefficients from a linear predictive polynomial for each said frame of speech;
determining a first cepstral coefficient from said linear prediction coefficients in which first cepstrum information comprises said first cepstral coefficient;
applying an all pole filter to said linear prediction polynomial;
determining a plurality of roots of said linear prediction polynomial from the poles of said all pole filter, each said root including a residue component;
selecting one of said frames having a predetermined number of said roots within a unit circle of the z-plane in which said selected frames form said predetermined components of said first cepstrum information;
applying weightings to predetermined components from said first cepstrum information for producing an adaptive component weighting cepstrum to attenuate broad bandwidth components in said speech signal, by determining a finite impulse response filter for emphasizing the speech formants of said speech signal and attenuating said residue components and determining adaptive component weighting coefficients from said finite impulse response filter; and
recognizing said adaptive component weighting cepstrum by calculating similarity of said adaptive component weighting cepstrum and a plurality of speech patterns which were produced by a plurality of speaking persons in advance.
Dependent claims: 4, 5, 6, 7
8. A system for speaker recognition comprising:
means for converting a speech signal into a plurality of frames of digital speech;
speech parameter extracting means for converting said digital speech into first cepstrum information, said speech parameter extracting means comprising an all pole linear predictive (LPC) filter means, for determining a plurality of roots of said LPC filter, each said root including a residue component, and means for selecting ones of said frames having a predetermined number of said roots within a unit circle of the z-plane wherein said selected frames form said predetermined components of said first cepstrum information;
speech parameter enhancing means for applying adaptive weightings to said first cepstrum information for producing an adaptive component weighting cepstrum to attenuate broad bandwidth components in said speech signal, said speech parameter enhancing means comprising, a finite impulse response filter for emphasizing the speech formants of said speech signal and attenuating said residue components, means for computing adaptive component weighting coefficients from said finite impulse response filter; and
evaluation means for determining a similarity of said adaptive component weighting cepstrum with a plurality of speech samples which were produced by a plurality of speaking persons in advance.
Dependent claims: 9, 10, 11
Specification