Method and apparatus for training a speaker recognition system
Abstract
A method and apparatus for training a system to assess the identity of a person through the audio characteristics of their voice. The system feeds an audio input (10) into an A/D converter (20) for processing in a digital signal processor (30). The system then applies neural-network-type processing, using a polynomial pattern classifier (60), to train the speaker recognition system.
29 Claims
1. A method for training a speaker recognition system by a computer, comprising steps of:
extracting speech parameters from a digitized audio signal to produce a set of differentiating factors (r);
storing said set of differentiating factors (r) in a data base to produce a stored set of differentiating factors (r);
polynomial pattern classifying by the computer said stored set of differentiating factors (r) to produce a first digital audio signature (w);
storing said first digital audio signature (w) in said data base to produce a stored first digital audio signature (w);
specifying a first speaker as having audio signature features X1, X2, . . . , XM;
specifying a second speaker, as having audio signature features Y1, Y2, . . . , YM;
discriminating between said first speaker and said second speaker;
training for said first digital audio signature (w) features of said first speaker, a polynomial for a 2-norm to an ideal output of said first speaker, and an ideal output of 0 for said second speaker;
representing a matrix whose rows are a polynomial expansions of said first and second speakers audio signature features, ##EQU5## and where o1 is a column vector of length 2M whose first M entries are 1 and remaining entries are 0, and o2 = 1 - o1; and
training for said first speaker and said second speaker respectively being:
##EQU6##
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
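The placeholders ##EQU5## and ##EQU6## stand for equations that are not reproduced in this text. One reading consistent with the claim wording is sketched below. It is an assumption, not the patent's own notation: the symbols A (for the matrix) and p(·) (for the polynomial expansion of one feature vector) are introduced here for illustration, and w1, w2 denote the trained coefficient vectors for the first and second speaker.

```latex
% Hypothetical reconstruction; A and p(.) are symbols introduced for this sketch.
A =
\begin{bmatrix}
p(X_1)^{T} \\ \vdots \\ p(X_M)^{T} \\ p(Y_1)^{T} \\ \vdots \\ p(Y_M)^{T}
\end{bmatrix},
\qquad
o_1 = (\underbrace{1,\dots,1}_{M},\ \underbrace{0,\dots,0}_{M})^{T},
\qquad
o_2 = 1 - o_1,

w_1 = \arg\min_{w} \lVert A w - o_1 \rVert_2,
\qquad
w_2 = \arg\min_{w} \lVert A w - o_2 \rVert_2 .
```

Under this reading, each speaker's signature is the least-squares fit of a polynomial to an output of 1 on that speaker's own features and 0 on the other speaker's features.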
16. An apparatus for training a speaker recognition system comprising:
a processor for extracting speech parameters from a digitized audio signal to produce a set of differentiating factors (r);
a computer for storing said set of differentiating factors (r) in a data base, said computer coupled to said processor;
a polynomial pattern classifier operating on said set of differentiating factors (r) to produce a first digital audio signature (w);
means for storing said digitized audio signature (w) in said data base to produce a stored first digital audio signature (w);
means for specifying a first speaker as having audio signature features X1, X2, . . . , XM;
means for specifying a second speaker, as having audio signature features Y1, Y2, . . . , YM;
means for discriminating between said first speaker and said second speaker;
means for training said audio signature of said first and second speaker to provide a polynomial for a 2-norm to an ideal output of 1 for features of said first speaker and an ideal output of 0 for said second speaker;
means for representing a matrix whose rows are a polynomial expansions of said first and second speakers audio signature features, ##EQU7## and where o1 is a column vector of length 2M whose first M entries are 1 and remaining entries are 0, and o2 = 1 - o1; and
said means for training of said first speaker and said second speaker respectively being:
##EQU8##
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
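As a rough illustration of the training step recited in the independent claims, the following Python sketch builds a matrix whose rows are polynomial expansions of two speakers' feature vectors and fits each speaker's weights against the targets o1 and o2 in the 2-norm. It is a minimal sketch under stated assumptions, not the patented implementation: the function names (`poly_expand`, `train_two_speakers`, `score`), the default polynomial degree of 2, and the use of NumPy's least-squares solver are all choices made here for illustration.

```python
import itertools

import numpy as np


def poly_expand(x, degree=2):
    """Hypothetical helper: all monomials of the entries of x up to `degree`,
    starting with the constant term 1."""
    x = np.asarray(x, dtype=float)
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(x.size), d):
            terms.append(np.prod(x[list(idx)]))
    return np.array(terms)


def train_two_speakers(X, Y, degree=2):
    """Fit weights (w1, w2) for two speakers.

    X, Y: arrays of shape (M, n_features), one feature vector per row,
    for the first and second speaker respectively.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape:
        raise ValueError("this sketch assumes M feature vectors per speaker")
    M = X.shape[0]
    # Rows: expansions of X1..XM followed by expansions of Y1..YM.
    A = np.vstack([poly_expand(v, degree) for v in np.vstack([X, Y])])
    # o1: first M entries are 1, remaining M entries are 0; o2 = 1 - o1.
    o1 = np.concatenate([np.ones(M), np.zeros(M)])
    o2 = 1.0 - o1
    # Minimise ||A w - o||_2 for each target vector.
    w1, *_ = np.linalg.lstsq(A, o1, rcond=None)
    w2, *_ = np.linalg.lstsq(A, o2, rcond=None)
    return w1, w2


def score(w, frames, degree=2):
    """Average polynomial output of weights w over a set of feature vectors."""
    return float(np.mean([poly_expand(v, degree) @ w for v in frames]))
```

With weights trained this way, `score(w1, frames)` should be closer to 1 for feature vectors from the first speaker and closer to 0 for those from the second speaker, and the reverse for `w2`. How the scores are thresholded or compared at recognition time is not covered by this sketch.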
Specification