Method and apparatus for speaker recognition via comparing an unknown input to reference data
Abstract
A method and apparatus for pattern recognition comprising comparing an input signal representing an unknown pattern with reference data representing each of a plurality of pre-defined patterns, at least one of the pre-defined patterns being represented by at least two instances of reference data. Successive segments of the input signal are compared with successive segments of the reference data and comparison results for each successive segment are generated. For each pre-defined pattern having at least two instances of reference data, the comparison results for the closest matching segment of reference data for each segment of the input signal are recorded to produce a composite comparison result for the said pre-defined pattern. The unknown pattern is then identified on the basis of the comparison results. Thus the effect of a mismatch between the input signal and each instance of the reference data is reduced by selecting the best segments from the instances of reference data for each pre-defined pattern.
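The segment-wise selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the Euclidean distance, the assumption that every reference instance is already time-aligned to the input (same number of segments), the mean-based decision rule, and the names `composite_distances` and `identify` are all hypothetical choices made for illustration.

```python
import numpy as np


def composite_distances(input_segments, reference_instances):
    """Per-segment composite comparison result for one pre-defined pattern.

    input_segments: array of shape (N, F), one feature vector per segment.
    reference_instances: list of arrays of shape (N, F), assumed to be
        already time-aligned with the input (an assumption of this sketch).
    Returns an array of shape (N,) holding, for each input segment, the
    distance to the closest matching reference segment across instances.
    """
    x = np.asarray(input_segments, dtype=float)
    # distances[k, n]: Euclidean distance between input segment n and
    # segment n of reference instance k (distance measure is an assumption).
    distances = np.stack([
        np.linalg.norm(x - np.asarray(ref, dtype=float), axis=1)
        for ref in reference_instances
    ])
    # Keep the best (smallest) result per segment across all instances.
    return distances.min(axis=0)


def identify(input_segments, models):
    """Pick the pattern whose mean composite distance is lowest.

    models: dict mapping a pattern name to its list of reference instances.
    """
    scores = {
        name: float(composite_distances(input_segments, refs).mean())
        for name, refs in models.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    unknown = [[0.0, 0.0], [1.0, 1.0]]
    models = {
        # Each instance of "A" matches only one of the two segments well.
        "A": [[[0.0, 0.0], [5.0, 5.0]], [[4.0, 4.0], [1.0, 1.0]]],
        "B": [[[1.0, 0.0], [1.0, 2.0]]],
    }
    print(identify(unknown, models))
```

In the usage example neither instance of "A" matches the whole input, yet the composite result is zero for both segments because each segment is scored against its best instance, which is the mismatch-reduction effect the abstract describes.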
29 Citations
18 Claims
-
1. A method of speaker recognition comprising comparing an input signal representing speech from an unknown speaker with reference data representing speech from each of a plurality of pre-defined speakers, at least one of the pre-defined speakers being represented by at least two instances of reference data, the method comprising:
-
comparing successive segments of the input signal with successive segments of the reference data and generating a comparison result for each successive segment, and, for each pre-defined speaker having at least two instances of reference data, the comparison result for the closest matching segment of reference data for each segment of the input signal is recorded to produce a composite comparison result for each successive segment for the said pre-defined speaker, and identifying the unknown speaker on the basis of the composite comparison results.
where N is the number of segments, w(n) is the weighting factor for the nth segment and d(n) is the comparison result for the nth segment.
-
-
6. A method according to claim 5 wherein
-
$$w(n) = \left[\frac{1}{J}\sum_{j=1}^{J} d'_j(n)\right]^{-1} \qquad (2)$$
where J is the number of allowed speakers used to determine the weighting factor and $d'_j(n)$ is the comparison result for the nth segment of the input signal and the nth segment of the jth model.
-
-
7. A method according to claim 6 for verifying the identity of an unknown speaker, wherein the unknown speaker provides information relating to a claimed identity and the comparison result for the reference data associated with the information is compared to the comparison results for the other reference data and, if a criterion is met, the unknown speaker is verified as the speaker associated with the information.
-
8. A method according to claim 7 wherein the weighting factor is dependent on the comparison results for J+1 allowed speakers, where J+1 speakers represent the identified speaker and the J speakers having comparison scores closest to that of the identified speaker.
-
9. A method according to claim 1 wherein the input signal is divided into frames and the segments are one frame in length.
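The segment weighting recited in the method claims can be sketched as follows. This is an illustrative sketch only: it assumes the overall result D is the mean of the weighted per-segment results, that the weight for segment n is the reciprocal of the mean comparison result over J speaker models, and that all comparison results are strictly positive. The function names `segment_weights` and `weighted_score` are hypothetical.

```python
import numpy as np


def segment_weights(cohort_results):
    """Weight per segment: w(n) = [ (1/J) * sum_j d'_j(n) ]^-1.

    cohort_results: array of shape (J, N); row j holds d'_j(n), the
    comparison result between input segment n and segment n of model j.
    Assumes strictly positive results so the reciprocal is defined.
    """
    d = np.asarray(cohort_results, dtype=float)
    return 1.0 / d.mean(axis=0)


def weighted_score(results, weights):
    """Overall result: D = (1/N) * sum_n w(n) * d(n)."""
    d = np.asarray(results, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.mean(w * d))


if __name__ == "__main__":
    cohort = [[2.0, 4.0], [2.0, 4.0]]   # J = 2 models, N = 2 segments
    w = segment_weights(cohort)          # reciprocal of per-segment mean
    print(w, weighted_score([1.0, 2.0], w))
```

Under these assumptions, a segment on which every model scores poorly (large mean result) receives a small weight, so it contributes less to D than a segment that discriminates well between speakers.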
-
10. Speaker recognition apparatus comprising:
-
an input for receiving an input signal representing speech from an unknown speaker;
reference data representing speech from each of a plurality of pre-defined speakers, at least one of the pre-defined speakers being represented by at least two instances of reference data, the method comprising;
comparing means for comparing successive segments of the input signal with successive segments of the reference data and generating a comparison result for each successive segment, decision means for generating, for each pre-defined pattern having at least two instances of reference data, a composite comparison result for the said pre-defined speaker from the comparison result for the closest matching segment of reference data for each segment of the input signal, and for identifying the unknown speaker on the basis of the composite comparison results.
-
-
14. Apparatus according to claim 13 wherein the comparison result D is the average of the weighted result for each closest matching segment, i.e.
-
$$D = \frac{1}{N}\sum_{n=1}^{N} w(n)\,d(n) \qquad (1)$$
where N is the number of segments, w(n) is the weighting factor for the nth segment and d(n) is the comparison result for the nth segment.
where J is the number of pre-determined patterns used to determine the weighting factor and $d'_j(n)$ is the comparison result for the nth segment of the input signal and the nth segment of the jth model.
-
-
16. Apparatus according to claim 15 for verifying the identity of an unknown speaker, wherein the apparatus includes an input for receiving information relating to a claimed identity of the unknown speaker and the comparison result for the speaker corresponding to the claimed identity is compared to the other comparison scores and, if a criterion is met, the unknown speaker is verified as the corresponding speaker.
-
17. Apparatus according to claim 16 wherein the weighting factor is dependent on the comparison results for J+1 allowed speakers, where J+1 speakers represent the identified speaker and the J speakers having comparison scores closest to that of the identified speaker.
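The verification variant in the claims above, where the weighting depends on J+1 speakers, might be sketched as below. The claims leave the acceptance criterion open, so this sketch assumes one: the claimed speaker is accepted only if their weighted result is strictly lower than that of every other cohort member. It also assumes the cohort is chosen by closeness of unweighted mean results and that the weights average over all J+1 cohort members. The function `verify` and its parameters are hypothetical.

```python
import numpy as np


def verify(claimed, composite, J=2):
    """Verify a claimed identity against a cohort of J+1 speakers.

    composite: dict mapping speaker name to an array of shape (N,) of
        per-segment composite comparison results (lower means closer).
    Returns (accepted, weighted_scores_for_cohort).
    """
    d = {name: np.asarray(v, dtype=float) for name, v in composite.items()}
    unweighted = {name: float(v.mean()) for name, v in d.items()}

    # Cohort: the claimed speaker plus the J speakers whose unweighted
    # scores are closest to the claimed speaker's score (assumed rule).
    others = sorted(
        (name for name in d if name != claimed),
        key=lambda name: abs(unweighted[name] - unweighted[claimed]),
    )
    cohort = [claimed] + others[:J]

    # Per-segment weight: reciprocal of the mean result over the cohort.
    weights = 1.0 / np.mean([d[name] for name in cohort], axis=0)
    scores = {name: float(np.mean(weights * d[name])) for name in cohort}

    # Assumed criterion: the claimed speaker must score best in the cohort.
    accepted = all(
        scores[claimed] < scores[name] for name in cohort if name != claimed
    )
    return accepted, scores


if __name__ == "__main__":
    results = {
        "alice": [1.0, 1.0, 1.0],
        "bob": [2.0, 2.0, 2.0],
        "carol": [3.0, 3.0, 3.0],
        "dave": [10.0, 10.0, 10.0],
    }
    print(verify("alice", results))  # claimed speaker has the best score
    print(verify("bob", results))    # another cohort member scores better
```

Restricting the weights to the closest competitors keeps distant speakers (such as "dave" in the example) from inflating the per-segment mean and flattening the weights.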
Specification