Computer voice recognition method verifying speaker identity using speaker and non-speaker data

US 6,298,323 B1
Filed: 07/25/1997
Issued: 10/02/2001
Est. Priority Date: 07/25/1996
Status: Expired due to Term

Abstract

A method for recognizing a speaker in which a voice signal is spoken into a computer by a speaker and a feature vector is formed for the voice signal. The feature vector is compared to at least one stored reference feature vector and to at least one anti-feature vector. The reference feature vector is formed from a speech sample of a speaker to be verified. The anti-feature vector was formed from a speech sample that was spoken in by another speaker who is not the speaker to be verified. A 2-class classification is resolved by forming a similarity value and evaluating the similarity value on the basis of a predetermined range within which the similarity value must deviate from a predetermined value so that the voice signal can be classified as deriving from the speaker to be verified.
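
The acceptance test described in the abstract can be sketched as below. This is a minimal illustration, not the patented implementation. The function name `is_verified` and the default values of `target` and `tolerance` are assumptions made for the example, since the abstract gives no numeric values.

```python
def is_verified(similarity: float, target: float = 1.0, tolerance: float = 0.2) -> bool:
    """Two-class decision on a single similarity value.

    `target` stands in for the "predetermined value" and `tolerance` for the
    "predetermined range" of permitted deviation. Both defaults are
    hypothetical and would be tuned for a real system.
    """
    return abs(similarity - target) <= tolerance


if __name__ == "__main__":
    print(is_verified(0.9))   # close to the target value
    print(is_verified(0.3))   # far from the target value
```

A similarity value near the target is attributed to the speaker to be verified. Any larger deviation rejects the claim of identity.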

82 Citations

13 Claims

1. A method for verifying a person on the basis of voice signals using a neural network, the method comprising the steps of:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), and (G) classifying the unknown person as verified when the single operating phase similarity value falls within a predetermined range of values.
2. The method according to claim 1, wherein the neural network comprises a perceptron structure.
3. The method according to claim 1, wherein:
5. The method according to claim 4, wherein:
- following each iteration of method steps A) through C) in claim 2, a similarity value is formed from respective comparisons between the feature vector and the reference feature vector, and between the feature vector and the anti-feature vector, a similarity of the feature vector to the reference feature vector and a similarity of the feature vector with the anti-feature vector being described by said similarity value;
  
  a new iteration is undertaken when the similarity value deviates by more than a prescribed range from a prescribed value; and
  
  the speaker is otherwise not classified as the speaker to be verified.
6. The method as in one of claim 1,4-5 wherein:
- at least two reference feature vectors or at least two anti-feature vectors are employed in the method; and
  
  the reference feature vectors or anti-feature vectors are formed by time distortion of a voice signal spoken by the speaker to be verified or, respectively, of a voice signal spoken by the speaker not to be verified.
7. The method as in one of claim 1,4-5 wherein individual, spoken letters or individual, spoken numbers are utilized as voice signals for the verification.
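
Steps (A) through (G) of claim 1, with the perceptron structure of claim 2, can be sketched roughly as follows. This is an illustration under stated assumptions, not the method as implemented by the inventors. The claims do not say how the two comparisons are encoded, so the element-wise absolute differences in `comparison_input`, the sigmoid output, the delta-rule training loop, and the acceptance range `(0.8, 1.0)` are all hypothetical choices. Feature extraction from the voice signal is out of scope, so feature vectors are given directly as lists of floats.

```python
import math
from typing import List, Sequence, Tuple


def comparison_input(x: Sequence[float], ref: Sequence[float],
                     anti: Sequence[float]) -> List[float]:
    """Encode the two comparisons (assumed form): distance of x to the
    reference vector, followed by distance of x to the anti-reference vector."""
    return ([abs(a - b) for a, b in zip(x, ref)]
            + [abs(a - b) for a, b in zip(x, anti)])


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_perceptron(ref: Sequence[float], anti: Sequence[float],
                     epochs: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    """Training phase, steps (A)-(C): adapt the weights so the reference
    vector maps towards 1 and the anti-reference vector towards 0."""
    samples = [(comparison_input(ref, ref, anti), 1.0),
               (comparison_input(anti, ref, anti), 0.0)]
    weights = [0.0] * (2 * len(ref))
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            out = sigmoid(sum(w * v for w, v in zip(weights, inputs)) + bias)
            err = label - out  # delta rule for a single sigmoid unit
            weights = [w + lr * err * v for w, v in zip(weights, inputs)]
            bias += lr * err
    return weights, bias


def similarity(x: Sequence[float], ref: Sequence[float], anti: Sequence[float],
               weights: Sequence[float], bias: float) -> float:
    """Operating phase, steps (D)-(F): one similarity value from the two comparisons."""
    inputs = comparison_input(x, ref, anti)
    return sigmoid(sum(w * v for w, v in zip(weights, inputs)) + bias)


def verify(x: Sequence[float], ref: Sequence[float], anti: Sequence[float],
           weights: Sequence[float], bias: float,
           accept_range: Tuple[float, float] = (0.8, 1.0)) -> bool:
    """Step (G): accept when the similarity value lies in the assumed range."""
    low, high = accept_range
    return low <= similarity(x, ref, anti, weights, bias) <= high


if __name__ == "__main__":
    ref = [1.0, 0.2, 0.7]    # made-up reference feature vector
    anti = [0.1, 0.9, 0.3]   # made-up anti-reference feature vector
    w, b = train_perceptron(ref, anti)
    print(verify([0.95, 0.25, 0.65], ref, anti, w, b))  # probe close to ref
    print(verify(anti, ref, anti, w, b))                # probe equal to anti
```

Because only one reference and one anti-reference vector are needed, the sketch trains on exactly two samples. This mirrors the claim language that the comparisons can be made when the anti-reference vector comes from a single other speaker.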

4. A method for verifying a person on the basis of voice signals using a neural network, the method comprising the steps of:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), (G) repeating steps D-F for a plurality of operating phase voice signals generated by the unknown person; and
  
  (H) classifying the unknown person as verified when the result of a function combining the single operating phase similarity values falls within a predetermined range of values.
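
Steps (G) and (H) of claim 4 combine the similarity values from several utterances before deciding. The claim leaves the combining function open, so the sketch below assumes the arithmetic mean as a default and lets the caller substitute another function. The name `verify_over_utterances` and the acceptance range `(0.8, 1.0)` are hypothetical.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def verify_over_utterances(
    similarities: Sequence[float],
    combine: Callable[[Sequence[float]], float] = mean,
    accept_range: Tuple[float, float] = (0.8, 1.0),
) -> bool:
    """Combine one similarity value per utterance and test the result
    against an assumed acceptance range."""
    if not similarities:
        raise ValueError("at least one similarity value is required")
    low, high = accept_range
    return low <= combine(similarities) <= high


if __name__ == "__main__":
    # One similarity value per spoken utterance (made-up numbers).
    print(verify_over_utterances([0.9, 0.95, 0.85]))
    print(verify_over_utterances([0.9, 0.2, 0.3]))
    # A stricter combination: the worst utterance decides.
    print(verify_over_utterances([0.9, 0.95, 0.85], combine=min))
```

Averaging tolerates one poor utterance among several good ones, while `min` rejects as soon as any single utterance scores low. Which behaviour is preferable depends on the application and is not settled by the claim text.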

8. A telecommunications system in which the following method is undertaken for speaker verification when a voice signal is received from the telecommunications system:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), and (G) classifying the unknown person as verified when the single operating phase similarity value falls within a predetermined range of values.

9. A telecommunications system in which the following method is undertaken for speaker verification when a voice signal is received from the telecommunications system:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), (G) repeating steps D-F for a plurality of operating phase voice signals generated by the unknown person; and
  
  (H) classifying the unknown person as verified when the result of a function combining the single operating phase similarity values falls within a predetermined range of values.
10. The method according to claim 9, wherein, following each iteration of the method steps D)-F), a similarity value is formed from the respective comparisons, a similarity of the feature vector to the reference feature vector and a similarity of the feature vector with the anti-feature vector being described by said similarity value, and wherein a new iteration is undertaken when the similarity value deviates by more than a prescribed range from a prescribed value, and wherein the speaker is otherwise not classified as the speaker to be verified.

11. A mobile radiotelephone system in which the following method is undertaken for speaker verification when a voice signal is received from the telecommunications system:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), and (G) classifying the unknown person as verified when the single operating phase similarity value falls within a predetermined range of values.

12. A mobile radiotelephone system in which the following method is undertaken for speaker verification when a voice signal is received from the telecommunications system:
- in a training phase, (A) generating and storing a reference feature vector from training phase voice signal generated by a speaker to be verified, (B) generating and storing an anti-reference feature vector from a voice anti-signal generated by a speaker not to be verified, (C) training the neural network using the reference feature vector and the anti-reference feature vector, thereby adapting weightings of the neural network to permit an optimum classification for a two-class problem;
  
  and in an operating phase, (D) generating a feature vector from an operating phase voice signal generated by an unknown person, who may or may not be the speaker to be verified, (E) submitting the feature vector, the stored reference feature vector and the stored anti-feature vector to the neural network for a only a single comparison between (a) the feature vector and the reference feature vector and only a single comparison between (b) the feature vector and the anti-reference feature vector, wherein said comparisons are capable of being made when said anti-reference feature vector is generated from only one speaker not to be verified, (F) generating only a single operating phase similarity value from only the two comparisons in step (E), (G) repeating steps D-F for a plurality of operating phase voice signals generated by the unknown person; and
  
  (H) classifying the unknown person as verified when the result of a function combining the single operating phase similarity values falls within a predetermined range of values.
13. The method according to claim 12, wherein:

Current Assignee: Lantiq Beteiligungs-GmbH & Company KG (Intel Corporation)
Original Assignee: Siemens AG
Inventors: Kaemmerer, Bernhard
Primary Examiner(s): Korzuch, William
Assistant Examiner(s): Armstrong, Angela A
Application Number: US08/900,699
Time in Patent Office: 1,530 Days
Field of Search: 704/246, 704/247, 704/250, 704/249, 704/248, 704/232, 704/202, 704/273, 706/20
US Class Current: 704/246
CPC Class Codes:
- G10L 17/06 Decision making techniques;...
- G10L 25/30 using neural networks
