Speaker verification system

US 20100268537A1
Filed: 04/17/2009
Published: 10/21/2010
Est. Priority Date: 04/17/2009
Status: Active Grant

Abstract

A text-independent speaker verification system utilizes mel frequency cepstral coefficients analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, an adaptive threshold and an adaptive decision verdict and is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices than used by comparable systems of the prior art.
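
The abstract names mel frequency cepstral coefficient analysis without giving formulas. As a rough illustration of the "mel" part only, the sketch below spaces filter-bank center frequencies evenly on the mel scale. The conversion formula is the commonly used 2595·log10(1 + f/700) convention, which is an assumption here, and the function names and the filter count are hypothetical rather than taken from the patent.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale convention (assumed; the source does not state one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def filter_centers(n_filters, f_low, f_high):
    """Center frequencies (Hz) of n_filters filters spaced evenly in mel."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Even spacing in mel yields filters that sit close together at low frequencies and spread out at high frequencies, which is the property the cepstral analysis relies on.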

20 Claims

1. A text-independent speaker verification system comprising:
- an enrollment stage that includes a digital signal processing block, a feature extraction block and a pattern matching block;
  
  a threshold generation stage that includes a digital signal processing block, a feature extraction block and a threshold generation block; and
  
  a verification stage that includes a digital signal processing block, a feature comparison block and a decision block;
  
  wherein the pattern matching block incorporates template modeling with vector quantization; and
  
  wherein the verification stage incorporates an adaptive decision verdict; and
  
  wherein the speaker-verification system operates in a first mode of operation and thereafter in a second mode of operation, the first mode of operation comprising;
  
  the enrollment stage for receiving speech from a known speaker, for processing the received speech, and for generating a codebook and speaker voice print values; and
  
  the threshold generation stage for receiving additional speech from the known speaker, for processing the additional speech, for comparing the result to the codebook, and for generating and recording a threshold representing an acceptable deviation from the codebook; and
  
  the second mode of operation comprising;
  
  the verification stage for receiving speech, which is purported to be from the known speaker, for processing the speech, for comparing the result to the codebook and the threshold, and for determining whether the speaker is the known speaker or an imposter;
  
  wherein in the case that the speaker is verified as the known speaker, the verification stage records a deviation between the speech received by the verification stage and the codebook, outputs a deviation message indicating the deviation of the additional speech, and calculates a new threshold from the recorded deviation and from other previously recorded deviations generated from the first and second modes of operation.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The text-independent speaker verification system of claim 1 wherein the pattern matching block utilizes an LBG algorithm to perform template modeling with vector quantization.
  - 3. The text-independent speaker verification system of claim 1 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in the speaker verification system.
  - 4. The text-independent speaker verification system of claim 2 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in the speaker verification system.
  - 5. The text-independent speaker verification system of claim 1 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in a smart card.
  - 6. The text-independent speaker verification system of claim 2 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in a smart card.
  - 7. The text-independent speaker verification system of claim 1 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored on a combination of both non-volatile memory in the speaker verification system and non-volatile memory in a smart card.
  - 8. The text-independent speaker verification system of claim 2 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored on a combination of both non-volatile memory in the speaker verification system and non-volatile memory in a smart card.
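
Claims 1 and 2 describe building a codebook by vector quantization with an LBG algorithm and then measuring how far new speech deviates from that codebook. A minimal sketch of that idea follows, assuming feature vectors are plain lists of floats, Euclidean distance, and a codebook size that is a power of two. The function names, the additive split offset `eps`, and the fixed iteration count are illustrative choices, not details from the claims.

```python
import math

def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, vec):
    # Index of the codeword closest to vec.
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vec))

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by splitting every codeword, then refining the cells."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split: each codeword becomes two slightly offset copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign every vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            # Move each codeword to the centroid of its cell (keep it if empty).
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def deviation(codebook, vectors):
    """Average distance from each vector to its nearest codeword."""
    total = sum(math.sqrt(sq_dist(codebook[nearest(codebook, v)], v))
                for v in vectors)
    return total / len(vectors)
```

In this sketch, speech from the enrolled speaker should produce a small `deviation` against that speaker's codebook, while speech from a different speaker should produce a larger one; the threshold stage decides how large is too large.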

9. An apparatus for verifying a speaker's identity, comprising:
- a microphone;
  
  an output device;
  
  an enrollment stage that includes a digital signal processing block, a feature extraction block and a pattern matching block;
  
  a threshold generation stage that includes a digital signal processing block, a feature extraction block and a threshold generation block; and
  
  a verification stage that includes a digital signal processing block, a feature comparison block and a decision block;
  
  wherein the pattern matching block incorporates template modeling using vector quantization; and
  
  wherein the verification stage incorporates an adaptive decision verdict; and
  
  wherein the speaker-verification system operates in a first mode of operation and thereafter in a second mode of operation, the first mode of operation comprising;
  
  the enrollment stage for receiving speech from a known speaker via the microphone, for processing the received speech, and for generating a codebook and speaker voice print values; and
  
  the threshold generation stage for receiving additional speech from the known speaker, for processing the additional speech, for comparing the result to the codebook, and for generating and recording a threshold representing an acceptable deviation from the codebook;
  
  and the second mode of operation comprising;
  
  the verification stage for receiving speech, which is purported to be from the known speaker, for processing the speech, for comparing the result to the codebook and the threshold, and for determining whether the speaker is the known speaker or an imposter;
  
  wherein in the case that the speaker is verified as the known speaker, the verification stage records a deviation between the speech received by the verification stage and the codebook, calculates a new threshold from the recorded deviation and from other previously recorded deviations generated from the first and second modes of operation, and passes data regarding the deviation of the additional speech to the output device, with said output device indicating the deviation of the additional speech.
- Dependent claims (10, 11, 12, 13, 14, 15, 16)
  - 10. The apparatus of claim 9 in which the pattern matching block utilizes an LBG algorithm to perform template modeling with vector quantization.
  - 11. The apparatus of claim 9 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in the speaker verification system.
  - 12. The apparatus of claim 10 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in the speaker verification system.
  - 13. The apparatus of claim 9 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in a smart card.
  - 14. The apparatus of claim 10 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored in non-volatile memory in a smart card.
  - 15. The apparatus of claim 9 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored on a combination of both non-volatile memory in the speaker verification system and non-volatile memory in a smart card.
  - 16. The apparatus of claim 10 in which the codebook, speaker voice print values, threshold, and any recorded deviations calculated from the first mode of operation, the second mode of operation, or both modes of operation are stored on a combination of both non-volatile memory in the speaker verification system and non-volatile memory in a smart card.
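
Claims 11 through 16 place the codebook, speaker voice print values, threshold, and recorded deviations in non-volatile memory on the device, on a smart card, or both. The sketch below shows one way such a record could be packed into a compact byte string. The layout (four unsigned 16-bit counts followed by 32-bit floats), the treatment of the voice print values as a flat list of floats, and the function names are all assumptions made for illustration; the claims do not specify a storage format.

```python
import struct

# Hypothetical header: codebook rows, vector dimension,
# voice print value count, deviation count (little-endian uint16 each).
HEADER = "<HHHH"

def pack_record(codebook, voice_print, threshold, deviations):
    """Serialize one speaker's data as a header plus 32-bit floats."""
    rows, dim = len(codebook), len(codebook[0])
    flat = [x for row in codebook for x in row]
    values = flat + list(voice_print) + [threshold] + list(deviations)
    header = struct.pack(HEADER, rows, dim, len(voice_print), len(deviations))
    return header + struct.pack(f"<{len(values)}f", *values)

def unpack_record(blob):
    """Inverse of pack_record."""
    rows, dim, n_print, n_dev = struct.unpack_from(HEADER, blob, 0)
    count = rows * dim + n_print + 1 + n_dev
    values = list(struct.unpack_from(f"<{count}f", blob, struct.calcsize(HEADER)))
    cut = rows * dim
    codebook = [values[i:i + dim] for i in range(0, cut, dim)]
    voice_print = values[cut:cut + n_print]
    threshold = values[cut + n_print]
    deviations = values[cut + n_print + 1:]
    return codebook, voice_print, threshold, deviations
```

Under this assumed layout, a hypothetical codebook of 64 codewords with 12 coefficients each would take 64 × 12 × 4 = 3072 bytes, which gives a feel for why a vector-quantized template can fit in a small memory.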

17. A method of verifying that test speech is from a known speaker, comprising:
- acquiring text-dependent enrollment speech from a known speaker, processing and analyzing the enrollment speech with vector quantization means to develop a codebook and speaker voice print values for that speaker;
  
  acquiring text-independent threshold generation speech from the known speaker;
  
  processing and analyzing the threshold generation speech with vector quantization means and comparing it to the codebook for that speaker to develop an initial value of an adaptive threshold for that speaker;
  
  acquiring additional text-independent test speech that is purported to be from the known speaker;
  
  processing and analyzing the test speech and comparing it to the codebook and threshold for that speaker to determine if the test speech is from the known speaker, and if so, storing the deviation between the test speech and codebook in a statistical array and analyzing the array to calculate a new value for the adaptive threshold for the known speaker to be used in the next test purported to be from the known speaker; and
  
  outputting a deviation message indicating whether the purported speaker is deemed to be the known speaker.
- Dependent claims (18)
  - 18. The method of claim 17, further comprising an adaptive decision verdict, wherein the method multiplies the adaptive threshold by a user-defined multiplier prior to comparing the test speech to the codebook and multiplied adaptive threshold to determine if the test speech is from the known speaker.
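
The decision step in claims 17 and 18 can be sketched as follows: a test deviation is compared against the threshold scaled by a user-defined multiplier, and on acceptance the deviation is appended to the statistical array and the threshold is recalculated. The claims do not say how the array is analyzed, so the rule used here (mean plus `k` population standard deviations) is purely an assumption, as are the class name, method names, and the default `k`.

```python
import statistics

class AdaptiveThreshold:
    """Hypothetical holder for the statistical array and current threshold."""

    def __init__(self, initial_deviations, k=2.0):
        # Deviations recorded during threshold generation seed the array.
        self.deviations = list(initial_deviations)
        self.k = k
        self.value = self._recompute()

    def _recompute(self):
        # Assumed rule: mean plus k population standard deviations.
        mean = statistics.mean(self.deviations)
        spread = statistics.pstdev(self.deviations)
        return mean + self.k * spread

    def verify(self, deviation, multiplier=1.0):
        """Accept if the deviation is within threshold * multiplier."""
        accepted = deviation <= self.value * multiplier
        if accepted:
            # Only accepted attempts feed back into the threshold.
            self.deviations.append(deviation)
            self.value = self._recompute()
        return accepted
```

A multiplier below 1.0 tightens the decision and a multiplier above 1.0 loosens it, without altering the stored threshold itself. Rejected attempts leave both the array and the threshold unchanged in this sketch.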

19. An adaptive threshold for a speaker verification system, wherein after being initially established for a known speaker, the adaptive threshold is automatically updated by the speaker verification system after every positive determination that test speech from a purported speaker is from the known speaker or is periodically updated based upon an independently determined system security requirement.

20. An adaptive decision verdict for a speaker verification system, wherein a user of the speaker verification system defines a multiplier for a threshold for a known speaker, and wherein the speaker verification system multiplies the threshold by the multiplier prior to comparing test speech purported to be from the known speaker to a codebook of the known speaker and the multiplied adaptive threshold of the known speaker to determine if the test speech is from the known speaker.

Current Assignee: Saudi Arabian Oil Company (Government of Saudi Arabia)
Original Assignee: Saudi Arabian Oil Company (Government of Saudi Arabia)
Inventors: Al-Telmissani, Essam Abed
Granted Patent: US 8,209,174 B2
US Class Current: 704/246
CPC Class Codes:
G10L 17/04   Training, enrolment or mode...
G10L 17/06   Decision making techniques;...
G10L 25/24   the extracted parameters be...
