Method and system for on-line unsupervised adaptation in speaker verification

US 6,804,647 B1
Filed: 03/13/2001
Issued: 10/12/2004
Est. Priority Date: 03/13/2001
Status: Expired due to Term

Abstract

The present invention introduces a system and method for unsupervised, on-line, adaptation in speaker verification. In one embodiment, a method for adapting a speaker model to improve the verification of a speaker's voice, comprises detecting a channel of a verification utterance; learning vocal characteristics of the speaker on the detected channel; and transforming the learned vocal characteristics of the speaker from the detected channel to the speaker model of a second channel.
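
The first step named in the abstract, detecting the channel of an utterance, can be illustrated with a minimal sketch. The abstract does not say how detection is performed; the version below assumes one scalar feature per frame and one Gaussian model per channel, and picks the channel with the highest log-likelihood. The names `CHANNEL_MODELS`, `log_likelihood`, and `detect_channel`, and all numeric values, are hypothetical.

```python
import math

# Hypothetical per-channel models: (mean, variance) of one scalar feature.
# Real systems would use richer models; these numbers are illustrative only.
CHANNEL_MODELS = {
    "landline": (0.0, 1.0),
    "cellular": (2.0, 1.5),
}


def log_likelihood(features, mean, var):
    """Sum of Gaussian log-densities of the features under one channel model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in features
    )


def detect_channel(features, models=CHANNEL_MODELS):
    """Return the name of the channel whose model best explains the features."""
    return max(models, key=lambda name: log_likelihood(features, *models[name]))
```

Calling `detect_channel` on the frames of a verification utterance yields a channel label that the later steps (learning and transforming) can branch on.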

22 Claims

1. A method comprising:
- verifying the identity of a speaker during a speaker verification session based on a speaker model, including generating a confidence score representing a degree of confidence that the speaker is who the speaker claims to be;
  
  determining whether a communication channel used by the speaker during the speaker verification session matches a base channel that was previously used by the speaker to enroll the speaker for speaker verification;
  
  if the communication channel matches the base channel, then automatically updating the speaker model for use during subsequent speaker verification, based on vocal characteristics of the speaker on the communication channel; and
  
  if the communication channel does not match the base channel, then automatically updating the speaker model for use during subsequent speaker verification by transforming the speaker model between channels, based on vocal characteristics of the speaker on the communication channel;
  
  wherein said automatically updating the speaker model comprises updating the speaker model by a degree of aggressiveness that is based on the confidence score.
2. A method as recited in claim 1, further comprising:
3. A method as recited in claim 1, wherein said automatically updating the speaker model for use during subsequent speaker verification by using transformation between channels comprises:
- transforming the speaker model from the base channel to correspond to the communication channel;
  
  adapting the transformed speaker model based on the vocal characteristics of the speaker on the communication channel, and inverse transforming the adapted transformed speaker model to correspond to the base channel.
4. A method as recited in claim 1, further comprising detecting the gender of the speaker, wherein the speaker model is gender-specific.
5. A method as recited in claim 1, wherein the channel is one of a plurality of channels usable by the speaker for verification, each corresponding to a different type of communication device.
6. A method as recited in claim 1, wherein the size of the speaker model is not increased as a result of the speaker model being updated.
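
The flow recited in claims 1, 3, and 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the speaker model is reduced to a fixed-length list of feature means, the channel transformation is assumed to be a simple additive offset (`CHANNEL_OFFSET`), and the "degree of aggressiveness" is assumed to be a linear interpolation weight derived from the confidence score. All names, the threshold, and the maximum weight are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical additive offsets that map features onto each channel.
CHANNEL_OFFSET = {"landline": 0.0, "cellular": 1.5}


@dataclass
class SpeakerModel:
    means: list          # fixed-length list of feature means
    base_channel: str    # channel used at enrollment


def adaptation_weight(confidence, threshold=0.5, max_weight=0.3):
    """Map a confidence score in [0, 1] to an interpolation weight.

    At or below the threshold no adaptation happens; above it the weight
    grows linearly up to max_weight. Both values are assumptions.
    """
    if confidence <= threshold:
        return 0.0
    return max_weight * (confidence - threshold) / (1.0 - threshold)


def adapt(model, utterance_means, channel, confidence):
    """Return an updated model with the same number of parameters."""
    w = adaptation_weight(confidence)
    # Offset between the detected channel and the base channel (0 if they match).
    shift = CHANNEL_OFFSET[channel] - CHANNEL_OFFSET[model.base_channel]
    # 1. Transform the base-channel model to correspond to the detected channel.
    on_channel = [m + shift for m in model.means]
    # 2. Adapt toward the speaker's characteristics observed on that channel.
    adapted = [(1 - w) * m + w * u for m, u in zip(on_channel, utterance_means)]
    # 3. Inverse transform the adapted model back to the base channel.
    return SpeakerModel([m - shift for m in adapted], model.base_channel)
```

When the detected channel matches the base channel, `shift` is zero and the function reduces to a direct confidence-weighted update. Because the update interpolates existing means rather than appending new ones, the model size stays constant.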

7. A method of performing unsupervised adaptation of a speaker model for use in speaker verification, the method comprising:
- detecting a communication channel by which an utterance of a speaker is received for speaker verification;
  
  learning vocal characteristics of the speaker on the detected channel;
  
  verifying the identity of the speaker by using a speaker model associated with the speaker;
  
  determining whether the detected communication channel matches a base channel previously used to enroll the speaker for verification;
  
  if the detected communication channel matches the base channel, then updating the speaker model for subsequent use in speaker verification, based on the learned vocal characteristics of the speaker; and
  
  if the detected communication channel does not match the base channel, then updating the speaker model for subsequent use in unsupervised speaker verification, by transforming the speaker model to correspond to the detected channel, adapting the transformed speaker model based on the learned vocal characteristics of the speaker, and inverse transforming the adapted transformed speaker model to correspond to the base channel.
8. A method as recited in claim 7, wherein the first and second speaker verification sessions are unsupervised.
9. A method as recited in claim 7, further comprising learning the characteristics of the utterance on the detected channel prior to said updating the speaker model.
10. A method as recited in claim 7, further comprising verifying the identity of the speaker during the first speaker verification session based on the speaker model, including generating a confidence score representing a degree of confidence that the speaker is who the speaker claims to be;
11. A method as recited in claim 7, further comprising detecting the gender of the caller, wherein the speaker model is gender-specific.
12. A method as recited in claim 7, wherein the channel is one of a plurality of channels usable by the speaker for verification, each corresponding to a different type of communication device.
13. A method as recited in claim 7, wherein said updating the speaker model comprises:
- updating a first speaker model based on vocal characteristics associated with a second speaker model; and
  
  discarding the second speaker model.
14. A method as recited in claim 7, wherein the size of the speaker model is not increased as a result of the speaker model being updated.
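
Claims 13 and 14 describe updating one speaker model from a second model, discarding the second, and keeping the model size fixed. A minimal sketch of one way to do this is count-weighted pooling of means. The claims do not specify the merge rule; the dictionary layout (`mean`, `count`) and the function name `merge_and_discard` are assumptions made for illustration.

```python
def merge_and_discard(primary, temporary):
    """Fold a temporary model into the primary one and return the result.

    Each model is a dict with 'mean' (list of floats) and 'count' (number of
    frames the means were estimated from). This layout is an assumption.
    The caller is expected to drop the temporary model afterwards.
    """
    total = primary["count"] + temporary["count"]
    merged = [
        (primary["count"] * p + temporary["count"] * t) / total
        for p, t in zip(primary["mean"], temporary["mean"])
    ]
    # The merged model has the same number of means as the primary model.
    return {"mean": merged, "count": total}
```

For example, a temporary model learned from the current utterance can be merged into the enrolled model and then deleted, so storage per speaker does not grow with the number of sessions.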

15. A processing system comprising:
- a processor; and
  
  a storage facility coupled to the processor and storing instructions which, when executed by the processor, cause the processing system to perform a process including:
  
  verifying the identity of a speaker during a speaker verification session based on a speaker model, including generating a confidence score representing a degree of confidence that the speaker is who the speaker claims to be;
  
  determining whether a communication channel used by the speaker during the speaker verification session matches a base channel that was previously used by the speaker to enroll the speaker for speaker verification;
  
  if the communication channel matches the base channel, then automatically updating the speaker model for use during subsequent speaker verification, based on vocal characteristics of the speaker on the communication channel; and
  
  if the communication channel does not match the base channel, then automatically updating the speaker model for use during subsequent speaker verification by using transformation between channels, based on vocal characteristics of the speaker on the communication channel;
  
  wherein said automatically updating the speaker model comprises updating the speaker model by a degree of aggressiveness that is based on the confidence score.
16. A processing system as recited in claim 15, wherein said process further comprises:
17. A processing system as recited in claim 15, wherein said automatically updating the speaker model for use during subsequent speaker verification by using transformation between channels comprises:
- transforming the speaker model from the base channel to correspond to the communication channel;
  
  adapting the transformed speaker model based on the vocal characteristics of the speaker on the communication channel, and inverse transforming the adapted transformed speaker model to correspond to the base channel.
18. A processing system as recited in claim 15, further comprising detecting the gender of the speaker, wherein the speaker model is gender-specific.
19. A processing system as recited in claim 15, wherein the channel is one of a plurality of channels usable by the speaker for verification, each corresponding to a different type of communication device.
20. A processing system as recited in claim 15, wherein the size of the speaker model is not increased as a result of the speaker model being updated.

21. A speaker verification system comprising:
- an automatic speech recognizer to recognize speech of a speaker received on a detected channel during a first unsupervised speaker verification session;
  
  an automatic speaker verifier to verify the identity of the speaker during the first unsupervised speaker verification session by using a speaker model associated with the speaker, wherein the speaker model corresponds to a base channel previously used by the speaker to enroll the speaker for speaker verification, and the detected channel is a channel other than the base channel; and
  
  an automatic adapter to update the speaker model for use during a subsequent unsupervised speaker verification session, based on vocal characteristics of the speaker on the detected channel, by automatically:
  
  transforming the speaker model from the base channel to correspond to the detected channel;
  
  updating the speaker model based on characteristics of the utterance on the detected channel; and
  
  inverse transforming the speaker model to correspond to the detected channel.

22. An apparatus comprising:
- means for verifying the identity of a speaker during a speaker verification session based on a speaker model, including generating a confidence score representing a degree of confidence that the speaker is who the speaker claims to be;
  
  means for determining whether a communication channel used by the speaker during the speaker verification session matches a base channel that was previously used by the speaker to enroll the speaker for speaker verification;
  
  means for automatically updating the speaker model for use during subsequent speaker verification by a degree of aggressiveness that is based on the confidence score, based on vocal characteristics of the speaker on the communication channel, if the communication channel matches the base channel; and
  
  means for automatically updating the speaker model for use during subsequent speaker verification by a degree of aggressiveness that is based on the confidence score by using transformation between channels, based on vocal characteristics of the speaker on the communication channel, if the communication channel does not match the base channel.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Heck, Larry Paul; Mirghafori, N. Nikki
Primary Examiner(s): Chawan, Vijay
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US09/808,074
Time in Patent Office: 1,309 Days
Field of Search: 704/246, 704/270.1, 704/273
US Class Current: 704/246
CPC Class Codes: G10L 17/04 Training, enrolment or mode...
