Method and system for dual scoring for text-dependent speaker verification
Abstract
Embodiments of systems and methods for speaker verification are provided. In various embodiments, a method includes receiving an utterance from a speaker and determining a text-independent speaker verification score and a text-dependent speaker verification score in response to the utterance. Various embodiments include a system for speaker verification, the system comprising an audio receiving device for receiving an utterance from a speaker and converting the utterance to an utterance signal, and a processor coupled to the audio receiving device for determining speaker verification in response to the utterance signal, wherein the processor determines speaker verification in response to a UBM-independent speaker-normalized score.
16 Claims
1. A speaker verification method comprising:
receiving an utterance from a speaker by an audio receiving device;
determining a text-independent speaker verification score in response to the utterance using a processor coupled to the audio receiving device to determine the text-independent speaker verification score in response to a speaker-dependent text-independent Gaussian Mixture Model (GMM) of the utterance;
determining a text-dependent speaker verification score in response to the utterance using the processor to determine the text-dependent speaker verification score in response to a continuous density Hidden Markov Model (HMM) of the utterance aligned by a Viterbi decoding;
determining a Universal Background Model (UBM)-independent speaker-dependent normalized score in response to a relationship between the text-dependent speaker verification score and the text-independent speaker verification score using the processor, the relationship being based on a difference between the text-dependent speaker verification score and the text-independent speaker verification score; and
determining speaker verification in response to the UBM-independent speaker-normalized score.
Dependent claims: 2, 3, 4, 5, 6, 7
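As an illustration of the text-independent scoring step in claim 1, the following is a minimal sketch of scoring an utterance against a GMM. It assumes diagonal covariances and a score defined as the average per-frame log-likelihood; neither choice is stated in the claim, and the function and parameter names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of the frames under a diagonal GMM.

    Assumption: the text-independent score is this average; the claim
    only says the score is determined in response to the GMM.
    """
    total = 0.0
    for x in frames:
        # Per-component log(weight * density), combined with log-sum-exp
        # so that very small densities do not underflow.
        comps = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)
```

In practice the frames would be acoustic feature vectors extracted from the utterance signal; the feature type is not specified in the claim.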
8. A Universal Background Model (UBM) independent speaker verification method comprising:
receiving an utterance from a speaker by an audio receiving device;
determining a text-independent speaker verification score in response to the utterance using a processor coupled to the audio receiving device;
determining a text-dependent speaker verification score in response to the utterance using the processor;
determining a UBM-independent speaker-normalized score in response to a difference between the text-independent speaker verification score and the text-dependent speaker verification score using the processor; and
determining speaker verification in response to the UBM-independent speaker-normalized score.
Dependent claims: 9, 10, 11
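The difference-based normalization recited in claim 8 can be sketched as follows. The sign convention (text-dependent minus text-independent), the single decision threshold, and its default value are assumptions made for illustration; the claim states only that the normalized score responds to a difference between the two scores.

```python
def normalized_score(td_score, ti_score):
    """Difference of the two scores, with no background model involved.

    Assumption: the difference is taken as text-dependent minus
    text-independent; the claim does not fix the sign.
    """
    return td_score - ti_score


def verify(td_score, ti_score, threshold=0.0):
    """Accept the claimed identity when the normalized score meets the threshold.

    The threshold value is hypothetical and would be tuned on held-out data.
    """
    return normalized_score(td_score, ti_score) >= threshold
```

Because both scores come from models of the same speaker, the subtraction plays the role that a background-model score plays in conventional likelihood-ratio normalization.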
12. A dual-scoring text-dependent speaker verification method comprising:
receiving a plurality of test utterances by an audio receiving device;
determining a text-independent speaker verification score in response to each of the plurality of utterances using a processor coupled to the audio receiving device;
determining a text-dependent speaker verification score in response to each of the plurality of utterances using the processor;
determining a Universal Background Model (UBM)-independent speaker-normalized score in response to a relationship between the text-dependent speaker verification score and the text-independent speaker verification score using the processor, the relationship being based on a difference between the text-dependent speaker verification score and the text-independent speaker verification score;
mapping the UBM-independent speaker-normalized score and the text-dependent speaker verification score for each of the plurality of utterances into a two-dimensional score space in response to a score accept threshold and a score reject threshold;
splitting the two-dimensional score space into three clusters, the three clusters corresponding to accept scores, indecisive scores and reject scores; and
defining a binary decision tree for speaker verification confidence score generation by identifying a logistic function at each node of the binary decision tree.
Dependent claims: 13, 14
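One possible reading of the score-space steps in claim 12 is sketched below. Several details are assumptions rather than claim language: each threshold is given as a hypothetical pair with one value per axis, a point is accepted or rejected only when both coordinates clear the corresponding threshold, and the binary tree has two nodes whose logistic outputs are multiplied along each branch to produce a confidence for each cluster.

```python
import math


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cluster(point, accept_thr, reject_thr):
    """Assign a (normalized score, text-dependent score) point to a cluster.

    Assumed rule: both coordinates at or above the accept thresholds give
    "accept", both at or below the reject thresholds give "reject", and
    everything else is "indecisive".
    """
    norm, td = point
    if norm >= accept_thr[0] and td >= accept_thr[1]:
        return "accept"
    if norm <= reject_thr[0] and td <= reject_thr[1]:
        return "reject"
    return "indecisive"


def node_probability(point, weights, bias):
    """Logistic function of a linear combination of the two scores."""
    return logistic(weights[0] * point[0] + weights[1] * point[1] + bias)


def tree_confidence(point, root, child):
    """Confidence per cluster from a two-node binary tree (assumed layout).

    root splits "reject" from the rest; child splits "accept" from
    "indecisive". Each node is a hypothetical ((w_norm, w_td), bias) pair.
    """
    p_keep = node_probability(point, *root)
    p_accept = node_probability(point, *child)
    return {
        "reject": 1.0 - p_keep,
        "indecisive": p_keep * (1.0 - p_accept),
        "accept": p_keep * p_accept,
    }
```

The node weights would normally be fitted to the clustered scores of the test utterances; the fitting procedure is left out of this sketch.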
15. A system for speaker verification comprising:
an audio receiving device for receiving an utterance from a speaker and converting the utterance to an utterance signal; and
a processor coupled to the audio receiving device for determining speaker verification in response to the utterance signal, wherein the processor determines speaker verification in response to a Universal Background Model (UBM)-independent speaker-normalized score by
determining a text-independent speaker verification score in response to the utterance signal, the text-independent speaker verification score determined in response to a speaker-dependent text-independent Gaussian Mixture Model (GMM) of the utterance;
determining a text-dependent speaker verification score in response to the utterance signal, the text-dependent speaker verification score determined in response to a continuous density Hidden Markov Model (HMM) of the utterance signal aligned by a Viterbi decoding; and
determining the UBM-independent speaker-normalized score in response to a relationship between the text-dependent speaker verification score and the text-independent speaker verification score, the relationship being based on a difference between the text-independent speaker verification score and the text-dependent speaker verification score.
Dependent claims: 16
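The Viterbi alignment named in claims 1 and 15 can be illustrated with a generic log-domain Viterbi pass. This is a textbook sketch, not the patented implementation: it assumes the per-frame log emission densities of the HMM states have already been computed, and it takes the best-path log-likelihood as the text-dependent score. All names are hypothetical.

```python
def viterbi_log_score(log_emit, log_trans, log_init):
    """Best-path log-likelihood and state alignment for one utterance.

    log_emit[t][s]  : log emission density of frame t in state s
    log_trans[i][j] : log transition probability from state i to state j
    log_init[s]     : log initial probability of state s
    Impossible events are represented by float("-inf").
    """
    n_states = len(log_init)
    delta = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][j] = best predecessor of state j at frame t
    for t in range(1, len(log_emit)):
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            new_delta.append(delta[best_i] + log_trans[best_i][j] + log_emit[t][j])
            ptr.append(best_i)
        delta = new_delta
        back.append(ptr)
    # Trace the best final state back to the first frame.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path
```

The returned path is the frame-to-state alignment; a text-dependent system would typically constrain the transitions so that states follow the order of the expected passphrase.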
Specification