Method of and device for phone-based speaker recognition

  • US 6,618,702 B1
  • Filed: 06/14/2002
  • Issued: 09/09/2003
  • Est. Priority Date: 06/14/2002
  • Status: Expired due to Fees
First Claim

1. A device for phone-based speaker recognition, comprising:

  • at least one phone recognizer for converting input digitized voice signals into a time ordered stream of phones based on at least one linguistic characteristic, with each of said phone recognizers having a voice input, to receive said input digitized voice signals, and an output for transmitting said time ordered stream of phones;

    for each of said phone recognizers, a corresponding tokenizer, having an input for receiving said time ordered stream of phones, with each of said tokenizers creating a set containing phone n-grams and the number of times each of said phone n-grams occurred in said time ordered stream of phones, and having an output for transmitting said set containing phone n-grams and the number of times each of said phone n-grams occurred;

    for each of said tokenizers, a corresponding recognition scorer further comprising:

    (a) at least one speaker model scorer, each of said speaker model scorers receives the corresponding set containing phone n-grams and the number of times each of said phone n-grams occurred in said time ordered stream of phones and computes a speaker log-likelihood score for each of said phone n-grams in said set containing phone n-grams and the number of times each of said phone n-grams occurred in said time ordered stream of phones using a corresponding speaker model which contains the number of occurrences of each of said phone n-grams that occurred in a speaker training speech set collected from a particular speaker;

    (b) a background model scorer for computing a background log-likelihood score for each of said phone n-grams in said set containing phone n-grams and the number of times each of said phone n-grams occurred in said time ordered stream of phones using a corresponding background model which contains the number of occurrences of each of said phone n-grams that occurred in a background training speech set collected from many speakers, excluding all of said particular speakers; and

    (c) for each of said speaker model scorers, a ratio scorer that produces a speaker log-likelihood ratio from said speaker log-likelihood score and said background log-likelihood score;

    for each of said recognition scorers, a corresponding fusion scorer which combines all of said corresponding speaker log-likelihood ratios from said corresponding ratio scorers to produce a single speaker score; and

    a speaker selector which evaluates all of said single speaker scores to determine a speaker identity for the speaker of said input digitized voice signals.
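
The pipeline recited in the claim (tokenizer, speaker and background model scorers, ratio scorer, fusion scorer, speaker selector) can be illustrated with a minimal Python sketch for a single phone recognizer. This is not the patented implementation. The function names, the add-alpha smoothing, the `vocab_size` parameter, and the count-weighted average used as the fusion step are all assumptions made for illustration; the claim itself does not specify them.

```python
import math
from collections import Counter


def tokenize(phones, n=2):
    """Tokenizer: count each phone n-gram in a time-ordered phone stream."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def log_likelihood(ngram, model_counts, vocab_size, alpha=1.0):
    """Log-likelihood of one n-gram under a count-based model.

    Add-alpha smoothing is an assumption of this sketch; it keeps
    unseen n-grams from producing log(0).
    """
    total = sum(model_counts.values())
    return math.log((model_counts.get(ngram, 0) + alpha) / (total + alpha * vocab_size))


def speaker_score(test_counts, speaker_counts, background_counts, vocab_size):
    """Ratio scorer plus fusion scorer for one speaker.

    For each n-gram, the log-likelihood ratio is the speaker score minus
    the background score. The per-n-gram ratios are then fused into a
    single score by a count-weighted average (an assumed fusion rule).
    """
    total = sum(test_counts.values())
    if total == 0:
        return 0.0
    fused = 0.0
    for ngram, count in test_counts.items():
        ratio = (log_likelihood(ngram, speaker_counts, vocab_size)
                 - log_likelihood(ngram, background_counts, vocab_size))
        fused += count * ratio
    return fused / total


def select_speaker(test_counts, speaker_models, background_counts, vocab_size):
    """Speaker selector: return the highest-scoring speaker and all scores."""
    scores = {
        name: speaker_score(test_counts, counts, background_counts, vocab_size)
        for name, counts in speaker_models.items()
    }
    return max(scores, key=scores.get), scores
```

In use, each speaker model and the background model would be built by calling `tokenize` on the corresponding training phone streams, and `select_speaker` would be called with the n-gram counts of the test utterance. A system with several phone recognizers, as the claim allows, would repeat this per recognizer and combine the resulting scores before selection.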
