Dimensionality reduction of Baum-Welch statistics for speaker recognition

  • US 10,553,218 B2
  • Filed: 09/19/2017
  • Issued: 02/04/2020
  • Est. Priority Date: 09/19/2016
  • Status: Active Grant
First Claim

1. A speaker recognition apparatus comprising:

  • a computer configured to:

    extract audio features from a received recognition speech signal;

    generate first order Gaussian mixture model (GMM) statistics from the extracted audio features based on a universal background model that includes a plurality of speaker models;

    normalize the first order GMM statistics with regard to a duration of the received speech signal;

    train a deep neural network having a plurality of fully connected layers using a set of recognition speech signals; and

    execute the deep neural network having the plurality of fully connected layers to reduce a dimensionality of the normalized first order GMM statistics and output a voiceprint corresponding to the recognition speech signal, the fully connected layers of the deep neural network including:

    an input layer configured to receive the normalized first order GMM statistics;

    one or more sequentially arranged first hidden layers configured to receive coefficients from the input layer; and

    a last hidden layer arranged to receive coefficients from one hidden layer of the one or more first hidden layers, the last hidden layer having a dimension smaller than each of the one or more first hidden layers and configured to output the voiceprint corresponding to the recognition speech signal.
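
The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and several choices are assumptions made only for illustration: the GMM has diagonal covariances, "duration" is taken to be the number of frames, the layers use a tanh activation, and the layer sizes and the names `first_order_stats`, `normalize_by_duration`, and `extract_voiceprint` are hypothetical. Feature extraction and network training are omitted; random features and untrained random weights stand in for them.

```python
import numpy as np

def first_order_stats(features, weights, means, variances):
    """Zero-order (C,) and first-order (C, D) statistics of a
    diagonal-covariance GMM for features of shape (T, D)."""
    diff = features[:, None, :] - means[None, :, :]             # (T, C, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                       # (T, C)
    log_post -= log_post.max(axis=1, keepdims=True)              # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                      # per-frame posteriors
    return post.sum(axis=0), post.T @ features

def normalize_by_duration(first_order, num_frames):
    # Assumption: duration is measured as the frame count.
    return (first_order / num_frames).ravel()                    # (C * D,)

def extract_voiceprint(x, layers):
    # Fully connected layers; the activation of the last (smallest)
    # hidden layer is returned as the voiceprint.
    h = x
    for W, b in layers:
        h = np.tanh(h @ W + b)
    return h

# Toy data: T frames, D-dimensional features, C mixture components.
rng = np.random.default_rng(0)
T, D, C = 200, 3, 4
features = rng.normal(size=(T, D))
weights = np.full(C, 1.0 / C)
means = rng.normal(size=(C, D))
variances = np.ones((C, D))

zero, first = first_order_stats(features, weights, means, variances)
x = normalize_by_duration(first, T)

# Input of size C*D, two first hidden layers of 16 units, and a last
# hidden layer of 5 units, smaller than each first hidden layer.
sizes = [C * D, 16, 16, 5]
layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
voiceprint = extract_voiceprint(x, layers)
print(voiceprint.shape)
```

The input vector here has C × D = 12 coefficients and the voiceprint has 5, which mirrors the dimensionality reduction the claim attributes to the last hidden layer.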
