Estimating speaker-specific affine transforms for neural network based speech recognition systems

  • US 9,378,735 B1
  • Filed: 12/19/2013
  • Issued: 06/28/2016
  • Est. Priority Date: 12/19/2013
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • under control of one or more computing devices configured with specific computer-executable instructions,

    obtaining a Gaussian mixture model-based (“GMM-based”) acoustic model;

    obtaining a neural network-based (“NN-based”) acoustic model;

    receiving an audio signal comprising speech;

    computing a first sequence of feature vectors from the audio signal;

    computing a GMM-based transform using the GMM-based acoustic model and the first sequence of feature vectors, wherein the GMM-based transform comprises a first linear portion and a first bias portion;

    computing a second linear portion of a NN-based transform by minimizing a first least squares difference function, wherein the first least squares difference function comprises a difference between the second linear portion and the first linear portion;

    computing a second bias portion of the NN-based transform by minimizing a second least squares difference function, wherein the second least squares difference function comprises a difference between the second bias portion and the first bias portion;

    computing a second sequence of feature vectors from the audio signal;

    computing a third sequence of feature vectors by applying the second linear portion and the second bias portion of the NN-based transform to the second sequence of feature vectors;

    performing speech recognition using the third sequence of feature vectors and the NN-based acoustic model to generate speech processing results; and

    determining, using the speech processing results, an action to perform.
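
The two least squares steps and the application of the resulting affine transform can be illustrated with a minimal sketch. This is not the patented implementation. It assumes a hypothetical fixed linear map `M` from the NN feature space to the GMM feature space, so that "the difference between the second linear portion and the first linear portion" can be written as `M @ A_n - A_g @ M`, and likewise `M @ b_n - b_g` for the bias. The names `fit_nn_transform`, `apply_affine`, `M`, `A_g`, `b_g`, `A_n`, and `b_n` are illustrative, and the GMM-based transform and feature values are random placeholders rather than quantities estimated from audio.

```python
import numpy as np

def fit_nn_transform(A_g, b_g, M):
    """Fit an NN-space affine transform (A_n, b_n) to a GMM-space one (A_g, b_g).

    M is an assumed (hypothetical) linear map from NN features to GMM features.
    """
    # Linear portion: minimize ||M @ A_n - A_g @ M||_F^2 over A_n.
    A_n, *_ = np.linalg.lstsq(M, A_g @ M, rcond=None)
    # Bias portion: minimize ||M @ b_n - b_g||^2 over b_n.
    b_n, *_ = np.linalg.lstsq(M, b_g, rcond=None)
    return A_n, b_n

def apply_affine(A, b, feats):
    """Apply y_t = A @ x_t + b to every row x_t of feats (shape T x d)."""
    return feats @ A.T + b

# Placeholder data standing in for quantities the claim derives from audio.
rng = np.random.default_rng(0)
d_nn, d_gmm, T = 4, 3, 5
M = rng.standard_normal((d_gmm, d_nn))                            # assumed feature-space map
A_g = np.eye(d_gmm) + 0.1 * rng.standard_normal((d_gmm, d_gmm))   # "first linear portion"
b_g = rng.standard_normal(d_gmm)                                  # "first bias portion"

A_n, b_n = fit_nn_transform(A_g, b_g, M)         # "second" linear and bias portions
second_feats = rng.standard_normal((T, d_nn))    # "second sequence of feature vectors"
third_feats = apply_affine(A_n, b_n, second_feats)  # "third sequence of feature vectors"
```

Under these assumptions, `third_feats` would then be passed to the NN-based acoustic model. Solving the linear and bias portions as two separate least squares problems mirrors the claim's two separate difference functions.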
