
Full-sequence training of deep structures for speech recognition

  • US 9,031,844 B2
  • Filed: 09/21/2010
  • Issued: 05/12/2015
  • Est. Priority Date: 09/21/2010
  • Status: Active Grant
First Claim

1. A method comprising the following computer-executable acts:

  • accessing a deep belief network (DBN) retained in computer-readable data storage, wherein the DBN comprises:

    a plurality of stacked hidden layers, each hidden layer comprising a respective plurality of stochastic units, each stochastic unit in each layer connected to stochastic units in an adjacent hidden layer of the DBN by way of connections, the connections assigned weights learned during a pretraining procedure; and

    a linear-chain conditional random field (CRF), the CRF comprising:

    a hidden layer that comprises a plurality of stochastic units; and

    a plurality of output units that are representative of output states, each state in the output states being one of a phone or a senone, the plurality of stochastic units connected to the plurality of output units by way of second connections, the second connections having weights learned during the pretraining procedure, the output units having transition probabilities corresponding thereto that are indicative of probabilities of transitioning between output states represented by the output units; and

  • jointly optimizing the weights assigned to the connections, the weights assigned to the second connections, the transition probabilities, and language model scores of the DBN based upon training data, wherein a processor performs the joint optimization of the weights.
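
The claim above describes a deep architecture in which stacked hidden layers feed a linear-chain CRF over phone/senone output states, with the hidden-layer connection weights, the output-layer ("second connection") weights, and the state-transition parameters optimized jointly over full sequences. The sketch below is a minimal illustration of that idea, assuming PyTorch; the class name DBNCRF, the layer sizes, the mean-field sigmoid forward pass, and the SGD loop are illustrative assumptions rather than the patented implementation, and the language-model-score term recited in the claim is omitted.

```python
# Minimal DBN + linear-chain CRF sketch (assumes PyTorch; names and sizes are
# illustrative only -- not the implementation described in the patent).
import torch
import torch.nn as nn

class DBNCRF(nn.Module):
    """Stacked sigmoid hidden layers feeding a linear-chain CRF over states."""

    def __init__(self, feat_dim, hidden_dims, num_states):
        super().__init__()
        dims = [feat_dim] + list(hidden_dims)
        # Connections between adjacent hidden layers; in the claimed setup these
        # weights come from a generative pretraining pass (e.g., stacked RBMs).
        self.hidden = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)]
        )
        # "Second connections": top hidden layer -> output units (phones/senones).
        self.emit = nn.Linear(hidden_dims[-1], num_states)
        # Transition parameters between output states (log-domain scores).
        self.trans = nn.Parameter(torch.zeros(num_states, num_states))

    def frame_scores(self, x):                    # x: (T, feat_dim)
        h = x
        for layer in self.hidden:
            h = torch.sigmoid(layer(h))           # mean-field pass through stochastic units
        return self.emit(h)                       # (T, num_states) emission scores

    def neg_log_likelihood(self, x, labels):      # labels: (T,) state indices
        scores = self.frame_scores(x)             # (T, S)
        T, S = scores.shape
        # Score of the reference state sequence (emissions + transitions).
        gold = scores[torch.arange(T), labels].sum()
        gold = gold + self.trans[labels[:-1], labels[1:]].sum()
        # Log partition function via the forward algorithm over the chain.
        alpha = scores[0]                         # (S,)
        for t in range(1, T):
            alpha = scores[t] + torch.logsumexp(alpha.unsqueeze(1) + self.trans, dim=0)
        log_z = torch.logsumexp(alpha, dim=0)
        return log_z - gold                       # full-sequence CRF loss

# Joint optimization of hidden-layer weights, emission weights, and transitions.
model = DBNCRF(feat_dim=39, hidden_dims=[256, 256], num_states=42)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(100, 39)                          # 100 frames of toy acoustic features
y = torch.randint(0, 42, (100,))                  # toy phone/senone labels
opt.zero_grad()
loss = model.neg_log_likelihood(x, y)
loss.backward()
opt.step()
```

The forward-algorithm recursion yields the log partition function, so the loss is the CRF's full-sequence conditional log-likelihood; backpropagation then updates the hidden-layer connections, the emission weights, and the transition matrix together, which corresponds to the joint optimization recited in the claim.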
