Restructuring deep neural network acoustic models
First Claim
1. A method comprising:
accessing a Deep Neural Network (DNN) model that includes a weight matrix and layers comprising:
an input layer;
a first hidden layer;
a second hidden layer, wherein the first and second hidden layers are coupled by the weight matrix comprising a plurality of values; and
an output layer;
determining whether the weight matrix is a weight matrix having at least as many parameters as a weight matrix immediately preceding the output layer;
upon determining that the weight matrix has at least as many parameters as a weight matrix immediately preceding the output layer, reducing a sparseness of the weight matrix in the DNN model, wherein reducing the sparseness comprises executing decomposition processing of the weight matrix to generate two smaller matrices from the weight matrix, wherein the decomposition processing comprises applying Singular Value Decomposition (SVD) to the weight matrix;
restructuring the DNN model based on the executed decomposition processing, wherein the restructuring further comprises modifying the plurality of values coupling the first and second hidden layers of the DNN model by replacing the weight matrix with the two smaller matrices;
providing the restructured DNN model;
receiving an utterance; and
processing the received utterance using the restructured DNN model.
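The decomposition step recited in the claim can be illustrated with a minimal sketch. This is not the patent's implementation: the function names, the rank k, the row-vector layout (activations multiply each matrix from the left), and the toy layer sizes are all assumptions made for illustration. An m x n matrix holds m*n parameters, while its two rank-k factors hold k*(m+n), so the replacement shrinks the model only when k < m*n/(m+n).

```python
import numpy as np

def svd_factorize(w, k):
    """Approximate an m x n matrix by an (m x k) times (k x n) product."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :k] * s[:k]   # m x k, top-k singular values folded into the columns
    b = vt[:k, :]          # k x n
    return a, b

def restructure(weights, k):
    """Hypothetical helper: weights[-1] is the matrix feeding the output layer."""
    threshold = weights[-1].size  # parameter count of the last weight matrix
    restructured = []
    for w in weights:
        m, n = w.shape
        # Decompose only matrices at least as large as the last one,
        # and only when the two factors are actually smaller.
        if w.size >= threshold and k * (m + n) < m * n:
            restructured.append(svd_factorize(w, k))
        else:
            restructured.append((w,))
    return restructured

rng = np.random.default_rng(0)
# Toy hidden-to-hidden matrix built to have rank 8, so truncating at k=8 loses nothing.
w_hidden = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))
weights = [
    rng.standard_normal((20, 64)),   # input -> first hidden layer
    w_hidden,                        # first hidden -> second hidden layer
    rng.standard_normal((64, 32)),   # second hidden -> output layer
]
model = restructure(weights, k=8)
hidden_params_before = w_hidden.size
hidden_params_after = sum(f.size for f in model[1])
```

In this toy setup the hidden-to-hidden matrix is replaced by a 64 x 8 and an 8 x 64 factor. For a trained matrix that is not exactly low rank, the truncation is an approximation, and k trades model size against accuracy.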
Abstract
A Deep Neural Network (DNN) model used in an Automatic Speech Recognition (ASR) system is restructured. The restructured DNN model may include fewer parameters than the original DNN model. Singular value decomposition (SVD) can be applied to one or more weight matrices of the DNN model to reduce the size of the DNN model. The output layer of the DNN model may also be restructured to include monophone states in addition to the senones (tied triphone states) of the original DNN model. When monophone states are included in the restructured DNN model, their posteriors are used to select a small subset of senones to be evaluated.
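The senone selection described in the abstract might look roughly like the following sketch. The abstract does not give the selection rule, so keeping the top-N monophone states is an assumption, as are the function names, the senone-to-monophone mapping array, and the toy dimensions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def select_senones(h, w_mono, senone_to_mono, top_n):
    """Return indices of senones whose monophone state is among the top_n."""
    mono_post = softmax(h @ w_mono)            # posteriors over monophone states
    top_states = np.argsort(mono_post)[-top_n:]
    return np.flatnonzero(np.isin(senone_to_mono, top_states))

def score_selected(h, w_senone, selected):
    """Evaluate only the selected columns of the senone output matrix."""
    return h @ w_senone[:, selected]

rng = np.random.default_rng(1)
h = rng.standard_normal(16)                    # last hidden-layer activation
w_mono = rng.standard_normal((16, 5))          # 5 monophone states (toy size)
w_senone = rng.standard_normal((16, 20))       # 20 senones (toy size)
senone_to_mono = np.arange(20) % 5             # assumed mapping: 4 senones per state

selected = select_senones(h, w_mono, senone_to_mono, top_n=2)
scores = score_selected(h, w_senone, selected)
```

Here only 8 of the 20 senone activations are computed. The saving comes from skipping the columns of the output matrix that belong to unlikely monophone states.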
20 Claims
1. A method comprising:
accessing a Deep Neural Network (DNN) model that includes a weight matrix and layers comprising:
an input layer;
a first hidden layer;
a second hidden layer, wherein the first and second hidden layers are coupled by the weight matrix comprising a plurality of values; and
an output layer;
determining whether the weight matrix is a weight matrix having at least as many parameters as a weight matrix immediately preceding the output layer;
upon determining that the weight matrix has at least as many parameters as a weight matrix immediately preceding the output layer, reducing a sparseness of the weight matrix in the DNN model, wherein reducing the sparseness comprises executing decomposition processing of the weight matrix to generate two smaller matrices from the weight matrix, wherein the decomposition processing comprises applying Singular Value Decomposition (SVD) to the weight matrix;
restructuring the DNN model based on the executed decomposition processing, wherein the restructuring further comprises modifying the plurality of values coupling the first and second hidden layers of the DNN model by replacing the weight matrix with the two smaller matrices;
providing the restructured DNN model;
receiving an utterance; and
processing the received utterance using the restructured DNN model.
Dependent claims: 2, 3, 4, 5, 6, 17
7. A computer storage device storing computer-executable instructions that, when executed by at least one processor, perform a method comprising:
creating a restructured Deep Neural Network (DNN) model from an original DNN model, wherein the creating further comprises:
accessing the original DNN model, the original DNN model including a weight matrix and layers comprising:
an input layer;
a first hidden layer;
a second hidden layer, wherein the first and second hidden layers are coupled by the weight matrix comprising a plurality of values; and
an output layer;
determining whether the weight matrix is a weight matrix having at least as many parameters as a weight matrix immediately preceding the output layer;
upon determining that the weight matrix has at least as many parameters as a weight matrix immediately preceding the output layer, executing decomposition processing of the weight matrix of the original DNN model to generate two smaller matrices from the weight matrix, wherein the decomposition processing comprises applying Singular Value Decomposition (SVD) to the weight matrix; and
restructuring the original DNN model based on the executed decomposition processing, wherein the restructuring further comprises modifying the plurality of values coupling the first and second hidden layers of the DNN model by replacing the weight matrix with the two smaller matrices;
receiving an utterance; and
using the restructured DNN model to recognize the received utterance.
Dependent claims: 8, 9, 10, 11, 12, 19
13. A system comprising:
a processor and memory;
an operating environment executing using the processor; and
a model manager that is configured to perform actions comprising:
accessing a Deep Neural Network (DNN) model that includes a weight matrix and layers comprising:
an input layer;
a first hidden layer;
a second hidden layer, wherein the first and second hidden layers are coupled by the weight matrix comprising a plurality of values; and
an output layer;
determining whether the weight matrix is a weight matrix having at least as many parameters as a weight matrix immediately preceding the output layer;
upon determining that the weight matrix has at least as many parameters as a weight matrix immediately preceding the output layer, reducing a sparseness of the weight matrix in the DNN model, wherein reducing the sparseness comprises executing decomposition processing of the weight matrix to generate two smaller matrices from the weight matrix, wherein the decomposition processing comprises applying Singular Value Decomposition (SVD) to the weight matrix;
restructuring the DNN model based on the executed decomposition processing, wherein the restructuring further comprises modifying the plurality of values coupling the first and second hidden layers of the DNN model by replacing the weight matrix with the two smaller matrices;
providing the restructured DNN model;
receiving an utterance; and
processing the received utterance using the restructured DNN model.
Dependent claims: 14, 15, 16, 18, 20
Specification