Low-footprint adaptation and personalization for a deep neural network
Abstract
A deep neural network (DNN) model for automatic speech recognition (ASR) is adapted and personalized. An utterance that includes speech features for one or more speakers may be received in ASR tasks such as voice search or short message dictation. A decomposition approach may then be applied to an original matrix in the DNN model. In response, the original matrix may be converted into multiple new matrices, each smaller than the original matrix. A square matrix may then be added to the new matrices, and speaker-specific parameters may be stored in the square matrix. The DNN model may then be adapted by updating the square matrix. This process may be applied to each of the original matrices in the DNN model. The adapted DNN model may include fewer parameters than the original DNN model.
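The steps in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not in the text: the "decomposition approach" is taken to be a truncated singular value decomposition, the layer is treated as purely linear, the adaptation update is a single squared-error gradient step, and all names (`decompose`, `layer_output`, `adapt_step`) and sizes (`m`, `n`, `k`) are hypothetical.

```python
import numpy as np

def decompose(W, k):
    """Approximate W (m x n) by U_k (m x k) @ N_k (k x n) using a truncated SVD.

    Assumption: the abstract does not name the decomposition; SVD is one option.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_k = U[:, :k]
    N_k = s[:k, None] * Vt[:k, :]  # fold the singular values into the right factor
    return U_k, N_k

def layer_output(x, U_k, S, N_k):
    """Linear part of the layer, with the square matrix S between the two factors."""
    return U_k @ (S @ (N_k @ x))

def adapt_step(S, U_k, N_k, x, target, lr):
    """One gradient step on S only, for the loss 0.5 * ||y - target||^2.

    U_k and N_k stay fixed, so S is the only speaker-specific quantity.
    """
    h = N_k @ x
    err = U_k @ (S @ h) - target
    grad_S = np.outer(U_k.T @ err, h)
    return S - lr * grad_S

rng = np.random.default_rng(0)
m, n, k = 64, 48, 8                 # illustrative sizes only
W = rng.standard_normal((m, n))     # stands in for one original weight matrix

U_k, N_k = decompose(W, k)
S = np.eye(k)                       # identity: inserting S does not change the output

original_params = m * n             # parameters in the original matrix
shared_params = m * k + k * n       # parameters in the two smaller matrices
speaker_params = k * k              # parameters stored per speaker
print(original_params, shared_params, speaker_params)
```

Initializing `S` to the identity keeps the decomposed model's output unchanged until adaptation starts, and only the `k * k` entries of `S` need to be stored per speaker.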
20 Claims
1. A method of adapting and personalizing a deep neural network (DNN) model for automatic speech recognition (ASR), comprising:
receiving, by a computing device, at least one utterance comprising a plurality of speech features for one or more speakers from one or more ASR tasks;
applying, by the computing device, a decomposition process to two or more matrices in the DNN model;
in response to applying the decomposition process, adapting the DNN model to include a decomposed matrix that is generated from decomposition processing of the two or more matrices; and
exposing the adapted DNN model as a service.
Dependent claims: 2, 3, 4, 5, 6
7. A system for adapting and personalizing a deep neural network (DNN) model for automatic speech recognition (ASR), comprising:
a memory for storing executable program code; and
a processor, functionally coupled to the memory, the processor being responsive to computer-executable instructions contained in the program code and operative to:
receive at least one utterance comprising a plurality of speech features for one or more speakers from one or more ASR tasks;
determine an adapted DNN model from the DNN model, the DNN model comprising a plurality of unadapted matrices and the adapted DNN model comprising a plurality of adapted matrices;
calculate a difference between the plurality of adapted matrices and the plurality of unadapted matrices to determine a plurality of delta matrices;
apply a decomposition process to the plurality of delta matrices;
convert the plurality of delta matrices into a subset of one or more small matrices; and
store the subset of the one or more small matrices within an adapted DNN model as a speaker dependent feature set.
Dependent claims: 8, 9, 10, 11, 12
13. A computer-readable storage medium storing computer executable instructions which, when executed by a computer, will cause the computer to perform a method of adapting and personalizing a deep neural network (DNN) model for automatic speech recognition (ASR), the method comprising:
receiving a plurality of utterances, each of the plurality of utterances comprising a plurality of speech features for a plurality of speakers from one or more ASR tasks;
applying a decomposition process to two or more matrices in the DNN model; and
in response to applying the decomposition process, adapting the DNN model to include a decomposed matrix that is generated from decomposition processing of the two or more matrices.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
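The delta-matrix steps recited in claim 7 (difference, decomposition, conversion to small matrices, storage) might look roughly like the following NumPy sketch for a single layer. Several things here are assumptions rather than claim language: the decomposition is taken to be a truncated SVD, the function names `compress_delta` and `restore` are hypothetical, and the synthetic "adapted" matrix is built with an exactly rank-`k` change so that the reconstruction is exact. Real adaptation deltas would only be approximately low rank.

```python
import numpy as np

def compress_delta(W_unadapted, W_adapted, k):
    """Return two small matrices A (m x k) and B (k x n) with A @ B ~= W_adapted - W_unadapted.

    Assumption: truncated SVD is used as the decomposition process.
    """
    delta = W_adapted - W_unadapted
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    A = U[:, :k] * s[:k]   # scale each kept left singular vector by its singular value
    B = Vt[:k, :]
    return A, B

def restore(W_unadapted, A, B):
    """Rebuild an approximation of the adapted matrix from the shared matrix and the stored factors."""
    return W_unadapted + A @ B

rng = np.random.default_rng(0)
m, n, k = 64, 48, 4                      # illustrative sizes only
W_unadapted = rng.standard_normal((m, n))

# Synthetic stand-in for adaptation: an exactly rank-k change to the weights.
W_adapted = W_unadapted + rng.standard_normal((m, k)) @ rng.standard_normal((k, n))

A, B = compress_delta(W_unadapted, W_adapted, k)

# Only A and B would be stored per speaker; W_unadapted is shared by all speakers.
stored_per_speaker = A.size + B.size
full_matrix_size = W_adapted.size
print(stored_per_speaker, full_matrix_size)
```

Under these assumptions, each speaker costs `k * (m + n)` stored values per layer instead of `m * n`, and `restore` recovers the adapted weights when they are needed.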
Specification