MODEL LEARNING DEVICE, METHOD THEREFOR, AND PROGRAM
First Claim
1. A model learning device, comprising:
an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model;
a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using features obtained from learning data and the first model;
a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using features obtained from the learning data and the second model; and
a modified model update part that calculates a second loss function from correct information corresponding to the learning data and from the second output probability distribution, calculates a cross entropy between the first output probability distribution and the second output probability distribution, obtains a weighted sum of the second loss function and the cross entropy, and updates the parameter of the second model so as to reduce the weighted sum.
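The weighted sum recited in the last element can be written compactly. The following is one possible reading, not the patent's own formula: the symbols $p_1$, $p_2$, $L_2$, $C$, and $\alpha$ are notation introduced here, and the complementary weights $(1-\alpha)$ and $\alpha$ are an assumption, since the claim only requires some weighted sum.

```latex
% p_1(j): output probability of unit j under the first model (kept fixed)
% p_2(j): output probability of unit j under the second model (being updated)
% L_2:    second loss function, computed from the correct information and p_2
% C:      cross entropy between the first and second output distributions
\begin{align}
  C &= -\sum_{j} p_1(j)\,\log p_2(j), \\
  L &= (1-\alpha)\,L_2 + \alpha\,C, \qquad 0 \le \alpha \le 1 .
\end{align}
```

Under this reading, the parameter of the second model is updated so as to reduce $L$; a larger $\alpha$ keeps the second model's outputs closer to those of the first model.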
Abstract
A model learning device comprises: an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model; a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using learning features and the second model; and a modified model update part that obtains a weighted sum of a second loss function calculated from correct information and from the second output probability distribution, and a cross entropy between the first output probability distribution and the second output probability distribution, and updates the parameter of the second model so as to reduce the weighted sum.
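The procedure summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the single softmax layer, the function names (`softmax`, `weighted_loss`, `weighted_loss_grad`, `demo`), the weight `alpha`, the learning rate, and the random toy data are all hypothetical, and the second loss function is assumed to be a cross entropy against the correct unit number.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: output probability of each unit on the output layer."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weighted_loss(p_first, p_second, correct_units, alpha=0.5, eps=1e-12):
    """Weighted sum of the second loss and the cross entropy between models.

    Assumption: weights (1 - alpha) and alpha; the source only says "weighted sum".
    """
    n = p_second.shape[0]
    log_p2 = np.log(p_second + eps)
    # Second loss: cross entropy against the correct unit (assumed form).
    second_loss = -log_p2[np.arange(n), correct_units].mean()
    # Cross entropy between first and second output probability distributions.
    cross_entropy = -(p_first * log_p2).sum(axis=-1).mean()
    return (1.0 - alpha) * second_loss + alpha * cross_entropy

def weighted_loss_grad(p_first, p_second, correct_units, alpha=0.5):
    """Gradient of the weighted sum with respect to the second model's logits."""
    n, k = p_second.shape
    one_hot = np.eye(k)[correct_units]
    target = (1.0 - alpha) * one_hot + alpha * p_first
    return (p_second - target) / n

def demo(steps=20, lr=0.1, alpha=0.5, seed=0):
    """Toy run with a hypothetical one-layer model; returns (loss_before, loss_after)."""
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(32, 4))      # features obtained from learning data
    correct = rng.integers(0, 3, size=32)    # correct information (unit numbers)
    w_first = rng.normal(size=(4, 3))        # stands in for the learned first model
    w_second = w_first.copy()                # initial value setting from the first model
    p_first = softmax(features @ w_first)    # first output probability distribution
    before = weighted_loss(p_first, softmax(features @ w_second), correct, alpha)
    for _ in range(steps):
        p_second = softmax(features @ w_second)  # second output probability distribution
        grad_logits = weighted_loss_grad(p_first, p_second, correct, alpha)
        w_second -= lr * features.T @ grad_logits  # update so as to reduce the weighted sum
    after = weighted_loss(p_first, softmax(features @ w_second), correct, alpha)
    return before, after
```

Calling `demo()` returns the weighted sum before and after the updates. In this sketch the first model's parameters are never modified; only the copy is trained, and the cross-entropy term penalizes drift of its outputs away from the first model's outputs.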
17 Citations
9 Claims
1. A model learning device, comprising:
an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model;
a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using features obtained from learning data and the first model;
a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using features obtained from the learning data and the second model; and
a modified model update part that calculates a second loss function from correct information corresponding to the learning data and from the second output probability distribution, calculates a cross entropy between the first output probability distribution and the second output probability distribution, obtains a weighted sum of the second loss function and the cross entropy, and updates the parameter of the second model so as to reduce the weighted sum.
Dependent claims: 5, 8, 9
2. A model learning device, comprising:
an initial value setting part that uses a parameter of a learned first acoustic model including a neural network to set a parameter of a second acoustic model including a neural network having a same network structure as the first acoustic model;
a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using features obtained from a learning acoustic signal and the first acoustic model;
a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using features obtained from a learning acoustic signal and the second acoustic model; and
a modified model update part that calculates a second loss function from a correct unit number corresponding to the learning acoustic signal and from the second output probability distribution, calculates a cross entropy between the first output probability distribution and the second output probability distribution, obtains a weighted sum of the second loss function and the cross entropy, and updates the parameter of the second acoustic model so as to reduce the weighted sum.
3. A model learning device, comprising:
an initial value setting part that uses a parameter of a learned first language model including a neural network to set a parameter of a second language model including a neural network having a same network structure as the first language model;
a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using a word history that is a word string obtained from learning text data, and the first language model;
a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using a word history that is a word string obtained from learning text data, and the second language model; and
a modified model update part that calculates a second loss function from a correct word corresponding to the learning word history and from the second output probability distribution, calculates a cross entropy between the first output probability distribution and the second output probability distribution, obtains a weighted sum of the second loss function and the cross entropy, and updates the parameter of the second language model so as to reduce the weighted sum.
Dependent claims: 4
6. A model learning method, comprising:
an initial value setting step of using a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model;
a first output probability distribution calculating step of calculating a first output probability distribution including a distribution of an output probability of each unit on an output layer, using features obtained from learning data and the first model;
a second output probability distribution calculating step of calculating a second output probability distribution including a distribution of an output probability of each unit on the output layer, using features obtained from the learning data and the second model; and
a modified model update step of calculating a second loss function from correct information corresponding to the learning data and from the second output probability distribution, of calculating a cross entropy between the first output probability distribution and the second output probability distribution, of obtaining a weighted sum of the second loss function and the cross entropy, and of updating the parameter of the second model so as to reduce the weighted sum.
Dependent claims: 7
Specification