MULTI-TASK LEARNING USING KNOWLEDGE DISTILLATION
First Claim
1. A computer implemented method comprising:
obtaining a respective set of training data for each of a plurality of machine learning tasks;
for each of the machine learning tasks, configuring a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data for the task; and
training a single student machine learning model to perform all of the plurality of machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing multi-task learning. In one method, a system obtains a respective set of training data for each of multiple machine learning tasks. For each of the machine learning tasks, the system configures a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data. The system trains a single student machine learning model to perform the multiple machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
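The three steps in the abstract can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the patented implementation: the two threshold tasks, the logistic-regression teachers and student, the mixing weight `alpha`, and the helper names (`sgd_fit`, `student_feats`, `distill_demo`) are all hypothetical. The abstract does not specify a loss function; the sketch assumes the student fits a blend of each hard label and the matching teacher's soft prediction.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, feats):
    return sigmoid(sum(wi * fi for wi, fi in zip(w, feats)))

def sgd_fit(rows, dim, epochs=100, lr=0.1):
    """Logistic regression by SGD; rows are (features, target in [0, 1])."""
    w = [0.0] * dim
    for _ in range(epochs):
        for feats, target in rows:
            err = predict(w, feats) - target  # cross-entropy gradient w.r.t. the logit
            for i, fi in enumerate(feats):
                w[i] -= lr * err * fi
    return w

def student_feats(x, task, num_tasks):
    """Assumed task-conditioned input: [x, 1] placed in the slot for `task`."""
    feats = [0.0] * (2 * num_tasks)
    feats[2 * task], feats[2 * task + 1] = x, 1.0
    return feats

def distill_demo(seed=0, alpha=0.5, n=200):
    rng = random.Random(seed)
    rules = [lambda x: x > 0.0, lambda x: x < 0.5]  # two toy tasks (assumed)
    num_tasks = len(rules)

    # Step 1: obtain a respective training set for each task.
    data = []
    for rule in rules:
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        data.append([(x, 1.0 if rule(x) else 0.0) for x in xs])

    # Step 2: train one teacher per task, on that task's data only.
    teachers = [sgd_fit([([x, 1.0], y) for x, y in rows], dim=2) for rows in data]

    # Step 3: train a single student on all tasks, using both the teachers
    # (soft predictions) and the original data (hard labels).
    student_rows = []
    for task, rows in enumerate(data):
        for x, y in rows:
            soft = predict(teachers[task], [x, 1.0])
            target = alpha * soft + (1.0 - alpha) * y  # assumed blending rule
            student_rows.append((student_feats(x, task, num_tasks), target))
    rng.shuffle(student_rows)
    student = sgd_fit(student_rows, dim=2 * num_tasks)

    # Per-task accuracy of the single student against the hard labels.
    accs = []
    for task, rows in enumerate(data):
        hits = sum(
            (predict(student, student_feats(x, task, num_tasks)) > 0.5) == (y == 1.0)
            for x, y in rows
        )
        accs.append(hits / len(rows))
    return accs
```

Calling `distill_demo()` returns one training-set accuracy per task for the single student model. Setting `alpha=0.0` in this sketch trains the student on hard labels alone, which gives a simple baseline to compare against the distilled variant.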
20 Claims
1. A computer implemented method comprising:
obtaining a respective set of training data for each of a plurality of machine learning tasks;
for each of the machine learning tasks, configuring a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data for the task; and
training a single student machine learning model to perform all of the plurality of machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
obtaining a respective set of training data for each of a plurality of machine learning tasks;
for each of the machine learning tasks, configuring a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data for the task; and
training a single student machine learning model to perform all of the plurality of machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
Dependent claims: 16, 17, 18, 19
20. One or more computer storage media encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
obtaining a respective set of training data for each of a plurality of machine learning tasks;
for each of the machine learning tasks, configuring a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data for the task; and
training a single student machine learning model to perform all of the plurality of machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
Specification