SOFT LABEL GENERATION FOR KNOWLEDGE DISTILLATION
Abstract
A technique for generating soft labels for training is disclosed. In the method, a teacher model having a teacher side class set is prepared. A collection of class pairs for respective data units is obtained. Each class pair includes classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set that is different from the teacher side class set. A training input is fed into the teacher model to obtain a set of outputs for the teacher side class set. A set of soft labels for the student side class set is calculated from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
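The calculation summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `relevant_teacher_classes` and `soft_labels`, the `min_ratio` co-occurrence threshold, the choice of taking the maximum teacher output over the relevant subset, and the final renormalization are all hypothetical choices made for the example. The text above only requires that at least one output from a relevant teacher side class be used.

```python
from collections import Counter, defaultdict


def relevant_teacher_classes(class_pairs, min_ratio=0.2):
    """For each student side class, collect the teacher side classes that
    co-occur with it in at least `min_ratio` of its observed class pairs.
    (`min_ratio` is an assumed relevance criterion, not from the source.)"""
    counts = defaultdict(Counter)
    for teacher_cls, student_cls in class_pairs:
        counts[student_cls][teacher_cls] += 1
    relevance = {}
    for student_cls, per_teacher in counts.items():
        total = sum(per_teacher.values())
        relevance[student_cls] = {
            t for t, n in per_teacher.items() if n / total >= min_ratio
        }
    return relevance


def soft_labels(teacher_outputs, relevance, student_classes):
    """Derive one soft label per student side class from teacher outputs.
    Assumption: use the largest output among the relevant teacher classes,
    then renormalize so that the soft labels sum to 1."""
    raw = {
        s: max((teacher_outputs[t] for t in relevance.get(s, ())), default=0.0)
        for s in student_classes
    }
    total = sum(raw.values())
    if total == 0.0:
        # No relevant teacher class observed: fall back to a uniform label.
        return {s: 1.0 / len(student_classes) for s in student_classes}
    return {s: v / total for s, v in raw.items()}


# Toy class pairs (teacher side class, student side class), one per data unit.
class_pairs = (
    [("a1", "A")] * 6 + [("a2", "A")] * 3 + [("b1", "A")] * 1
    + [("b1", "B")] * 5 + [("c1", "C")] * 4
)
relevance = relevant_teacher_classes(class_pairs, min_ratio=0.2)

# Hypothetical teacher outputs (posteriors) for a single training input.
teacher_outputs = {"a1": 0.5, "a2": 0.2, "b1": 0.25, "c1": 0.05}
labels = soft_labels(teacher_outputs, relevance, ["A", "B", "C"])
print(relevance)
print(labels)
```

In this toy data, teacher class `b1` is paired with student class `A` in only one of ten observations, so it falls below the assumed threshold and does not contribute to the soft label for `A`. A sum over the relevant subset would be an equally valid reading of "at least an output"; the maximum is used here only to keep the example short.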
20 Claims
1. A computer-implemented method for generating soft labels for training, the method comprising:
preparing a teacher model having a teacher side class set;
obtaining a collection of class pairs for respective data units, each class pair including classes labeled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set;
feeding a training input into the teacher model to obtain a set of outputs for the teacher side class set; and
calculating a set of soft labels for the student side class set from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A computer system for generating soft labels for training, the computer system comprising:
a memory storing program instructions;
a processing circuitry in communications with the memory for executing the program instructions, wherein the processing circuitry is configured to:
prepare a teacher model having a teacher side class set;
obtain a collection of class pairs for respective data units, wherein each class pair includes classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set;
feed a training input into the teacher model to obtain a set of outputs for the teacher side class set; and
calculate a set of soft labels for the student side class set from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
Dependent claims: 15, 16, 17
18. A computer program product for generating soft labels for training, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising:
preparing a teacher model having a teacher side class set;
obtaining a collection of class pairs for respective data units, each class pair including classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set;
feeding a training input into the teacher model to obtain a set of outputs for the teacher side class set; and
calculating a set of soft labels for the student side class set from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
Dependent claims: 19, 20
Specification