Adversarial teacher-student learning for unsupervised domain adaptation
Abstract
Methods, systems, and computer programs are presented for training, with adversarial constraints, a student model for speech recognition based on a teacher model. One method includes operations for training a teacher model based on teacher speech data, initializing a student model with parameters obtained from the teacher model, and training the student model with adversarial teacher-student learning based on the teacher speech data and student speech data. Training the student model with adversarial teacher-student learning further includes minimizing a teacher-student loss that measures a divergence of outputs between the teacher model and the student model; minimizing a classifier condition loss with respect to parameters of a condition classifier; and maximizing the classifier condition loss with respect to parameters of a feature extractor. The classifier condition loss measures errors caused by acoustic condition classification. Further, speech is recognized with the trained student model.
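The two losses named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes the "divergence of outputs" is a Kullback-Leibler divergence between teacher and student posteriors and that the condition classifier is trained with cross-entropy. The function names and the trade-off weight `lam` are hypothetical and introduced only for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_student_loss(teacher_logits, student_logits):
    """Mean KL(teacher || student) over frames (assumed form of the divergence)."""
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        p_t = softmax(t_row)
        p_s = softmax(s_row)
        total += sum(
            pt * (math.log(pt) - math.log(ps))
            for pt, ps in zip(p_t, p_s)
            if pt > 0.0
        )
    return total / len(teacher_logits)


def condition_loss(condition_logits, condition_labels):
    """Mean cross-entropy of the condition classifier (assumed form)."""
    total = 0.0
    for row, label in zip(condition_logits, condition_labels):
        total -= math.log(softmax(row)[label])
    return total / len(condition_labels)


def objectives(ts_loss, cond_loss, lam=1.0):
    """Quantity each parameter group minimizes; `lam` is a hypothetical weight."""
    return {
        # Minimizing -cond_loss is the same as maximizing cond_loss.
        "feature_extractor": ts_loss - lam * cond_loss,
        "condition_classifier": cond_loss,
    }
```

Under these assumptions, the feature extractor and the condition classifier pull the condition loss in opposite directions, which is what makes the training adversarial: the classifier tries to tell acoustic conditions apart, while the feature extractor tries to produce features from which it cannot.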
20 Claims
1. A method comprising:
  training, by one or more processors, a teacher model based on teacher speech data;
  initializing, by the one or more processors, a student model with parameters obtained from the trained teacher model;
  training, by the one or more processors, the student model with adversarial teacher-student learning based on the teacher speech data and student speech data, training the student model with adversarial teacher-student learning further comprising:
    minimizing a teacher-student loss that measures a divergence of outputs between the teacher model and the student model;
    minimizing a classifier condition loss with respect to parameters of a condition classifier, the classifier condition loss measuring errors caused by acoustic condition classification; and
    maximizing the classifier condition loss with respect to parameters of a feature extractor; and
  recognizing speech with the trained student model.
  Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A system comprising:
  a memory comprising instructions; and
  one or more computer processors, wherein the instructions, when executed by the one or more computer processors, cause the one or more computer processors to perform operations comprising:
    training a teacher model based on teacher speech data;
    initializing a student model with parameters obtained from the trained teacher model;
    training the student model with adversarial teacher-student learning based on the teacher speech data and student speech data, training the student model with adversarial teacher-student learning further comprising:
      minimizing a teacher-student loss that measures a divergence of outputs between the teacher model and the student model;
      minimizing a classifier condition loss with respect to parameters of a condition classifier, the classifier condition loss measuring errors caused by acoustic condition classification; and
      maximizing the classifier condition loss with respect to parameters of a feature extractor; and
    recognizing speech with the trained student model.
  Dependent claims: 12, 13, 14, 15.
16. A machine-readable storage medium including instructions that, when executed by a machine, cause the machine to perform operations comprising:
  training a teacher model based on teacher speech data;
  initializing a student model with parameters obtained from the trained teacher model;
  training the student model with adversarial teacher-student learning based on the teacher speech data and student speech data, training the student model with adversarial teacher-student learning further comprising:
    minimizing a teacher-student loss that measures a divergence of outputs between the teacher model and the student model;
    minimizing a classifier condition loss with respect to parameters of a condition classifier, the classifier condition loss measuring errors caused by acoustic condition classification; and
    maximizing the classifier condition loss with respect to parameters of a feature extractor; and
  recognizing speech with the trained student model.
  Dependent claims: 17, 18, 19, 20.
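The opposing updates recited in the independent claims (minimizing the condition loss with respect to the condition classifier while maximizing it with respect to the feature extractor) can be sketched as a single gradient step. This is only an illustration under stated assumptions: plain gradient descent, gradients supplied by the caller, and a hypothetical weight `lam` on the reversed gradient. None of the function or parameter names come from the claims.

```python
def sgd_step(params, grads, lr):
    """One plain gradient-descent step on a flat list of parameters."""
    return [p - lr * g for p, g in zip(params, grads)]


def adversarial_update(feat_params, cond_params,
                       grad_ts_feat, grad_cond_feat, grad_cond_cls,
                       lr=0.1, lam=1.0):
    """Update both parameter groups for one step (illustrative only).

    feat_params    -- feature extractor parameters
    cond_params    -- condition classifier parameters
    grad_ts_feat   -- gradient of the teacher-student loss w.r.t. feat_params
    grad_cond_feat -- gradient of the condition loss w.r.t. feat_params
    grad_cond_cls  -- gradient of the condition loss w.r.t. cond_params
    """
    # Feature extractor: descend on the teacher-student loss and ascend on
    # the condition loss, i.e. the condition-loss gradient enters with a
    # reversed sign.
    feat_grads = [g_ts - lam * g_c
                  for g_ts, g_c in zip(grad_ts_feat, grad_cond_feat)]
    new_feat = sgd_step(feat_params, feat_grads, lr)

    # Condition classifier: ordinary descent on the condition loss.
    new_cond = sgd_step(cond_params, grad_cond_cls, lr)
    return new_feat, new_cond
```

In a real system the gradients would come from backpropagation through the student network. The sign flip on `grad_cond_feat` is the only part of this sketch that is specific to the adversarial scheme.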
Specification