Speaker adaption method and apparatus, and storage medium
First Claim
1. A speaker adaption method, comprising:
acquiring speech data of a reference speaker;
performing a training according to the speech data of the reference speaker to acquire a batch normalization (BN) network comprising a global speech parameter and a speech recognition model comprising the global speech parameter;
acquiring first speech data of a target speaker;
inputting the first speech data to the BN network to acquire a speech parameter of the target speaker, and replacing the global speech parameter with the speech parameter of the target speaker to acquire a speech recognition model comprising the speech parameter of the target speaker.
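The steps recited in this claim can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "speech parameter" is the per-dimension mean and variance used by a single BN layer, and that acoustic features arrive as lists of floats. The claim text does not confirm this reading. The names `BNLayer`, `estimate_bn_stats` and `adapt_to_speaker` are hypothetical and do not come from the patent.

```python
import math
from statistics import fmean, pvariance


def estimate_bn_stats(frames):
    """Return per-dimension mean and population variance over feature frames."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col, mu) for col, mu in zip(columns, means)]
    return means, variances


class BNLayer:
    """Hypothetical single BN layer holding normalization statistics."""

    def __init__(self, mean, var, gamma=None, beta=None, eps=1e-5):
        self.mean = list(mean)
        self.var = list(var)
        # Scale and shift are assumed to be learned once on reference speakers.
        self.gamma = list(gamma) if gamma is not None else [1.0] * len(self.mean)
        self.beta = list(beta) if beta is not None else [0.0] * len(self.mean)
        self.eps = eps

    def normalize(self, frame):
        """Apply y = gamma * (x - mean) / sqrt(var + eps) + beta per dimension."""
        return [
            g * (x - m) / math.sqrt(v + self.eps) + b
            for x, m, v, g, b in zip(frame, self.mean, self.var, self.gamma, self.beta)
        ]


def adapt_to_speaker(global_layer, target_frames):
    """Build a speaker-specific layer by swapping in target-speaker statistics.

    Assumption: only mean and variance are replaced; gamma, beta and eps
    are carried over unchanged from the globally trained layer.
    """
    mean, var = estimate_bn_stats(target_frames)
    return BNLayer(mean, var, global_layer.gamma, global_layer.beta, global_layer.eps)
```

In this sketch the global layer is left untouched and a new layer is returned, so the same reference-trained model could be adapted to several target speakers independently.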
Abstract
A speaker adaption method, a speaker adaption apparatus, a device and a storage medium are provided. The method includes: acquiring first speech data of a target speaker; and inputting the first speech data to a pre-trained batch normalization (BN) network for adaptive training, to acquire a speech recognition model including a speech parameter of the target speaker.
12 Claims
1. A speaker adaption method, comprising:
acquiring speech data of a reference speaker;
performing a training according to the speech data of the reference speaker to acquire a batch normalization (BN) network comprising a global speech parameter and a speech recognition model comprising the global speech parameter;
acquiring first speech data of a target speaker;
inputting the first speech data to the BN network to acquire a speech parameter of the target speaker, and replacing the global speech parameter with the speech parameter of the target speaker to acquire a speech recognition model comprising the speech parameter of the target speaker.
Dependent claims: 2, 3, 4
5. A speaker adaption apparatus, comprising:
one or more processors;
a memory;
one or more software modules stored in the memory and executable by the one or more processors, and comprising:
a speech data acquiring module configured to acquire speech data of a reference speaker; and
a model training module configured to perform a training according to the speech data of the reference speaker to acquire a batch normalization (BN) network comprising a global speech parameter and a speech recognition model comprising the global speech parameter, wherein:
the speech data acquiring module is further configured to acquire first speech data of a target speaker; and
the model training module is further configured to input the first speech data to the BN network to acquire a speech parameter of the target speaker, replace the global speech parameter with the speech parameter of the target speaker to acquire a speech recognition model comprising the speech parameter of the target speaker.
Dependent claims: 6, 7, 8
9. A computer-readable storage medium having stored therein computer programs that, when executed by a processor of a terminal, cause the terminal to perform a speaker adaption method, the method comprising:
acquiring speech data of a reference speaker;
performing a training according to the speech data of the reference speaker to acquire a batch normalization (BN) network comprising a global speech parameter and a speech recognition model comprising the global speech parameter;
acquiring first speech data of a target speaker;
inputting the first speech data to the BN network to acquire a speech parameter of the target speaker, and replacing the global speech parameter with the speech parameter of the target speaker to acquire a speech recognition model comprising the speech parameter of the target speaker.
Dependent claims: 10, 11, 12
Specification