SPEECH ENHANCEMENT METHOD, SPEECH RECOGNITION METHOD, CLUSTERING METHOD AND DEVICE
Abstract
The present invention discloses a speech enhancement method, a speech recognition method, a clustering method and a device. The method includes: selecting a feature vector clustering center best matched with the feature vector of a first frame speech part of a test speech; performing the following for the feature vector of each other frame speech part contained in the test speech: selecting a feature vector clustering center best matched with the feature vector of the speech part from among the feature vector clustering center best matched with the feature vector of the frame speech part preceding the speech part and a feature vector clustering center adjacent to that best-matched clustering center; and reconstructing the feature vector of the test speech according to the feature vector of each frame speech part contained in the test speech and the selected feature vector clustering centers. Because a feature capable of representing speech continuity is used during speech enhancement, the present invention can achieve a better speech enhancement effect than a traditional speech enhancement model in the prior art.
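The selection and reconstruction steps summarized above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `select_centers`, `reconstruct`, `neighbors`, and `weight` are hypothetical, "best matched" is assumed to mean smallest Euclidean distance, and the linear blend used for reconstruction is an assumption, since the text does not state how the reconstruction is computed.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(vector, centers, candidates):
    """Return the candidate index whose center is closest to `vector`."""
    return min(candidates, key=lambda i: dist(vector, centers[i]))


def select_centers(frames, centers, neighbors):
    """Pick one trained clustering center per frame.

    frames    : per-frame feature vectors of the test speech
    centers   : trained clustering centers
    neighbors : neighbors[i] lists the indices of centers adjacent to center i
    """
    selected = []
    for t, frame in enumerate(frames):
        if t == 0:
            # First frame: search every trained center.
            candidates = range(len(centers))
        else:
            # Later frames: search only the previous best match and its neighbors.
            prev = selected[-1]
            candidates = [prev] + list(neighbors[prev])
        selected.append(nearest(frame, centers, candidates))
    return selected


def reconstruct(frames, centers, selected, weight=0.5):
    """Blend each frame with its selected center (the blend is an assumption)."""
    return [
        [weight * c + (1 - weight) * x for x, c in zip(frame, centers[k])]
        for frame, k in zip(frames, selected)
    ]


# Toy example: four centers on a 1-D chain, each adjacent to its chain neighbors.
centers = [[0.0], [1.0], [2.0], [3.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
frames = [[0.1], [2.9], [2.9]]
selected = select_centers(frames, centers, neighbors)
enhanced = reconstruct(frames, centers, selected)
```

In this toy example the second frame lies closest to center 3, but the restricted search can only move one step along the chain per frame, which is how the adjacency structure enforces continuity between consecutive frames.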
18 Claims
1. A speech enhancement method, comprising:
selecting a feature vector clustering center best matched with the feature vector of a first frame speech part contained in a test speech from feature vector clustering centers obtained by training by a selection unit;
performing the following for the feature vector of each other frame speech part contained in the test speech:
selecting a feature vector clustering center best matched with the feature vector of the speech part from among a feature vector clustering center, obtained by training, that is best matched with the feature vector of the frame speech part preceding the speech part, and a feature vector clustering center adjacent to the feature vector clustering center best matched with the feature vector of the preceding frame speech part, wherein a set formed by each of the feature vector clustering centers obtained by training and at least one adjacent feature vector clustering center thereof has an ability to describe speech continuity;
reconstructing the feature vector of the test speech according to the feature vectors of each frame speech part contained in the test speech and the selected feature vector clustering center by a reconstruction unit; and
performing speech recognition on the reconstructed feature vector of the test speech by a speech recognition unit.
5-12. (canceled)
13. An electrical apparatus, comprising:
a processor; and
a memory for storing commands executed by the processor;
wherein the processor is configured to perform:
selecting a feature vector clustering center best matched with the feature vector of a first frame speech part contained in a test speech from feature vector clustering centers obtained by training;
performing the following for the feature vector of each other frame speech part contained in the test speech:
selecting a feature vector clustering center best matched with the feature vector of the speech part from among a feature vector clustering center, obtained by training, that is best matched with the feature vector of the frame speech part preceding the speech part, and a feature vector clustering center adjacent to the feature vector clustering center best matched with the feature vector of the preceding frame speech part, wherein a set formed by each of the feature vector clustering centers obtained by training and at least one adjacent feature vector clustering center thereof has an ability to describe speech continuity;
reconstructing the feature vector of the test speech according to the feature vectors of each frame speech part contained in the test speech and the selected feature vector clustering center; and
performing speech recognition on the reconstructed feature vector of the test speech.
14. (canceled)
18. A non-transitory computer storage medium having computer-executable instructions stored thereon which, when executed by a computer, cause the computer to perform:
respectively extracting feature vector samples from each frame speech part contained in a training corpus;
determining the distribution information of the feature vector samples in a multidimensional space;
determining initial clustering centers according to the distribution information;
performing iterative clustering on each initial clustering center to obtain undetermined clustering centers according to the similarity between the feature vector samples and each initial clustering center; and
performing iterative clustering on the undetermined clustering centers to obtain a feature vector clustering center according to given iterative clustering rules;
wherein the given iterative clustering rules comprise:
performing iterative clustering on the undetermined clustering centers according to the feature vectors of each speech part of the training corpus;
the feature vector used when performing a single iterative clustering on the undetermined clustering centers being the feature vector of a single speech part in the training corpus; and
the feature vectors respectively used when performing every two adjacent iterative clusterings on the undetermined clustering centers being the feature vectors of adjacent speech parts in the training corpus.
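The two-stage clustering recited in claim 18 can be illustrated with the rough sketch below. It rests on several assumptions that the claim does not specify: the "distribution information" is taken to be the per-dimension minimum and maximum of the samples, the first stage is a k-means-style batch refinement, and the second stage is a simple online update with a fixed learning rate `rate`. All function and parameter names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def initial_centers(samples, k):
    """Spread k centers between the per-dimension min and max (assumed rule)."""
    lo = [min(col) for col in zip(*samples)]
    hi = [max(col) for col in zip(*samples)]
    return [
        [l + (h - l) * (i + 0.5) / k for l, h in zip(lo, hi)]
        for i in range(k)
    ]


def nearest(vector, centers):
    """Index of the center closest to `vector`."""
    return min(range(len(centers)), key=lambda i: dist(vector, centers[i]))


def batch_stage(samples, centers, rounds=5):
    """k-means-style refinement yielding the 'undetermined' centers."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for s in samples:
            groups[nearest(s, centers)].append(s)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def sequential_stage(samples, centers, rate=0.1):
    """One update per frame, visiting frames in their original order."""
    centers = [list(c) for c in centers]
    for s in samples:  # consecutive iterations therefore use adjacent frames
        k = nearest(s, centers)
        centers[k] = [c + rate * (x - c) for c, x in zip(centers[k], s)]
    return centers


def train(samples, k):
    """Run both stages on frame-ordered feature vector samples."""
    undetermined = batch_stage(samples, initial_centers(samples, k))
    return sequential_stage(samples, undetermined)


# Toy corpus: six 1-D frames forming two well-separated groups.
samples = [[0.0], [0.2], [0.1], [4.0], [4.2], [4.1]]
trained = train(samples, 2)
```

Only the second stage depends on frame order: each iteration consumes exactly one frame, and successive iterations consume successive frames, which mirrors the last two rules of the claim.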
Specification