Method, apparatus and computer program product for providing voice conversion using temporal dynamic features
Abstract
An apparatus for providing voice conversion using temporal dynamic features includes a feature extractor and a transformation element. The feature extractor may be configured to extract dynamic feature vectors from source speech. The transformation element may be in communication with the feature extractor and configured to apply a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors. The first conversion function may have been trained using at least dynamic feature data associated with training source speech and training target speech. The transformation element may be further configured to produce converted speech based on an output of applying the first conversion function.
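The extract, convert, produce flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent text: the dynamic features are approximated by a central difference over neighbouring frames, the "first conversion function" is stood in for by a per-dimension affine map fitted by least squares on time-aligned training frames, and the names `delta_features`, `train_conversion`, and `convert` are hypothetical. The final synthesis of converted speech from the converted features is omitted.

```python
from statistics import mean


def delta_features(frames):
    """Approximate dynamic (delta) features by a central difference.

    frames: list of equal-length lists of floats, one list per time frame.
    Edge frames reuse themselves as the missing neighbour (an assumption).
    """
    n = len(frames)
    deltas = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        deltas.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return deltas


def train_conversion(src, tgt):
    """Fit y = a * x + b per dimension by least squares.

    src and tgt are time-aligned lists of feature vectors. This affine map
    is only a stand-in for whatever trained conversion function is used.
    """
    params = []
    for d in range(len(src[0])):
        xs = [frame[d] for frame in src]
        ys = [frame[d] for frame in tgt]
        mx, my = mean(xs), mean(ys)
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = cov / var if var > 0 else 1.0  # fall back to identity slope
        params.append((a, my - a * mx))
    return params


def convert(frames, params):
    """Apply the fitted per-dimension affine map to each feature vector."""
    return [[a * x + b for x, (a, b) in zip(frame, params)] for frame in frames]


# Toy training data: the target track is a scaled and shifted source track.
train_src = [[0.0], [1.0], [3.0], [6.0]]
train_tgt = [[2.0 * f[0] + 1.0] for f in train_src]

# Train on dynamic feature data from both training source and training target.
params = train_conversion(delta_features(train_src), delta_features(train_tgt))

# Convert the dynamic features extracted from new source speech.
new_src = [[0.0], [2.0], [4.0]]
converted = convert(delta_features(new_src), params)
```

In this toy setup the constant offset between source and target disappears in the delta domain, so the fitted map reduces to a pure scaling. A real system would also need frame alignment between the training source and target speech and a synthesis stage, both of which are outside this sketch.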
23 Claims
1. A method comprising:
extracting dynamic feature vectors from source speech;
applying a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors, the first conversion function having been trained using at least dynamic feature data associated with training source speech and training target speech; and
producing converted speech based on an output of applying the first conversion function.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer program product comprising at least one computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
a first executable portion for extracting dynamic feature vectors from source speech;
a second executable portion for applying a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors, the first conversion function having been trained using at least dynamic feature data associated with training source speech and training target speech; and
a third executable portion for producing converted speech based on an output of applying the first conversion function.
Dependent claims: 9, 10, 11, 12, 13, 14
15. An apparatus comprising:
a feature extractor configured to extract dynamic feature vectors from source speech; and
a transformation element in communication with the feature extractor and configured to apply a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors, the first conversion function having been trained using at least dynamic feature data associated with training source speech and training target speech, and produce converted speech based on an output of applying the first conversion function.
Dependent claims: 16, 17, 18, 19, 20, 21
22. An apparatus comprising:
means for extracting dynamic feature vectors from source speech;
means for applying a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors, the first conversion function having been trained using at least dynamic feature data associated with training source speech and training target speech; and
means for producing converted speech based on an output of applying the first conversion function.
Dependent claims: 23
Specification