Method of speech segment selection for concatenative synthesis based on prosody-aligned distance measure
First Claim
1. A method of speech segment selection for use in constructing a concatenative synthesizer's database based on prosody-aligned distance measure, comprising the steps of:
(A) segmenting speech stored in a speech corpus, which is recorded in advance into a plurality of speech segments according to a unit type, wherein each of the speech segments has its prosody;
(B) locating pitch marks for each of the speech segments;
(C) selecting one of the speech segments according to the unit type as a source segment and the remaining speech segments as target segments, and performing a prosody alignment between the source segment and each of the target segments by modifying the prosody of the source segment with a respective prosody of each of the target segments, so as to obtain a prosody-aligned source segment with respect to each of the target segments, wherein the pitch marks of the prosody-aligned source segment are time-aligned and pitch-aligned with the pitch marks of each of the target segments;
(D) respectively measuring distortion between the prosody-aligned source segment and each of the target segments to obtain a distance between the prosody-aligned source segment and each of the target segments, and to obtain an average distance for the prosody-aligned source segment with respect to each of the target segments; and
(E) selecting at least one speech segment previously selected as the source segment with a relatively small average distance to be used as a synthetic speech unit of the unit type for constructing the synthesizer's database.
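The selection loop of steps (C) through (E) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `select_units` and its parameters `align`, `distortion`, and `keep` are hypothetical, and the segmentation and pitch marking of steps (A) and (B) are assumed to have been done already. The caller supplies `align` (a stand-in for the prosody alignment of step (C)) and `distortion` (a stand-in for the distortion measure of step (D)).

```python
from statistics import mean


def select_units(segments, align, distortion, keep=1):
    """Return the `keep` segments with the smallest average distance.

    segments   -- candidate segments of one unit type (any objects)
    align      -- hypothetical callable (source, target) -> source modified
                  to carry the target's prosody, as in step (C)
    distortion -- hypothetical callable (aligned_source, target) -> float,
                  as in step (D)
    """
    if len(segments) < 2:
        # Nothing to compare against; return what is available.
        return list(segments[:keep])

    scored = []
    for i, source in enumerate(segments):
        # Each segment takes a turn as the source; all others are targets.
        distances = [
            distortion(align(source, target), target)
            for j, target in enumerate(segments)
            if j != i
        ]
        scored.append((mean(distances), i))

    # Step (E): prefer sources with a relatively small average distance.
    scored.sort()
    return [segments[i] for _, i in scored[:keep]]


# Toy usage: segments are short lists of numbers, alignment is the
# identity, and distortion is the mean absolute difference.
toy_segments = [[1.0, 1.0], [1.2, 1.2], [5.0, 5.0]]
toy_align = lambda source, target: source
toy_distortion = lambda a, b: mean(abs(x - y) for x, y in zip(a, b))
best = select_units(toy_segments, toy_align, toy_distortion)
```

In the toy data, the middle segment is closest on average to the other two, so it is the one returned. A real system would replace the identity alignment with a pitch-synchronous prosody modification and the toy distortion with a perceptually motivated measure.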
Abstract
A method of speech segment selection for concatenative synthesis based on a prosody-aligned distance measure is disclosed. The method compares speech segments segmented from a speech corpus, and the segments are fully prosody-aligned to each other before distortion is measured. Because prosody alignment is embedded in the selection process, the distortion that prosody modification may introduce during synthesis can be taken into account objectively in the selection phase. To this end, automatic segmentation, pitch marking, and the PSOLA (pitch-synchronous overlap-add) method are combined to perform the prosody alignment. Two distortion measures, MFCC (mel-frequency cepstral coefficients) and PSQM (perceptual speech quality measure), are used to compare two prosody-aligned speech segments because both are perceptually motivated.
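As a rough illustration of how a cepstral distance between two prosody-aligned segments might be computed, the sketch below averages the Euclidean distance between corresponding feature frames. The abstract names MFCC as a distortion measure but does not give a formula, so the Euclidean form, the function name `mean_frame_distance`, and the assumption that both segments have the same number of frames after alignment (for example, one frame per pitch mark) are illustrative choices only. Feature extraction itself is assumed to happen elsewhere.

```python
import math


def mean_frame_distance(frames_a, frames_b):
    """Average Euclidean distance between corresponding feature frames.

    frames_a, frames_b -- equal-length sequences of feature vectors
                          (for example, precomputed MFCC vectors); equal
                          length is an assumption that holds only if the
                          segments were time-aligned beforehand.
    """
    if not frames_a or len(frames_a) != len(frames_b):
        raise ValueError("segments must have the same, non-zero frame count")

    total = 0.0
    for vec_a, vec_b in zip(frames_a, frames_b):
        if len(vec_a) != len(vec_b):
            raise ValueError("feature vectors must have the same dimension")
        total += math.dist(vec_a, vec_b)  # Euclidean distance per frame
    return total / len(frames_a)


# Toy usage with two-dimensional "feature" vectors.
example = mean_frame_distance([[0.0, 0.0], [1.0, 1.0]],
                              [[3.0, 4.0], [1.0, 1.0]])
```

A function of this shape could serve as the distortion measure in a selection loop, with a PSQM-style score as an alternative where a perceptual quality model is available.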
15 Citations
11 Claims
Specification