Automatic segmentation in speech synthesis
First Claim
1. A method for automatic segmentation of speech to generate a speech inventory, the method comprising:
initializing, via a processor, a Hidden Markov Model (HMM) using seed input data;
performing a segmentation of the HMM into speech units to generate phone labels;
correcting, via the processor, the segmentation of the speech units by performing the steps:
re-estimating the HMM based on a current version of the phone labels;
embedded re-estimating of the HMM; and
updating the current version of the phone labels using spectral boundary correction.
Abstract
A method and system are disclosed that automatically segment speech to generate a speech inventory. The method includes initializing a Hidden Markov Model (HMM) using seed input data, performing a segmentation of the HMM into speech units to generate phone labels, and correcting the segmentation of the speech units. Correcting the segmentation includes re-estimating the HMM based on a current version of the phone labels, embedded re-estimating of the HMM, and updating the current version of the phone labels using spectral boundary correction. The system includes modules configured to control a processor to perform the steps of the method.
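The order of the three correction steps named in the abstract can be sketched as a loop. This is only an illustration of the control flow: the function name `run_correction_loop`, the three callables, and the fixed iteration count are hypothetical and do not come from the patent, which does not state how many passes are made or how each step is implemented.

```python
from typing import Any, Callable, Tuple


def run_correction_loop(
    model: Any,
    labels: Any,
    reestimate: Callable[[Any, Any], Any],
    embedded_reestimate: Callable[[Any], Any],
    correct_boundaries: Callable[[Any, Any], Any],
    n_iterations: int = 3,  # assumption: the abstract gives no iteration count
) -> Tuple[Any, Any]:
    """Apply the three correction steps in the order the abstract lists them.

    All three callables are placeholders supplied by the caller:
      reestimate(model, labels)         -> model re-estimated from current labels
      embedded_reestimate(model)        -> model after embedded re-estimation
      correct_boundaries(model, labels) -> labels after spectral boundary correction
    """
    for _ in range(n_iterations):
        model = reestimate(model, labels)
        model = embedded_reestimate(model)
        labels = correct_boundaries(model, labels)
    return model, labels
```

Passing the steps in as callables keeps the sketch independent of any particular HMM toolkit; a real system would bind them to its own training and alignment routines.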
20 Claims
1. A method for automatic segmentation of speech to generate a speech inventory, the method comprising:
initializing, via a processor, a Hidden Markov Model (HMM) using seed input data;
performing a segmentation of the HMM into speech units to generate phone labels;
correcting, via the processor, the segmentation of the speech units by performing the steps:
re-estimating the HMM based on a current version of the phone labels;
embedded re-estimating of the HMM; and
updating the current version of the phone labels using spectral boundary correction.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable storage medium storing a set of program instructions executable on a processor device and usable to reduce speech unit boundaries, the instructions causing the processing device to perform the steps:
aligning a trained set of HMMs to produce phone labels that are segmented, wherein each phone label has a spectral boundary;
performing a spectral boundary correction on the phone labels, wherein spectral boundary correction re-aligns each spectral boundary using bending points of spectral transitions; and
synthesizing speech using the phone labels having spectral boundary correction.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
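Claim 8 re-aligns each boundary using bending points of spectral transitions but, as quoted here, does not define how a bending point is found. The sketch below is one possible reading, stated as an assumption: it treats the bending point as the frame transition with the largest frame-to-frame spectral change inside a small window around the HMM boundary. The names `spectral_change` and `correct_boundary`, the Euclidean distance measure, and the `window` parameter are all hypothetical.

```python
import math
from typing import List, Sequence


def spectral_change(frames: Sequence[Sequence[float]]) -> List[float]:
    """Euclidean distance between each pair of consecutive feature frames.

    Element i is the change from frame i to frame i + 1.
    """
    return [math.dist(a, b) for a, b in zip(frames, frames[1:])]


def correct_boundary(
    frames: Sequence[Sequence[float]],
    boundary: int,
    window: int = 3,  # assumption: search radius in frames, not from the patent
) -> int:
    """Move an HMM boundary to the nearby point of largest spectral change.

    `boundary` is the index of the first frame of the right-hand phone, so a
    candidate boundary b sits between frames b - 1 and b.
    Assumption: "bending point" is approximated by the maximum of the
    frame-to-frame spectral distance; the patent may define it differently.
    """
    change = spectral_change(frames)
    lo = max(1, boundary - window)
    hi = min(len(frames) - 1, boundary + window)
    if lo > hi:
        # Nothing to search (too few frames or boundary out of range).
        return boundary
    # Prefer larger spectral change; on ties, stay closest to the HMM boundary.
    return max(
        range(lo, hi + 1),
        key=lambda b: (change[b - 1], -abs(b - boundary)),
    )
```

For example, with five frames of one constant spectrum followed by five frames of another, an HMM boundary placed at frame 3 is moved to frame 5 when the window reaches that far, and is left at frame 3 when no spectral change falls inside the window.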
17. A system for automatic segmentation of speech to generate a speech inventory, the system comprising:
a processor;
a first module configured to control the processor to initialize a Hidden Markov Model (HMM) using seed input data;
a second module configured to control the processor to perform a segmentation of the HMM into speech units to generate phone labels;
a third module configured to control the processor to correct the segmentation of the speech units by performing the steps:
re-estimating the HMM based on a current version of the phone labels;
embedded re-estimating of the HMM; and
updating the current version of the phone labels using spectral boundary correction.
Dependent claims: 18, 19, 20
Specification