Method of speaking rate conversion in text-to-speech system
First Claim
1. A method of speaking rate conversion in a text-to-speech system, the method comprising:
a first step of extracting a vocal list from a synthesis DB (database), voicing the extracted vocal list in each of three speaking styles (fast speaking, normal speaking, and slow speaking), and building a probability distribution of the duration of each synthesis unit;
a second step of searching for an optimal synthesis unit candidate row using a Viterbi search in response to a synthesis request, and creating a target duration parameter for a synthesis unit; and
a third step of obtaining an optimal synthesis unit candidate row again, using the duration parameter of the optimal synthesis unit candidate row, and generating a synthesized sound.
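The first step can be illustrated with a minimal sketch. The claim says only that a probability distribution of durations is built per synthesis unit; the Gaussian summary (mean and standard deviation), the `(style, unit, duration_ms)` record layout, and the function name `build_duration_models` are assumptions made here for illustration, not details taken from the patent.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_duration_models(recordings):
    """Summarize unit durations per speaking style.

    recordings: iterable of (style, unit, duration_ms) tuples (hypothetical layout).
    Returns {(style, unit): (mean_ms, std_ms)}, assuming a Gaussian model.
    """
    samples = defaultdict(list)
    for style, unit, duration_ms in recordings:
        samples[(style, unit)].append(duration_ms)
    return {key: (mean(vals), pstdev(vals)) for key, vals in samples.items()}


# Toy measurements for one synthesis unit "a" voiced in the three styles.
recordings = [
    ("fast", "a", 60.0), ("fast", "a", 80.0),
    ("normal", "a", 90.0), ("normal", "a", 110.0),
    ("slow", "a", 130.0), ("slow", "a", 170.0),
]
models = build_duration_models(recordings)
```

In this toy data the mean duration of unit "a" grows from the fast style to the slow style, which is the kind of per-style difference the later steps would draw on.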
Abstract
A method of speaking rate conversion in a text-to-speech system is provided. The method includes: a first step of extracting a vocal list from a synthesis DB (database), voicing the extracted vocal list in each of three speaking styles (fast speaking, normal speaking, and slow speaking), and building a probability distribution of the duration of each synthesis unit; a second step of searching for an optimal synthesis unit candidate row using a Viterbi search in response to a synthesis request, and creating a target duration parameter for a synthesis unit; and a third step of obtaining an optimal synthesis unit candidate row again, using the duration parameter of the optimal synthesis unit candidate row, and generating a synthesized sound.
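The Viterbi search named in the second step can be sketched generically as a dynamic program over candidate units. The abstract does not define the cost functions, so the `target_cost` and `join_cost` callables, the `(unit_id, pitch)` candidate layout, and the pitch-difference join cost below are hypothetical stand-ins rather than the patent's actual criteria.

```python
def viterbi_select(candidates, target_cost, join_cost):
    """Pick one candidate per position so that the summed cost is minimal.

    candidates: list of candidate lists, one list per target position.
    target_cost(i, c): cost of using candidate c at position i.
    join_cost(p, c): cost of concatenating candidate p with the next candidate c.
    Returns (total_cost, path).
    """
    # best[i][j] = (cheapest cost of a path ending at candidates[i][j], back-pointer)
    best = [[(target_cost(0, c), None) for c in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for c in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(i, c), back))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return total, path


# Toy candidates: (unit_id, pitch_hz); joins are penalized by pitch mismatch.
candidates = [
    [("a1", 100), ("a2", 120)],
    [("b1", 118), ("b2", 90)],
    [("c1", 119), ("c2", 95)],
]
total, path = viterbi_select(
    candidates,
    target_cost=lambda i, c: 0.0,
    join_cost=lambda p, c: abs(p[1] - c[1]),
)
```

Here the search selects `a2`, `b1`, `c1`, the sequence with the smallest accumulated pitch mismatch. A duration term could be added to `target_cost` when the search is repeated in the third step, although the abstract does not say how that cost is formed.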
6 Claims
1. A method of speaking rate conversion in a text-to-speech system, the method comprising:
a first step of extracting a vocal list from a synthesis DB (database), voicing the extracted vocal list in each of three speaking styles (fast speaking, normal speaking, and slow speaking), and building a probability distribution of the duration of each synthesis unit;
a second step of searching for an optimal synthesis unit candidate row using a Viterbi search in response to a synthesis request, and creating a target duration parameter for a synthesis unit; and
a third step of obtaining an optimal synthesis unit candidate row again, using the duration parameter of the optimal synthesis unit candidate row, and generating a synthesized sound.
Dependent claims: 2, 3, 4, 5, 6
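Claim 1 does not state how the target duration parameter is derived or how it enters the repeated search. The sketch below shows one possible reading, under loudly stated assumptions: each style's duration distribution is summarized as a `(mean_ms, std_ms)` pair, the first-pass duration is mapped to the requested style by preserving its z-score, and the second search penalizes candidates by their absolute distance from that target. The names `target_duration` and `duration_cost` are hypothetical.

```python
def target_duration(duration_ms, normal_model, style_model):
    """Map a first-pass duration to a target duration for the requested style.

    Assumption (not from the patent): keep the unit's z-score under the
    normal-style model and re-express it under the requested style's model.
    Each model is a (mean_ms, std_ms) pair.
    """
    mu_n, sigma_n = normal_model
    mu_s, sigma_s = style_model
    z = (duration_ms - mu_n) / sigma_n if sigma_n > 0 else 0.0
    return mu_s + z * sigma_s


def duration_cost(candidate_ms, target_ms, weight=1.0):
    """Hypothetical extra cost term for the second search pass."""
    return weight * abs(candidate_ms - target_ms)


# Toy models for one synthesis unit.
normal = (100.0, 10.0)
slow = (150.0, 20.0)
fast = (70.0, 10.0)

slow_target = target_duration(110.0, normal, slow)  # one sigma above the mean
fast_target = target_duration(110.0, normal, fast)
```

With these toy numbers a unit lasting 110 ms in the first pass gets a longer target for slow speaking and a shorter one for fast speaking, and `duration_cost` then favors candidates whose natural duration is close to that target.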
Specification