METHOD AND APPARATUS FOR TRAINING A PROSODY STATISTIC MODEL AND PROSODY PARSING, METHOD AND SYSTEM FOR TEXT TO SPEECH SYNTHESIS
Abstract
The present invention provides a method and apparatus for training a prosody statistic model and for prosody parsing, and a method and system for text to speech synthesis. The method for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation comprises: transforming said plurality of sentences in said raw corpus into a plurality of token sequences respectively; counting a frequency for each adjacent token pair occurring in said plurality of token sequences and frequencies of punctuation that represents a pause occurring at associated positions of said each token pair; calculating pause probabilities at said associated positions of said each token pair; and constructing said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof. With the present invention, a prosody statistic model can be trained from a raw corpus without manually added prosody parsing tags, and the trained model can then be used for prosody parsing and, further, for voice synthesis.
36 Claims
1. A method for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation, comprising:
transforming said plurality of sentences in said raw corpus into a plurality of token sequences respectively;
counting a frequency for each adjacent token pair occurring in said plurality of token sequences and frequencies of punctuation that represents a pause occurring at associated positions of said each token pair;
calculating pause probabilities at said associated positions of said each token pair; and
constructing said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
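The four steps of claim 1 (tokenize, count, compute probabilities, build the model) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the "associated positions" are taken to be before, in the middle of, and after each token pair; the set of pause-bearing punctuation marks in `PAUSE_PUNCT` is hypothetical; tokens are lowercased words; and each pause probability is the pause count at a position divided by the pair frequency. The names `tokenize` and `train_pause_model` are likewise invented here.

```python
import re
from collections import Counter, defaultdict

# Assumption: these marks are treated as "punctuation that represents a pause".
PAUSE_PUNCT = set(",.;:!?")
POSITIONS = ("before", "middle", "after")  # assumed "associated positions"


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def train_pause_model(sentences):
    """Map each adjacent word pair to its pause probabilities per position."""
    pair_freq = Counter()
    pause_freq = defaultdict(Counter)

    for sentence in sentences:
        words = []        # word tokens, in order, punctuation removed
        pause_after = []  # pause_after[i] is True if a pause mark follows words[i]
        for tok in tokenize(sentence):
            if tok in PAUSE_PUNCT:
                if words:
                    pause_after[-1] = True
            elif re.match(r"\w", tok):
                words.append(tok.lower())
                pause_after.append(False)
            # any other punctuation (quotes, brackets) is ignored in this sketch

        for i in range(len(words) - 1):
            pair = (words[i], words[i + 1])
            pair_freq[pair] += 1
            if i > 0 and pause_after[i - 1]:
                pause_freq[pair]["before"] += 1
            if pause_after[i]:
                pause_freq[pair]["middle"] += 1
            if pause_after[i + 1]:
                pause_freq[pair]["after"] += 1

    # Relative frequency of a pause at each position, per token pair.
    return {
        pair: {pos: pause_freq[pair][pos] / n for pos in POSITIONS}
        for pair, n in pair_freq.items()
    }


corpus = ["I came, I saw, I conquered.", "I came home."]
model = train_pause_model(corpus)
print(model[("i", "came")])  # {'before': 0.0, 'middle': 0.0, 'after': 0.5}
```

In this toy corpus the pair ("i", "came") occurs twice and is followed by a comma once, so its "after" probability is 0.5. No manually added prosody tags are needed, since the punctuation already present in the raw text supplies the pause evidence.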
19. An apparatus for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation, comprising:
a tokenization unit configured to transform said plurality of sentences in said raw corpus into a plurality of token sequences respectively;
a counter configured to count a frequency for each adjacent token pair occurring in said plurality of token sequences and frequencies of punctuation that represents a pause occurring at associated positions of said each token pair;
a pause probability calculator configured to calculate pause probabilities at said associated positions of said each token pair; and
a prosody statistic model constructor configured to construct said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
28. An apparatus for prosody parsing, comprising:
a text input unit configured to input a text for prosody parsing, which includes at least one sentence;
a tokenization unit configured to transform the sentence into a token sequence;
a pause weight calculator configured to calculate a pause weight for each pause position in said token sequence based on a prosody statistic model that is trained from a raw corpus and includes a plurality of token pairs and pause probabilities at associated positions of each said plurality of token pairs; and
a pause tag setting unit configured to select at least one pause position to insert a pause tag according to said calculated pause weight for each pause position.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36
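The pause weight calculator and pause tag setting unit of claim 28 can be sketched in the same hedged way. The claim does not say how a weight is derived from the model, so the rule below is an assumption: the weight of a gap is the mean of the available probabilities that bear on it (the "middle" value of the pair spanning the gap, the "after" value of the preceding pair, and the "before" value of the following pair). The threshold, the `<pause>` tag string, the fallback to the single best gap, and the names `pause_weights`, `insert_pause_tags` and `TOY_MODEL` are also hypothetical.

```python
# Hypothetical trained model: token pair -> pause probability per position.
TOY_MODEL = {
    ("i", "came"): {"before": 0.0, "middle": 0.0, "after": 0.5},
    ("came", "i"): {"before": 0.0, "middle": 1.0, "after": 0.0},
    ("i", "saw"): {"before": 1.0, "middle": 0.0, "after": 1.0},
}


def pause_weights(words, model):
    """Return one weight per gap between words[i] and words[i + 1]."""
    weights = []
    for i in range(len(words) - 1):
        evidence = []
        spanning = model.get((words[i], words[i + 1]))
        if spanning is not None:
            evidence.append(spanning["middle"])
        if i > 0:
            previous = model.get((words[i - 1], words[i]))
            if previous is not None:
                evidence.append(previous["after"])
        if i + 2 < len(words):
            following = model.get((words[i + 1], words[i + 2]))
            if following is not None:
                evidence.append(following["before"])
        # Assumed combination rule: plain average; unseen context gives 0.0.
        weights.append(sum(evidence) / len(evidence) if evidence else 0.0)
    return weights


def insert_pause_tags(words, model, threshold=0.5, tag="<pause>"):
    """Insert a tag after every word whose following gap passes the threshold."""
    weights = pause_weights(words, model)
    chosen = {i for i, w in enumerate(weights) if w >= threshold}
    if not chosen and weights:
        # Keep at least one pause position: fall back to the heaviest gap.
        chosen = {max(range(len(weights)), key=weights.__getitem__)}
    tagged = []
    for i, word in enumerate(words):
        tagged.append(word)
        if i in chosen:
            tagged.append(tag)
    return tagged


print(insert_pause_tags(["i", "came", "i", "saw"], TOY_MODEL))
# ['i', 'came', '<pause>', 'i', 'saw']
```

Averaging is only one plausible way to combine the three probabilities. A product, a maximum, or a weighted sum would fit the claim language equally well, and the selection step could rank gaps instead of using a fixed threshold.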
Specification