Method and apparatus for training a prosody statistic model and prosody parsing, method and system for text to speech synthesis
First Claim
1. A method for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation, comprising:
- transforming said plurality of sentences in said raw corpus into a plurality of token sequences respectively;
- counting frequency of each adjacent token pair occurring in said plurality of token sequences and frequency of punctuation that represents a pause occurring at associated positions of said each token pair;
- calculating pause probabilities at said associated positions of said each token pair, based on the frequency of each adjacent token pair and the frequency of punctuation; and
- constructing said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof, wherein the transforming, the counting, the calculating and the constructing are executed by a computer.
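The claim states only that each pause probability is calculated from the two frequencies; it does not give a formula at this point. One natural reading, offered here as an assumption rather than as the claimed method, is a relative-frequency estimate. The symbols C, C_p and p are illustrative and do not appear in the claim: C is the number of times the adjacent token pair occurs in the token sequences, and C_p is the number of those occurrences in which pause-representing punctuation appears at associated position p.

```latex
\[
  P\bigl(\text{pause at } p \mid t_i, t_{i+1}\bigr)
  = \frac{C_p(t_i, t_{i+1})}{C(t_i, t_{i+1})},
  \qquad 0 \le C_p(t_i, t_{i+1}) \le C(t_i, t_{i+1}).
\]
```

Under this reading, a token pair that occurs 3 times in the corpus, with a comma at the given position in 2 of those occurrences, would receive a pause probability of 2/3 at that position.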
Abstract
The present invention provides a method and apparatus for training a prosody statistic model and for prosody parsing, and a method and system for text to speech synthesis. The method for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation comprises: transforming said plurality of sentences in said raw corpus into a plurality of token sequences respectively; counting a frequency for each adjacent token pair occurring in said plurality of token sequences and frequencies of punctuation that represents a pause occurring at associated positions of said each token pair; calculating pause probabilities at said associated positions of said each token pair; and constructing said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof. With the present invention, a prosody statistic model can be trained from a raw corpus without manually added prosody parsing tags. The prosody statistic model can then be used in prosody parsing and, further, in voice synthesis.
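The four training steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `PAUSE_MARKS`, `tokenize` and `train_pause_model` are hypothetical, the set of punctuation marks treated as pauses is a guess, only one associated position (between the two tokens of a pair) is modeled, and the probability is taken to be a simple relative frequency.

```python
import re
from collections import Counter

# Assumption: these marks count as "punctuation that represents a pause".
PAUSE_MARKS = {",", ".", ";", ":", "!", "?"}


def tokenize(sentence):
    """Step 1: turn a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


def train_pause_model(sentences):
    """Steps 2-4: count pairs and pauses, then map each pair to a probability."""
    pair_counts = Counter()   # occurrences of each adjacent word pair
    pause_counts = Counter()  # occurrences with a pause mark between the pair
    for sentence in sentences:
        words = []        # word tokens in order of appearance
        pause_after = []  # pause_after[i] is True if a pause mark follows words[i]
        for tok in tokenize(sentence):
            if tok in PAUSE_MARKS:
                if words:
                    pause_after[-1] = True
            elif re.fullmatch(r"\w+", tok):
                words.append(tok)
                pause_after.append(False)
            # Any other symbol (quotes, brackets) is ignored in this sketch.
        for i in range(len(words) - 1):
            pair = (words[i], words[i + 1])
            pair_counts[pair] += 1
            if pause_after[i]:
                pause_counts[pair] += 1
    # The "model" here is simply a dict: token pair -> pause probability.
    return {pair: pause_counts[pair] / n for pair, n in pair_counts.items()}


corpus = ["Yes, we can.", "Yes we can!", "Well, yes, we can."]
model = train_pause_model(corpus)
for pair, prob in sorted(model.items()):
    print(pair, round(prob, 3))
```

In this toy corpus the pair ("yes", "we") occurs three times and is separated by a comma twice, so it receives a pause probability of 2/3, while ("we", "can") is never separated and receives 0. No manual prosody tags are needed because the punctuation already present in the raw text serves as the pause signal.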
35 Claims
1. A method for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation, comprising:
- transforming said plurality of sentences in said raw corpus into a plurality of token sequences respectively;
- counting frequency of each adjacent token pair occurring in said plurality of token sequences and frequency of punctuation that represents a pause occurring at associated positions of said each token pair;
- calculating pause probabilities at said associated positions of said each token pair, based on the frequency of each adjacent token pair and the frequency of punctuation; and
- constructing said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof, wherein the transforming, the counting, the calculating and the constructing are executed by a computer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. An apparatus for training a prosody statistic model with a raw corpus that includes a plurality of sentences with punctuation, comprising:
- a tokenization unit configured to transform said plurality of sentences in said raw corpus into a plurality of token sequences respectively;
- a counter configured to count frequency of each adjacent token pair occurring in said plurality of token sequences and frequency of punctuation that represents a pause occurring at associated positions of said each token pair;
- a pause probability calculator configured to calculate pause probabilities at said associated positions of said each token pair, based on the frequency of each adjacent token pair and the frequency of punctuation; and
- a prosody statistic model constructor configured to construct said prosody statistic model based on said token pairs and said pause probabilities at associated positions thereof.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35
Specification