Speech synthesis with prosodic phrase boundary information
Abstract
Text-to-speech conversion uses pattern-matching to predict the position of phrase boundaries in spoken output. Input text is analyzed to identify groups of words (known as “chunks”) which are unlikely to contain internal phrase boundaries. Both the chunks and the individual words are labeled with their syntactic characteristics. A database of sentences is then consulted; it contains the same kind of syntactic labels, together with indications of where a human reader would insert minor and major phrase boundaries. The parts of the database with the most similar syntactic characteristics are found, and phrase boundaries are predicted from the phrase boundaries found in those parts. Other characteristics may also be used in the pattern-matching process.
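The matching idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the label set, the tiny `REFERENCE_DB`, the window width and the mismatch-count similarity are all assumptions made for the example, and the chunking step is omitted. Each reference word carries a syntactic label and the boundary that follows it (0 for none, 1 for minor, 2 for major).

```python
# Hypothetical reference database: each sentence is a list of
# (syntactic_label, boundary_after) pairs, where boundary_after is
# 0 (no boundary), 1 (minor boundary) or 2 (major boundary).
REFERENCE_DB = [
    [("DET", 0), ("NOUN", 1), ("VERB", 0), ("DET", 0), ("NOUN", 2)],
    [("PRON", 0), ("VERB", 0), ("ADV", 1), ("CONJ", 0), ("PRON", 0), ("VERB", 2)],
]

PAD = "<pad>"  # filler label used beyond the sentence edges


def context(labels, i, width=2):
    """Labels of the `width` words up to and including word i,
    followed by the `width` words after it."""
    padded = [PAD] * width + list(labels) + [PAD] * width
    return tuple(padded[i + 1 : i + 1 + 2 * width])


def mismatch(a, b):
    """Number of positions at which two contexts differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_boundaries(input_labels, db=REFERENCE_DB, width=2):
    """For each input word, copy the boundary that follows the most
    similar word context found anywhere in the database."""
    examples = []
    for sentence in db:
        labels = [label for label, _ in sentence]
        for i, (_, boundary) in enumerate(sentence):
            examples.append((context(labels, i, width), boundary))

    predictions = []
    for i in range(len(input_labels)):
        ctx = context(input_labels, i, width)
        # Ties go to the earliest database entry (an arbitrary choice).
        _, boundary = min(examples, key=lambda ex: mismatch(ctx, ex[0]))
        predictions.append(boundary)
    return predictions


print(predict_boundaries(["DET", "NOUN", "VERB", "DET", "NOUN"]))
# → [0, 1, 0, 0, 2]
```

The input here has the same label sequence as the first database sentence, so every context matches exactly and the stored boundaries are copied over. A realistic system would need a far larger database and a less arbitrary way of resolving ties.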
10 Claims
1. A method of converting text to speech, said method comprising:
receiving an input word sequence in the form of text;
comparing said input word sequence with each one of a plurality of reference word sequences, said plurality of reference word sequences including prosodic phrase boundary information;
identifying one or more reference word sequences which most closely match said input word sequence; and
predicting prosodic phrase boundaries for a synthesized spoken version of the input text on the basis of the prosodic phrase boundary information included with said one or more most closely matching reference word sequences.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
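The four steps of claim 1 (receive, compare, identify, predict) can be illustrated at the level of whole word sequences. The sketch below is only one possible reading: it swaps in the similarity ratio from Python's standard `difflib.SequenceMatcher` as the closeness measure, which the claim does not specify, and the `REFERENCES` store, the function names and the majority-vote rule are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical reference store: (words, boundary_after), where
# boundary_after[i] is True if a phrase boundary follows words[i].
REFERENCES = [
    (["after", "the", "rain", "stopped", "we", "went", "out"],
     [False, False, False, True, False, False, True]),
    (["the", "cat", "sat", "on", "the", "mat"],
     [False, True, False, False, False, True]),
]


def closest_references(words, references, k=1):
    """Compare the input with every reference and keep the k best."""
    return sorted(
        references,
        key=lambda ref: SequenceMatcher(None, words, ref[0]).ratio(),
        reverse=True,
    )[:k]


def predict_from_references(words, references=REFERENCES, k=1):
    """Transfer boundaries from the closest references onto the input,
    position by position, through the aligned (matching) words."""
    selected = closest_references(words, references, k)
    votes = [0] * len(words)
    for ref_words, ref_bounds in selected:
        matcher = SequenceMatcher(None, words, ref_words)
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                if ref_bounds[block.b + offset]:
                    votes[block.a + offset] += 1
    # Predict a boundary where more than half of the selected references agree.
    return [2 * v > len(selected) for v in votes]


text = ["after", "the", "snow", "stopped", "they", "went", "out"]
print(predict_from_references(text))
# → [False, False, False, True, False, False, True]
```

Matching on literal words, as done here, only transfers a boundary when the word before it also appears in the reference. The abstract suggests that comparing syntactic labels rather than surface words generalizes better.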
9. A text to speech conversion apparatus comprising:
a word sequence store storing a plurality of reference word sequences, said plurality of reference word sequences including prosodic phrase boundary information;
a program store storing a program;
a processor in communication with said program store and said word sequence store;
means for receiving an input word sequence in the form of text;
wherein said program is executable to control said processor to:
compare said input word sequence with each one of a plurality of said reference word sequences;
identify one or more reference word sequences which most closely match said input word sequence; and
derive prosodic phrase boundary information for the input text on the basis of the prosodic phrase boundary information included with said one or more most closely matching reference word sequences.
10. A text to speech conversion apparatus comprising:
receiving means arranged in operation to receive an input word sequence in the form of text;
a word sequence store storing a plurality of reference word sequences, said plurality of reference word sequences including prosodic phrase boundary information;
comparison means arranged in operation to compare said input text with each one of a plurality of said reference word sequences;
identification means arranged in operation to identify one or more reference word sequences which most closely match said input word sequence; and
prosodic phrase boundary prediction means arranged in operation to predict prosodic phrase boundaries for the input text on the basis of the prosodic phrase boundary information included with said one or more most closely matching reference word sequences.
Specification