Method, apparatus and computer program product providing prosodic-categorical enhancement to phrase-spliced text-to-speech synthesis
Abstract
Disclosed is a method, a system and a computer program product for text-to-speech synthesis. The computer program product comprises a computer useable medium including a computer readable program, where the computer readable program when executed on the computer causes the computer to operate in accordance with a text-to-speech synthesis function by operations that include, responsive to at least one phrase represented as recorded human speech to be employed in synthesizing speech, labeling the phrase according to a symbolic categorization of prosodic phenomena; and constructing a data structure that includes word/prosody-categories and word/prosody-category sequences for the phrase, and that further includes information pertaining to a phone sequence associated with the constituent word or word sequence for the phrase.
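The data structure described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the patent's implementation: the category symbols (`ACCENTED`, `PHRASE_FINAL`), the ARPAbet-style phone labels, the `PhraseTable` class, and the choice to index every contiguous word sub-sequence are all hypothetical, since the source does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A word paired with its symbolic prosodic category (category names are hypothetical).
Token = Tuple[str, str]


@dataclass
class PhraseTable:
    """Maps word/prosody-category sequences to their phone sequences."""
    entries: Dict[Tuple[Token, ...], List[str]] = field(default_factory=dict)

    def add_phrase(self, labeled_words: List[Token],
                   phones_per_word: List[List[str]]) -> None:
        """Index every contiguous word sub-sequence of one labeled recorded phrase.

        Indexing all sub-sequences is an assumption made for this sketch.
        """
        n = len(labeled_words)
        for start in range(n):
            for end in range(start + 1, n + 1):
                key = tuple(labeled_words[start:end])
                phones = [p for word in phones_per_word[start:end] for p in word]
                # Keep the first recording seen for a given key.
                self.entries.setdefault(key, phones)


# Illustrative data only: one recorded phrase with invented labels and phones.
table = PhraseTable()
table.add_phrase(
    [("flight", "ACCENTED"), ("canceled", "PHRASE_FINAL")],
    [["F", "L", "AY", "T"], ["K", "AE", "N", "S", "AH", "L", "D"]],
)
```

Keying on the (word, category) pair rather than on the word alone means the same word recorded with a different prosodic label occupies a separate entry, which is the property the abstract's "word/prosody-category" wording suggests.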
20 Claims
1. A computer program product comprising a computer useable medium including a computer readable program, wherein the computer readable program when executed on the computer causes the computer to operate in accordance with a text-to-speech synthesis function by operations comprising:
labeling a phrase according to a symbolic categorization of prosodic phenomena; and
constructing a data structure that comprises word/prosody-categories and word/prosody-category sequences for the phrase, and that further provides a phone sequence associated with the phrase.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. A text-to-speech synthesis system comprising:
means, responsive to at least one phrase represented as recorded human speech to be employed in synthesizing speech, for labeling a constituent word or word sequence of the phrase according to a symbolic categorization of prosodic phenomena; and
means for constructing a data structure comprising word/prosody-categories and word/prosody-category sequences for the phrase, and that further comprises information pertaining to a phone sequence associated with the constituent word or word sequence for the phrase.
Dependent claims: 13, 14, 15, 16, 17, 18.
19. A method to operate a text-to-speech synthesis system, comprising:
responsive to at least one phrase represented as recorded human speech to be employed in synthesizing speech, labeling the phrase in accordance with a symbolic categorization of prosodic phenomena;
constructing a data structure that comprises word/prosody-categories and word/prosody-category sequences for the phrase, and that further includes information pertaining to a phone sequence associated with the constituent word or word sequence for the phrase;
responsive to input text to be converted to speech, labeling phrases of the input text with a target prosodic category;
comparing the input text to data in the data structure to identify an occurrence of a phrase labeled with prosody categories corresponding to the input text for constructing a phone sequence; and
constructing output speech according to the phone sequence, where if comparing the input text to data in the data structure does not identify an occurrence of a phrase, obtaining instead a phonetic or sub-phonetic representation.
Dependent claim: 20.
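The lookup-with-fallback flow recited in claim 19 can be sketched as follows. This is a hedged illustration, not the claimed method itself: the greedy longest-match search, the function name `build_phone_sequence`, the category symbols, the ARPAbet-style phones, and the per-word fallback lexicon are all assumptions introduced here, because the claim does not state a search strategy or the form of the fallback representation.

```python
from typing import Dict, List, Tuple

# A word paired with its target prosodic category (category names are hypothetical).
Token = Tuple[str, str]


def build_phone_sequence(
    target: List[Token],
    phrase_table: Dict[Tuple[Token, ...], List[str]],
    fallback_phones: Dict[str, List[str]],
) -> List[Tuple[str, List[str]]]:
    """Cover the labeled input with stored phrases, falling back word by word.

    Returns (source, phones) segments, where source is "phrase" or "fallback".
    Greedy longest-match is an assumed strategy for this sketch.
    """
    segments: List[Tuple[str, List[str]]] = []
    i = 0
    while i < len(target):
        # Try the longest candidate first, shrinking until a stored phrase matches.
        for j in range(len(target), i, -1):
            key = tuple(target[i:j])
            if key in phrase_table:
                segments.append(("phrase", phrase_table[key]))
                i = j
                break
        else:
            # No stored phrase carries this word with the target category:
            # use a phonetic representation for the single word instead.
            word = target[i][0]
            segments.append(("fallback", fallback_phones.get(word, [])))
            i += 1
    return segments


# Illustrative data only.
phrase_table = {
    (("your", "UNACCENTED"), ("flight", "ACCENTED")):
        ["Y", "AO", "R", "F", "L", "AY", "T"],
}
fallback_phones = {"departs": ["D", "IH", "P", "AA", "R", "T", "S"]}

result = build_phone_sequence(
    [("your", "UNACCENTED"), ("flight", "ACCENTED"), ("departs", "PHRASE_FINAL")],
    phrase_table,
    fallback_phones,
)
```

In this example the first two words match a stored phrase with the requested categories, so `result` holds one "phrase" segment followed by one "fallback" segment for "departs". A word that is stored only under a different prosodic category also takes the fallback path, because the match is on the word and category together.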
Specification