PERSONALIZED TEXT-TO-SPEECH SYNTHESIS AND PERSONALIZED SPEECH FEATURE EXTRACTION
First Claim
1. A personalized text-to-speech synthesizing device, comprising:
a personalized speech feature library creator, configured to recognize personalized speech features of a specific speaker by comparing a random speech fragment of the specific speaker with preset keywords, thereby to create a personalized speech feature library associated with the specific speaker, and store the personalized speech feature library in association with the specific speaker; and
a text-to-speech synthesizer, configured to perform a speech synthesis of a text message from the specific speaker, based on the personalized speech feature library associated with the specific speaker and created by the personalized speech feature library creator, thereby to generate and output a speech fragment having pronunciation characteristics of the specific speaker.
Abstract
A personalized text-to-speech synthesizing device includes: a personalized speech feature library creator, configured to recognize personalized speech features of a specific speaker by comparing a random speech fragment of the specific speaker with preset keywords, thereby to create a personalized speech feature library associated with the specific speaker, and store the personalized speech feature library in association with the specific speaker; and a text-to-speech synthesizer, configured to perform a speech synthesis of a text message from the specific speaker, based on the personalized speech feature library associated with the specific speaker and created by the personalized speech feature library creator, thereby to generate and output a speech fragment having pronunciation characteristics of the specific speaker. A personalized speech feature library of a specific speaker is established without a deliberate training process, and a text is synthesized into personalized speech with the speech characteristics of the speaker.
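The two components named in the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: the class names (`FeatureLibrary`, `LibraryCreator`, `Synthesizer`), the reduction of a pronunciation to a (pitch, duration) pair, the use of ratios as "speech features", and the reuse of the keyword table as the synthesis lexicon are all hypothetical. Speech recognition and waveform generation are omitted; a speech fragment is assumed to arrive already segmented into words with measured values.

```python
from dataclasses import dataclass, field

# Assumption: a pronunciation is reduced to (pitch in Hz, duration in seconds).
Pronunciation = tuple[float, float]


@dataclass
class FeatureLibrary:
    """Per-speaker library: keyword -> (pitch ratio, duration ratio)."""
    speaker: str
    ratios: dict[str, tuple[float, float]] = field(default_factory=dict)


class LibraryCreator:
    """Builds libraries by comparing ordinary speech with preset keywords."""

    def __init__(self, keywords: dict[str, Pronunciation]):
        self.keywords = keywords  # preset keyword -> standard pronunciation
        self.libraries: dict[str, FeatureLibrary] = {}

    def observe(self, speaker: str,
                fragment: list[tuple[str, Pronunciation]]) -> FeatureLibrary:
        # No dedicated training session: any fragment of speech can be passed in.
        library = self.libraries.setdefault(speaker, FeatureLibrary(speaker))
        for word, (pitch, duration) in fragment:
            if word in self.keywords:  # only preset keywords contribute features
                std_pitch, std_duration = self.keywords[word]
                library.ratios[word] = (pitch / std_pitch, duration / std_duration)
        return library


class Synthesizer:
    """Turns text into a per-word plan shaped by the sender's library."""

    def __init__(self, lexicon: dict[str, Pronunciation],
                 libraries: dict[str, FeatureLibrary]):
        self.lexicon = lexicon      # standard pronunciation for every known word
        self.libraries = libraries  # libraries stored per speaker

    def synthesize(self, speaker: str, text: str) -> list[tuple[str, float, float]]:
        library = self.libraries.get(speaker)
        plan = []
        for word in text.lower().split():
            std_pitch, std_duration = self.lexicon[word]  # KeyError if word unknown
            ratios = library.ratios if library else {}
            pitch_ratio, duration_ratio = ratios.get(word, (1.0, 1.0))
            plan.append((word, std_pitch * pitch_ratio, std_duration * duration_ratio))
        return plan


# Usage: "alice" speaks freely; later her text message is rendered in her style.
lexicon = {"hello": (200.0, 0.5), "world": (180.0, 0.4)}
creator = LibraryCreator(keywords=lexicon)
creator.observe("alice", [("hello", (250.0, 0.75)), ("there", (190.0, 0.3))])
tts = Synthesizer(lexicon, creator.libraries)
plan = tts.synthesize("alice", "hello world")
```

In this sketch, words for which the speaker has no recorded feature fall back to the standard pronunciation; a real system would presumably generalize features across words rather than store them per keyword.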
37 Claims
1. A personalized text-to-speech synthesizing device, comprising:
a personalized speech feature library creator, configured to recognize personalized speech features of a specific speaker by comparing a random speech fragment of the specific speaker with preset keywords, thereby to create a personalized speech feature library associated with the specific speaker, and store the personalized speech feature library in association with the specific speaker; and
a text-to-speech synthesizer, configured to perform a speech synthesis of a text message from the specific speaker, based on the personalized speech feature library associated with the specific speaker and created by the personalized speech feature library creator, thereby to generate and output a speech fragment having pronunciation characteristics of the specific speaker.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 18, 19, 20, 21, 22, 23, 24, 25
9. A personalized text-to-speech synthesizing method, comprising:
presetting one or more keywords with respect to a specific language;
receiving a random speech fragment of a specific speaker;
recognizing personalized speech features of the specific speaker by comparing the received speech fragment of the specific speaker with the preset keywords, thereby creating a personalized speech feature library associated with the specific speaker, and storing the personalized speech feature library in association with the specific speaker; and
performing a speech synthesis of a text message from the specific speaker, based on the personalized speech feature library associated with the specific speaker, thereby generating and outputting a speech fragment having pronunciation characteristics of the specific speaker.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
26. A personalized speech feature extraction device, comprising:
a keyword setting unit, configured to set one or more keywords suitable for reflecting the pronunciation characteristics of a specific speaker with respect to a specific language, and store the keywords in association with the specific speaker;
a speech feature recognition unit, configured to recognize whether any keyword associated with the specific speaker occurs in a random speech fragment of the specific speaker, and when a keyword associated with the specific speaker is recognized as occurring in the speech fragment of the specific speaker, recognize the speech features of the specific speaker according to a standard pronunciation of the recognized keyword and the pronunciation of the speaker; and
a speech feature filtration unit, configured to filter out abnormal speech features through statistical analysis while retaining speech features reflecting the normal pronunciation characteristics of the specific speaker, when the speech features of the specific speaker recognized by the speech feature recognition unit reach a predetermined number, thereby to create a personalized speech feature library associated with the specific speaker, and store the personalized speech feature library in association with the specific speaker.
Dependent claims: 27, 28, 29, 30, 31
32. A personalized speech feature extraction method, comprising:
setting one or more keywords suitable for reflecting the pronunciation characteristics of a specific speaker with respect to a specific language, and storing the keywords in association with the specific speaker;
recognizing whether any keyword associated with the specific speaker occurs in a random speech fragment of the specific speaker, and when a keyword associated with the specific speaker is recognized as occurring in the speech fragment of the specific speaker, recognizing the speech features of the specific speaker according to a standard pronunciation of the recognized keyword and the pronunciation of the speaker; and
filtering out abnormal speech features through statistical analysis while retaining speech features reflecting the normal pronunciation characteristics of the specific speaker, when the speech features of the specific speaker recognized by the speech feature recognition unit reach a predetermined number, thereby creating a personalized speech feature library associated with the specific speaker, and storing the personalized speech feature library in association with the specific speaker.
Dependent claims: 33, 34, 35, 36, 37
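The filtering step recited in claims 26 and 32 can be illustrated with a short sketch. The claims say only that abnormal features are filtered out "through statistical analysis" once a predetermined number of features has been collected; the specific statistic used here (median absolute deviation), the function name `filter_features`, the parameters `min_count` and `max_dev`, and the scalar feature representation are assumptions made for illustration.

```python
import statistics


def filter_features(samples, min_count=5, max_dev=3.0):
    """Drop outlying samples of one scalar speech feature.

    Returns None until at least `min_count` samples exist (the
    "predetermined number"), otherwise the list of retained samples.
    Assumption: a sample is abnormal when it lies more than `max_dev`
    median absolute deviations away from the median.
    """
    if len(samples) < min_count:
        return None  # not enough evidence yet; keep collecting
    center = statistics.median(samples)
    mad = statistics.median(abs(s - center) for s in samples)
    if mad == 0:
        return list(samples)  # no spread estimate available; keep everything
    return [s for s in samples if abs(s - center) <= max_dev * mad]


# Usage: duration ratios measured for one keyword; 3.0 might come from a
# cough or a misrecognized word rather than the speaker's normal speech.
observed = [1.0, 1.1, 0.9, 1.0, 3.0]
retained = filter_features(observed)
library_value = statistics.median(retained)  # value stored in the library
```

A median-based rule is used here instead of a mean and standard deviation because, with only a handful of samples, a single extreme value inflates the standard deviation enough to hide itself.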
Specification