SPEECH SYNTHESIS DICTIONARY GENERATION APPARATUS, SPEECH SYNTHESIS DICTIONARY GENERATION METHOD AND COMPUTER PROGRAM PRODUCT
Abstract
According to an embodiment, a speech synthesis dictionary generation apparatus includes an analyzer, a speaker adapter, a level designation unit, and a determination unit. The analyzer is configured to analyze speech data and generate a speech database containing characteristics of utterance by an object speaker. The speaker adapter is configured to generate the model of the object speaker by speaker adaptation of converting a base model to be closer to characteristics of the object speaker based on the database. The level designation unit is configured to accept designation of a target speaker level representing a speaker's utterance skill and/or a speaker's native level in a language of the speech synthesis dictionary. The determination unit is configured to determine a parameter related to fidelity of reproduction of speaker properties in the speaker adaptation, in accordance with a relationship between the target speaker level and a speaker level of the object speaker.
15 Citations
11 Claims
1. A speech synthesis dictionary generation apparatus for generating a speech synthesis dictionary containing a model of an object speaker based on speech data of the object speaker, the apparatus comprising:
a speech analyzer configured to analyze the speech data and generate a speech database containing data representing characteristics of utterance by the object speaker;
a speaker adapter configured to generate the model of the object speaker by performing speaker adaptation of converting a predetermined base model to be closer to characteristics of the object speaker based on the speech database;
a target speaker level designation unit configured to accept designation of a target speaker level that is a speaker level to be targeted, the speaker level representing at least one of a speaker's utterance skill and a speaker's native level in a language of the speech synthesis dictionary; and
a determination unit configured to determine a value of a parameter related to fidelity of reproduction of speaker properties in the speaker adaptation, in accordance with a relationship between the designated target speaker level and an object speaker level that is the speaker level of the object speaker, wherein
the determination unit is configured to determine the value of the parameter so that the fidelity is lower when the designated target speaker level is higher than the object speaker level, compared to when the designated target speaker level is not higher than the object speaker level, and
the speaker adapter is configured to perform the speaker adaptation in accordance with the value of a parameter determined by the determination unit.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
10. A speech synthesis dictionary generation method executed in a speech synthesis dictionary generation apparatus for generating a speech synthesis dictionary containing a model of an object speaker based on speech data of the object speaker, the method comprising:
analyzing the speech data to generate a speech database containing data representing characteristics of utterance by the object speaker;
generating the model of the object speaker by performing speaker adaptation of converting a predetermined base model to be closer to characteristics of the object speaker based on the speech database;
accepting designation of a target speaker level that is a speaker level to be targeted, the speaker level representing at least one of a speaker's utterance skill and a speaker's native level in a language of the speech synthesis dictionary; and
determining a value of a parameter related to fidelity of reproduction of speaker properties in the speaker adaptation, in accordance with a relationship between the designated target speaker level and an object speaker level that is the speaker level of the object speaker, wherein
the determining includes determining the value of the parameter so that the fidelity is lower when the designated target speaker level is higher than the object speaker level, compared to when the designated target speaker level is not higher than the object speaker level, and
the generating includes performing the speaker adaptation in accordance with the value of a parameter determined at the determining.
11. A computer program product comprising a computer-readable medium containing a program for generating a speech synthesis dictionary containing a model of an object speaker based on speech data of the object speaker, the program causing a computer to execute:
analyzing the speech data to generate a speech database containing data representing characteristics of utterance by the object speaker;
generating the model of the object speaker by performing speaker adaptation of converting a predetermined base model to be closer to characteristics of the object speaker based on the speech database;
accepting designation of a target speaker level that is a speaker level to be targeted, the speaker level representing at least one of a speaker's utterance skill and a speaker's native level in a language of the speech synthesis dictionary; and
determining a value of a parameter related to fidelity of reproduction of speaker properties in the speaker adaptation, in accordance with a relationship between the designated target speaker level and an object speaker level that is the speaker level of the object speaker, wherein
the determining includes determining the value of the parameter so that the fidelity is lower when the designated target speaker level is higher than the object speaker level, compared to when the designated target speaker level is not higher than the object speaker level, and
the generating includes performing the speaker adaptation in accordance with the value of a parameter determined at the determining.
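The determining and adapting steps recited in claims 1, 10, and 11 can be illustrated with a minimal sketch. It is not the claimed implementation. The integer level scale, the function names `determine_fidelity` and `adapt`, the use of an interpolation weight as the fidelity parameter, and the linear decrease per level of gap are all assumptions made for illustration. The claims require only that fidelity be lower when the designated target speaker level is higher than the object speaker level.

```python
from typing import List, Sequence


def determine_fidelity(target_level: int, object_level: int,
                       max_fidelity: float = 1.0,
                       step: float = 0.2,
                       min_fidelity: float = 0.0) -> float:
    """Pick a fidelity value from the relationship between the two levels.

    Hypothetical rule: full fidelity when the target level is not higher
    than the object speaker level; otherwise fidelity drops linearly with
    the gap, clamped at min_fidelity.
    """
    gap = target_level - object_level
    if gap <= 0:
        return max_fidelity
    return max(min_fidelity, max_fidelity - step * gap)


def adapt(base_model: Sequence[float],
          speaker_estimate: Sequence[float],
          fidelity: float) -> List[float]:
    """Move base-model parameters toward the object speaker's estimate.

    Hypothetical stand-in for speaker adaptation: a per-parameter linear
    interpolation weighted by fidelity (1.0 reproduces the speaker,
    0.0 keeps the base model).
    """
    if len(base_model) != len(speaker_estimate):
        raise ValueError("parameter vectors must have the same length")
    return [(1.0 - fidelity) * b + fidelity * s
            for b, s in zip(base_model, speaker_estimate)]


if __name__ == "__main__":
    # Object speaker at level 3, user designates target level 5.
    fidelity = determine_fidelity(target_level=5, object_level=3)
    model = adapt([0.0, 2.0], [1.0, 4.0], fidelity)
    print(fidelity, model)
```

With these assumed defaults, designating a target level above the object speaker's level yields a smaller weight, so the adapted model stays closer to the base model and reproduces fewer of the speaker's own properties.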
Specification