TEXT-TO-SPEECH DEVICE, TEXT-TO-SPEECH METHOD, AND COMPUTER PROGRAM PRODUCT
First Claim
1. A text-to-speech device comprising:
a context acquirer configured to acquire a context sequence that is an information sequence affecting fluctuations in voice;
an acoustic model parameter acquirer configured to acquire an acoustic model parameter sequence corresponding to the context sequence, the acoustic model parameter sequence representing a standard speaking style of a target speaker;
a conversion parameter acquirer configured to acquire a conversion parameter sequence corresponding to the context sequence, the conversion parameter sequence being used in converting an acoustic model parameter in the standard speaking style into one in a speaking style different from the standard speaking style;
a converter configured to convert the acoustic model parameter sequence using the conversion parameter sequence; and
a waveform generator configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
Abstract
According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
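The data flow described in the abstract can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the lookup tables `STANDARD_MODEL` and `STYLE_CONVERSION`, the use of a log-F0 value and a frame count as the "acoustic model parameter", the scale-and-shift form of the conversion, and the sine-wave stand-in for the waveform generator. Only the five-stage structure (context, standard-style parameters, conversion parameters, conversion, waveform) follows the text.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AcousticParam:
    """Hypothetical acoustic model parameter for one context unit."""
    log_f0: float  # log fundamental frequency
    frames: int    # duration in frames


@dataclass(frozen=True)
class ConversionParam:
    """Hypothetical conversion parameter (scale-and-shift form is an assumption)."""
    f0_scale: float
    f0_shift: float
    duration_scale: float


# Toy tables standing in for trained models (assumption, not from the patent).
STANDARD_MODEL: Dict[str, AcousticParam] = {
    "a": AcousticParam(log_f0=math.log(120.0), frames=10),
    "i": AcousticParam(log_f0=math.log(130.0), frames=8),
}
STYLE_CONVERSION: Dict[str, ConversionParam] = {
    "a": ConversionParam(f0_scale=1.0, f0_shift=math.log(1.5), duration_scale=0.5),
    "i": ConversionParam(f0_scale=1.0, f0_shift=math.log(1.2), duration_scale=1.0),
}


def acquire_context(text: str) -> List[str]:
    """Context acquirer: here the context sequence is just the known symbols."""
    return [ch for ch in text if ch in STANDARD_MODEL]


def acquire_acoustic_params(context: List[str]) -> List[AcousticParam]:
    """Acoustic model parameter acquirer: standard speaking style."""
    return [STANDARD_MODEL[c] for c in context]


def acquire_conversion_params(context: List[str]) -> List[ConversionParam]:
    """Conversion parameter acquirer: one conversion per context unit."""
    return [STYLE_CONVERSION[c] for c in context]


def convert(params: List[AcousticParam],
            conversions: List[ConversionParam]) -> List[AcousticParam]:
    """Converter: apply each conversion to the matching standard-style parameter."""
    return [
        AcousticParam(
            log_f0=cv.f0_scale * p.log_f0 + cv.f0_shift,
            frames=max(1, round(p.frames * cv.duration_scale)),
        )
        for p, cv in zip(params, conversions)
    ]


def generate_waveform(params: List[AcousticParam],
                      frame_len: int = 80,
                      sample_rate: int = 16000) -> List[float]:
    """Waveform generator: a sine tone per unit, as a stand-in for a vocoder."""
    samples: List[float] = []
    phase = 0.0
    for p in params:
        freq = math.exp(p.log_f0)
        for _ in range(p.frames * frame_len):
            samples.append(math.sin(phase))
            phase += 2.0 * math.pi * freq / sample_rate
    return samples


def synthesize(text: str) -> List[float]:
    """Run the five stages in the order given in the abstract."""
    context = acquire_context(text)
    standard = acquire_acoustic_params(context)
    conversion = acquire_conversion_params(context)
    converted = convert(standard, conversion)
    return generate_waveform(converted)
```

In this sketch both the standard-style parameters and the conversion parameters are looked up from the same context sequence, which mirrors the abstract's statement that each sequence "corresponds to the context sequence". Calling `synthesize("ai")` returns a list of float samples for the converted style.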
14 Claims
1. A text-to-speech device comprising:
a context acquirer configured to acquire a context sequence that is an information sequence affecting fluctuations in voice;
an acoustic model parameter acquirer configured to acquire an acoustic model parameter sequence corresponding to the context sequence, the acoustic model parameter sequence representing a standard speaking style of a target speaker;
a conversion parameter acquirer configured to acquire a conversion parameter sequence corresponding to the context sequence, the conversion parameter sequence being used in converting an acoustic model parameter in the standard speaking style into one in a speaking style different from the standard speaking style;
a converter configured to convert the acoustic model parameter sequence using the conversion parameter sequence; and
a waveform generator configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A text-to-speech method comprising:
acquiring a context sequence that is an information sequence affecting fluctuations in voice;
acquiring an acoustic model parameter sequence corresponding to the context sequence, the acoustic model parameter sequence representing an acoustic model in a standard speaking style of a target speaker;
acquiring a conversion parameter sequence corresponding to the context sequence, the conversion parameter sequence being used in converting an acoustic model parameter in the standard speaking style into one in a speaking style different from the standard speaking style;
converting the acoustic model parameter sequence using the conversion parameter sequence; and
generating a voice signal based on the acoustic model parameter sequence acquired after conversion.
14. A computer program product comprising a computer-readable medium containing a program executed by a computer, the program causing the computer to execute:
acquiring a context sequence that is an information sequence affecting fluctuations in voice;
acquiring an acoustic model parameter sequence corresponding to the context sequence, the acoustic model parameter sequence representing an acoustic model in a standard speaking style of a target speaker;
acquiring a conversion parameter sequence corresponding to the context sequence, the conversion parameter sequence being used in converting an acoustic model parameter in the standard speaking style into one in a speaking style different from the standard speaking style;
converting the acoustic model parameter sequence using the conversion parameter sequence; and
generating a voice signal based on the acoustic model parameter sequence acquired after conversion.
Specification