Fundamental frequency pattern generation apparatus and fundamental frequency pattern generation method
Abstract
A fundamental frequency pattern generation apparatus includes a first storage unit including representative vectors each corresponding to a prosodic control unit and having a section for changing the number of phonemes, a second storage unit including a rule to select a vector corresponding to an input context, a selection unit configured to select a vector from the representative vectors by applying the rule to the context and output the selected vector, a calculation unit configured to calculate an expansion/contraction ratio of the section of the selected vector in a time-axis direction based on a designated value for a specific feature amount related to a length of a fundamental frequency pattern to be generated, the designated value of the feature amount being required of the fundamental frequency pattern to be generated, and an expansion/contraction unit configured to expand/contract the selected vector based on the expansion/contraction ratio to generate the fundamental frequency pattern.
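As an illustration of the ratio described in the abstract, one plausible reading (an assumption, since the abstract does not give a formula) is a simple quotient of the designated value and the corresponding value in the stored vector. The symbols $n$, $\hat{n}$ and $r$ are introduced here for illustration only.

```latex
% n       : value of the feature amount (e.g. number of phonemes) in the
%           expandable section of the selected representative vector
% \hat{n} : designated value required of the pattern to be generated
% r       : expansion/contraction ratio along the time axis
\[
  r = \frac{\hat{n}}{n},
  \qquad\text{e.g. } n = 2,\ \hat{n} = 4 \;\Rightarrow\; r = 2 .
\]
```

Under this reading, $r > 1$ expands the section and $r < 1$ contracts it.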
52 Citations
30 Claims
1. A fundamental frequency pattern generation apparatus comprising:
a computer apparatus comprising a non-transitory computer readable storage medium and a processor;
a first storage unit comprising the non-transitory computer readable storage medium storing a plurality of representative vectors each corresponding to a prosodic control unit and having a first section including a plurality of sample points and a section except for the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
a second storage unit comprising the non-transitory computer readable storage medium storing a rule to select a representative vector corresponding to an input context;
a selection unit configured to select the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and output the selected representative vector;
a calculation unit comprising the processor configured to calculate, using a mapping function, an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector based on first designated values for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the first designated values being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the first designated value; and
an expansion/contraction unit comprising the processor configured to expand/contract the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio, and then to expand/contract each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on second designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the second designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the second designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
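The calculation step and the first expansion/contraction step recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each phoneme of a representative vector holds a fixed number of F0 sample points (`k`), and it uses linear interpolation as the mapping function. The names `resample` and `expand_first_section` are hypothetical.

```python
def resample(values, new_len):
    """Linearly map `values` onto `new_len` points (assumed mapping function)."""
    if new_len <= 0 or not values:
        return []
    if len(values) == 1 or new_len == 1:
        return [float(values[0])] * new_len
    step = (len(values) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def expand_first_section(vector, start, end, designated, k=3):
    """Stretch or shrink the first section so it spans `designated` phonemes.

    vector     : flat list of F0 samples, k samples per phoneme (assumption)
    start, end : half-open phoneme index range [start, end) of the first section
    designated : required number of phonemes in that section
    """
    section = vector[start * k:end * k]
    ratio = designated / (end - start)          # expansion/contraction ratio
    new_len = round(len(section) * ratio)       # sample count after scaling
    scaled = resample(section, new_len)
    # Samples outside the first section are left untouched at this stage.
    return vector[:start * k] + scaled + vector[end * k:], ratio


# Four phonemes, three samples each; phonemes 1..2 form the first section.
unit = [100, 104, 108, 120, 126, 130, 128, 122, 116, 110, 106, 102]
pattern, ratio = expand_first_section(unit, start=1, end=3, designated=4)
```

Here the two-phoneme first section is expanded to four phonemes, so the ratio is 2.0 and only the samples inside that section change.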
13. A fundamental frequency pattern generation apparatus comprising:
a computer apparatus comprising a non-transitory computer readable storage medium and a processor;
a first storage unit comprising the non-transitory computer readable storage medium storing a plurality of representative vectors each corresponding to a prosodic control unit and having a first section and a section except the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
a second storage unit comprising the non-transitory computer readable storage medium storing a rule to select a representative vector corresponding to an input context;
a selection unit configured to select the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and output the selected representative vector;
a calculation unit comprising the processor configured to calculate an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector, based on a first designated value for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the first designated value being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the first designated value; and
an expansion/contraction unit comprising the processor configured to expand/contract the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio and then to expand/contract each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on second designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the second designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the second designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
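The second expansion/contraction step of claim 13, in which every phoneme is fitted to its designated duration, might look like the sketch below. It assumes the phoneme count has already been adjusted, that each phoneme holds `k` sample points, and that durations are given in output frames. Linear interpolation and the function names are assumptions made for illustration.

```python
def resample(values, new_len):
    """Linearly interpolate `values` onto `new_len` points (assumed method)."""
    if new_len <= 0 or not values:
        return []
    if len(values) == 1 or new_len == 1:
        return [float(values[0])] * new_len
    step = (len(values) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def fit_durations(vector, durations, k=3):
    """Expand/contract each phoneme of `vector` to its designated duration.

    vector    : flat list of F0 samples, k per phoneme (assumption)
    durations : designated length of each phoneme, in frames
    """
    if len(vector) != k * len(durations):
        raise ValueError("vector and durations disagree on the phoneme count")
    pattern = []
    for i, n_frames in enumerate(durations):
        phoneme = vector[i * k:(i + 1) * k]
        pattern.extend(resample(phoneme, n_frames))
    return pattern


# Two phonemes: the first is stretched to 5 frames, the second cut to 2.
f0 = fit_durations([100, 110, 120, 120, 110, 100], durations=[5, 2])
```

The resulting pattern has as many frames as the sum of the designated durations, which is the property the claim requires of the generated pattern.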
26. A fundamental frequency pattern generation method comprising:
storing in advance a plurality of representative vectors each corresponding to a prosodic control unit and having a first section and a section except the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
storing in advance a rule to select a representative vector corresponding to an input context;
selecting, via a computer processor, the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and outputting the selected representative vector;
calculating, via the computer processor, an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector, based on a designated value for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the designated value being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the designated value; and
expanding/contracting, via the computer processor, the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio, and then expanding/contracting each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
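The selecting step of claim 26 applies a stored rule to an input context. The claim does not say what form the rule takes, so the sketch below assumes an ordered list of predicates in which the first match wins. The `Context` fields, the codebook keys and the F0 values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical input context for one prosodic control unit."""
    accent_type: int        # 0 = unaccented (assumed encoding)
    num_phonemes: int
    is_phrase_final: bool


# Stored representative vectors (illustrative F0 values in Hz).
CODEBOOK = {
    "final_fall": [130, 125, 115, 100, 90, 85],
    "rise_fall": [100, 120, 135, 130, 115, 105],
    "flat": [110, 112, 112, 111, 110, 109],
}

# Stored rule: (predicate, codebook key) pairs, checked in order.
RULES = [
    (lambda c: c.is_phrase_final, "final_fall"),
    (lambda c: c.accent_type > 0, "rise_fall"),
    (lambda c: True, "flat"),   # fallback when nothing else applies
]


def select_vector(context):
    """Return the representative vector chosen by the first matching rule."""
    for predicate, key in RULES:
        if predicate(context):
            return CODEBOOK[key]
    raise LookupError("no rule matched the input context")


selected = select_vector(Context(accent_type=1, num_phonemes=5,
                                 is_phrase_final=False))
```

A decision tree over the same context attributes would serve equally well; the ordered list is used here only because it is short.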
27. A non-transitory computer readable storage medium storing instructions of a computer program which when executed by a computer results in performance of steps comprising:
storing in advance a plurality of representative vectors each corresponding to a prosodic control unit and having a first section and a section except the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
storing in advance a rule to select a representative vector corresponding to an input context;
selecting the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and outputting the selected representative vector;
calculating an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector, based on a designated value for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the designated value being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the designated value; and
expanding/contracting the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio, and then expanding/contracting each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
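The boundaries of the first section recited in claim 27 allow three start choices (the accent nucleus phoneme or one of the next two phonemes) and three end choices (the last phoneme of the prosodic control unit or one of the two before it). A small sketch of how those nine combinations map to phoneme indices follows. The function name, the 0-based indexing and the offset parameters are assumptions for illustration.

```python
def first_section_bounds(accent_nucleus, unit_len, start_offset=0, end_offset=0):
    """Return the half-open phoneme range [start, end) of the first section.

    accent_nucleus : 0-based index of the accent nucleus phoneme
    unit_len       : number of phonemes in the prosodic control unit
    start_offset   : 0 = nucleus, 1 = succeeding adjacent, 2 = succeeding second
    end_offset     : 0 = unit end, 1 = preceding adjacent, 2 = preceding second
    """
    if start_offset not in (0, 1, 2) or end_offset not in (0, 1, 2):
        raise ValueError("offsets must be 0, 1 or 2")
    start = accent_nucleus + start_offset
    last = unit_len - 1 - end_offset            # index of the final phoneme
    if start > last:
        raise ValueError("first section would be empty")
    return start, last + 1


# Nucleus at phoneme 2 of an 8-phoneme unit.
widest = first_section_bounds(2, 8)                       # nucleus to unit end
narrowest = first_section_bounds(2, 8, start_offset=2, end_offset=2)
```

The returned range can then be passed to whatever routine changes the number of phonemes in that section.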
28. A fundamental frequency pattern generation method comprising:
storing, in a non-transitory storage medium, a plurality of representative vectors each corresponding to a prosodic control unit and having a first section and a section except the first section, wherein the first section is a section of a representative vector;
storing, in a non-transitory storage medium, a rule to select a representative vector corresponding to an input context;
selecting, via a computer processor, the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and outputting the selected representative vector;
calculating, via the computer processor, an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector based on the selected representative vector such that the number of the phonemes included in the first section of the selected representative vector equals a designated value; and
expanding/contracting, via the computer processor, first the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio and then each of the phoneme durations of the phonemes.
29. A fundamental frequency pattern generation method comprising:
preparing in advance a first storage unit to store a plurality of representative vectors each corresponding to a prosodic control unit and having a first section including a plurality of sample points and a section except for the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
preparing in advance a second storage unit to store a rule to select a representative vector corresponding to an input context;
selecting, via a computer processor, the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and outputting the selected representative vector;
calculating, using a mapping function on the computer processor, an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector, based on a designated value for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the designated value being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the designated value; and
expanding/contracting, via the computer processor, the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio, and then expanding/contracting each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
30. A non-transitory computer readable storage medium storing instructions of a computer program which when executed by a computer results in performance of steps comprising:
preparing in advance a first storage unit to store a plurality of representative vectors each corresponding to a prosodic control unit and having a first section including a plurality of sample points and a section except for the first section, wherein the first section is a section of the representative vector, which starts with one of an accent nucleus phoneme, an accent nucleus succeeding adjacent phoneme, and an accent nucleus succeeding second phoneme and ends with one of a prosodic control unit end phoneme, a prosodic control unit end preceding adjacent phoneme, and a prosodic control unit end preceding second phoneme;
preparing in advance a second storage unit to store a rule to select a representative vector corresponding to an input context;
selecting the representative vector corresponding to the input context from the plurality of representative vectors by applying the rule to the input context and outputting the selected representative vector;
calculating, using a mapping function on the computer processor, an expansion/contraction ratio for a number of phonemes included in the first section of the selected representative vector, based on a designated value for a number of phonemes included in a first portion of a fundamental frequency pattern to be generated from the first section of the selected representative vector, the designated value being required for the fundamental frequency pattern to be generated, such that the number of the phonemes included in the first section of the selected representative vector equals the designated value; and
expanding/contracting, via the computer processor, the number of the phonemes included in the first section of the selected representative vector based on the expansion/contraction ratio, and then expanding/contracting each of the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted, based on designated values corresponding to phoneme durations of all phonemes included in all portions of the fundamental frequency pattern, the designated values being required for the fundamental frequency pattern to be generated, such that the phoneme durations of the phonemes included in all sections of the selected representative vector after the number of the phonemes included in the first section are expanded/contracted equal the designated values corresponding to the phoneme durations, to generate the fundamental frequency pattern.
Specification