Prosody generating device, prosody generating method, and program
Abstract
A prosody generation apparatus is provided that suppresses the distortion that occurs when prosodic patterns are generated and thereby generates natural prosody. A prosody changing point extraction unit in this apparatus extracts prosody changing points located at, for example, the beginning and the ending of a sentence, the beginning and the ending of a breath group, and an accent position. A selection rule and a transformation rule for prosodic patterns including the prosody changing points are generated by means of a statistical or learning technique, and the rules thus generated are stored beforehand in a representative prosodic pattern selection rule table and a transformation rule table. A pattern selection unit selects a representative prosodic pattern from the representative prosodic pattern selection rule table according to the selection rule. A prosody generation unit transforms the selected pattern according to the transformation rule and carries out interpolation with respect to portions other than the prosody changing points so as to generate prosody as a whole.
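As an illustration of the extraction step described in the abstract, the following minimal Python sketch marks sentence edges, breath-group edges, and accent positions as prosody changing points. The `Mora` record, its field names, and the function `set_changing_points` are hypothetical names introduced for this example only; the source does not specify a data layout.

```python
from dataclasses import dataclass


@dataclass
class Mora:
    # Hypothetical per-mora record; the field names are assumptions for this sketch.
    text: str
    breath_group_start: bool = False
    breath_group_end: bool = False
    accented: bool = False


def set_changing_points(morae):
    """Return the indices of morae treated as prosody changing points.

    A mora qualifies if it is at the beginning or ending of the sentence,
    at the beginning or ending of a breath group, or at an accent position.
    """
    points = []
    last = len(morae) - 1
    for i, m in enumerate(morae):
        at_sentence_edge = i == 0 or i == last
        if (at_sentence_edge or m.breath_group_start
                or m.breath_group_end or m.accented):
            points.append(i)
    return points


if __name__ == "__main__":
    sentence = [
        Mora("a"),
        Mora("b", accented=True),
        Mora("c"),
        Mora("d", breath_group_end=True),
        Mora("e", breath_group_start=True),
        Mora("f"),
        Mora("g"),
    ]
    print(set_changing_points(sentence))
```

Morae that are not returned here are the portions that the apparatus would later fill in by interpolation.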
60 Claims
1. A prosody generation apparatus that receives phonological information and linguistic information so as to generate prosody, the prosody generation apparatus being operable to refer to
(a) a representative prosodic pattern storage unit for accumulating beforehand representative prosodic patterns of portions of speech data, the portions including prosody changing points;
(b) a selection rule storage unit that stores a selection rule predetermined according to attributes concerning phonology or attributes concerning linguistic information of the portions of the speech data including the prosody changing points; and
(c) a transformation rule storage unit that stores a transformation rule predetermined according to attributes concerning the phonology or the linguistic information of the portions of the speech data including the prosody changing points;
comprising:
a prosody changing point setting unit that sets a prosody changing point according to at least any one of the received phonological information and the linguistic information;
a pattern selection unit that selects a representative prosodic pattern from the representative prosodic pattern storage unit according to the selection rule, based on the received phonological information and the linguistic information; and
a prosody generation unit that transforms the representative prosodic pattern selected by the pattern selection unit according to the transformation rule and interpolates a portion that does not include a prosody changing point and that is located between the thus selected and transformed representative patterns, each corresponding to a portion including a prosody changing point.
Dependent claims: 2-4, 13-56
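The transform-then-interpolate behaviour of the prosody generation unit in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions: a pattern is taken to be a list of F0 values (one per mora), the transformation is assumed to be a scale of the pattern's range followed by a shift, and the interpolation is assumed to be linear. The claim itself does not fix any of these choices, and the names `transform` and `generate_contour` are hypothetical.

```python
def transform(pattern, shift=0.0, scale=1.0):
    """Scale a pattern about its first value, then shift it.

    Assumption: the transformation rule yields a shift and a scale factor;
    the claim does not say what form the transformation takes.
    """
    base = pattern[0]
    return [base + (v - base) * scale + shift for v in pattern]


def generate_contour(length, placed):
    """Build a contour of `length` values.

    `placed` is a list of (start_index, transformed_pattern) pairs, sorted
    and non-overlapping, each covering a portion that includes a prosody
    changing point. Gaps between placed patterns are filled by linear
    interpolation (an assumed interpolation method). Positions before the
    first or after the last placed pattern are left as None.
    """
    contour = [None] * length
    for start, pattern in placed:
        for k, value in enumerate(pattern):
            contour[start + k] = value
    known = [i for i, v in enumerate(contour) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            contour[i] = contour[a] + t * (contour[b] - contour[a])
    return contour


if __name__ == "__main__":
    opening = transform([100.0, 120.0], shift=10.0)  # sentence-initial portion
    closing = transform([110.0, 90.0], scale=0.5)    # sentence-final portion
    print(generate_contour(8, [(0, opening), (6, closing)]))
```

Only the portions at indices 0-1 and 6-7 come from stored patterns in this example; indices 2-5 are produced purely by interpolation between them.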
5. A prosody generation apparatus that receives phonological information and linguistic information so as to generate prosody, the prosody generation apparatus being operable to refer to
(a) a variation estimation rule storage unit that stores a variation estimation rule of prosody at prosody changing points, the variation estimation rule being predetermined beforehand according to attributes concerning phonology or attributes concerning linguistic information of the prosody changing points of speech data; and
(b) an absolute value estimation rule storage unit that stores an absolute value estimation rule of the prosody at the prosody changing points, the absolute value estimation rule being predetermined beforehand according to attributes concerning the phonology or the linguistic information of the prosody changing points of the speech data;
comprising:
a prosody changing point setting unit that sets a prosody changing point according to at least any one of the received phonological information and the linguistic information;
a variation estimation unit that estimates a variation of prosody at the prosody changing point according to the estimation rule stored in the variation estimation rule storage unit, based on the received phonological information and the linguistic information;
an absolute value estimation unit that estimates an absolute value of the prosody at the prosody changing point according to the absolute value estimation rule stored in the absolute value estimation rule storage unit, based on the received phonological information and the linguistic information; and
a prosody generation unit that generates prosody for a prosody changing point by shifting the variation estimated by the variation estimation unit so as to correspond to the absolute value obtained by the absolute value estimation unit and generates prosody for a portion other than prosody changing points by carrying out interpolation between the thus generated prosody for prosody changing points.
Dependent claims: 6-12
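The generation step of claim 5 combines two separate estimates: a variation (the shape of the prosody around a changing point) and an absolute value (its level). A minimal sketch, assuming the variation is a list of relative F0 values, that the absolute value applies to one chosen anchor position within it, and that interpolation is linear. The function names and the `anchor` parameter are hypothetical and not taken from the source.

```python
def shift_to_absolute(variation, absolute, anchor=0):
    """Shift a relative variation so that variation[anchor] equals `absolute`.

    Assumption: the estimated absolute value refers to a single anchor
    position within the variation; the claim does not specify which one.
    """
    offset = absolute - variation[anchor]
    return [v + offset for v in variation]


def interpolate_between(left_end, right_start, n_gap):
    """Return `n_gap` linearly interpolated values strictly between two levels."""
    step = (right_start - left_end) / (n_gap + 1)
    return [left_end + step * (k + 1) for k in range(n_gap)]


if __name__ == "__main__":
    # Two changing points with estimated variations and absolute levels.
    first = shift_to_absolute([0.0, 20.0], absolute=200.0)
    second = shift_to_absolute([0.0, -30.0], absolute=180.0)
    # Three positions between them contain no changing point.
    gap = interpolate_between(first[-1], second[0], n_gap=3)
    print(first + gap + second)
```

Estimating shape and level separately means an error in one estimate does not distort the other: the shape is preserved exactly and only its vertical placement depends on the absolute value estimate.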
57. A prosody generation method by which phonological information and linguistic information are inputted so as to generate prosody, comprising the steps of:
setting a prosody changing point according to at least any one of the inputted phonological information and linguistic information;
selecting a prosodic pattern from representative prosodic patterns for portions including prosody changing points of speech data according to a selection rule predetermined beforehand based on attributes concerning phonology or attributes concerning linguistic information of the portions including the prosody changing points; and
transforming the selected prosodic pattern according to a transformation rule predetermined beforehand based on attributes concerning the phonology or attributes concerning the linguistic information of the portions including the prosody changing points, and interpolating a portion that does not include a prosody changing point and that is located between the thus selected and transformed representative patterns, each corresponding to a portion including a prosody changing point.
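The selecting step of claim 57 can be pictured as a lookup from attributes of a portion to a stored representative pattern. The sketch below is only an illustration: the abstract says the rules are generated by a statistical or learning technique, whereas this example substitutes a tiny hand-written rule. The attribute keys, pattern identifiers, and pattern values are all hypothetical.

```python
# Hypothetical stored representative patterns (relative pitch offsets per mora).
PATTERNS = {
    "rise": [0.0, 0.3],
    "fall": [0.3, 0.0],
    "flat": [0.1, 0.1],
}


def select_pattern_id(attrs):
    """Map attributes of a portion to a pattern identifier.

    Stand-in for a learned selection rule; a real rule would be derived
    from speech data rather than written by hand.
    """
    if attrs.get("position") == "sentence_end":
        return "fall"
    if attrs.get("accented"):
        return "rise"
    return "flat"


def select_pattern(attrs):
    """Return a copy of the representative pattern chosen for `attrs`."""
    return list(PATTERNS[select_pattern_id(attrs)])


if __name__ == "__main__":
    print(select_pattern({"position": "sentence_end", "accented": True}))
    print(select_pattern({"accented": True}))
```

Returning a copy keeps the stored patterns unchanged when the transforming step later modifies the selected pattern.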
58. A prosody generation method by which phonological information and linguistic information are inputted so as to generate prosody, comprising the steps of:
setting a prosody changing point according to at least any one of the inputted phonological information and linguistic information;
estimating a variation of prosody at the prosody changing point according to a variation estimation rule predetermined beforehand according to attributes concerning phonology or attributes concerning linguistic information of the prosody changing point of speech data, based on the inputted phonological information and linguistic information;
estimating an absolute value of the prosody at the prosody changing point according to an absolute value estimation rule predetermined beforehand according to attributes concerning the phonology or the linguistic information of the prosody changing point of the speech data, based on the inputted phonological information and the linguistic information; and
generating prosody for a prosody changing point by shifting the estimated variation so as to correspond to the estimated absolute value and generating prosody for a portion other than prosody changing points by carrying out interpolation between the thus generated prosody for prosody changing points.
59. A program that has a computer conduct a procedure of receiving phonological information and linguistic information so as to generate prosody, the computer being operable to refer to
(a) a representative prosodic pattern storage unit for accumulating beforehand representative prosodic patterns of portions of speech data, the portions including prosody changing points;
(b) a selection rule storage unit that stores a selection rule predetermined according to attributes concerning phonology or attributes concerning linguistic information of the portions of the speech data including the prosody changing points; and
(c) a transformation rule storage unit that stores a transformation rule predetermined according to attributes concerning the phonology or the linguistic information of the portions of the speech data including the prosody changing points;
the program having the computer conduct the steps of:
setting a prosody changing point according to at least any one of the received phonological information and the linguistic information;
selecting a representative prosodic pattern from the representative prosodic pattern storage unit according to the selection rule, based on the received phonological information and the linguistic information; and
transforming the representative prosodic pattern selected by the pattern selection unit according to the transformation rule and interpolating a portion that does not include a prosody changing point and that is located between the thus selected and transformed representative patterns, each corresponding to a portion including a prosody changing point.
60. A program that has a computer conduct a procedure of receiving phonological information and linguistic information so as to generate prosody, the computer being operable to refer to
(a) a variation estimation rule storage unit that stores a variation estimation rule of prosody at prosody changing points, the variation estimation rule being predetermined beforehand according to attributes concerning phonology or attributes concerning linguistic information of the prosody changing points of speech data; and
(b) an absolute value estimation rule storage unit that stores an absolute value estimation rule of the prosody at the prosody changing points, the absolute value estimation rule being predetermined beforehand according to attributes concerning the phonology or the linguistic information of the prosody changing points of the speech data;
the program having the computer conduct the steps of:
setting a prosody changing point according to at least any one of the received phonological information and the linguistic information;
estimating a variation of the prosody at the prosody changing point according to the estimation rule stored in the variation estimation rule storage unit, based on the received phonological information and the linguistic information;
estimating an absolute value of prosody at the prosody changing point according to the absolute value estimation rule stored in the absolute value estimation rule storage unit, based on the received phonological information and the linguistic information; and
generating prosody for a prosody changing point by shifting the variation estimated by the variation estimation unit so as to correspond to the absolute value obtained by the absolute value estimation unit and generating prosody for a portion other than prosody changing points by carrying out interpolation between the thus generated prosody for prosody changing points.
Specification