System and method for determining pitch contours
Abstract
A system and method are provided for automatically computing local pitch contours from textual input to produce pitch contours that closely mimic those found in natural speech. The methodology of the invention incorporates parameterized equations whose parameters can be estimated directly from natural speech recordings. That methodology incorporates a model based on the premise that pitch contours instantiating a particular pitch contour class can be described as distortions in the temporal and frequency domains of a single, underlying contour. After the nature of the pitch contour for different pitch contour classes has been established, a pitch contour can be predicted that closely models a natural speech contour for a synthetic speech utterance by adding the individual contours of the different intonational classes and adjusting the boundaries of these to match the boundaries of the adjacent intonation curves.
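The model summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the template values, the baseline of 100 Hz, the use of one shared template for every component, and all function names are assumptions made purely for illustration. Each component is the same underlying contour stretched in time and scaled in frequency, and the predicted contour is the sum of the components. The boundary adjustment between adjacent curves mentioned in the abstract is not modeled here.

```python
# Hypothetical underlying contour for one pitch contour class,
# sampled at equal steps on normalized time [0, 1].
TEMPLATE = [0.0, 0.5, 1.0, 0.5, 0.0]


def template_at(u):
    """Linearly interpolate the template at normalized time u."""
    if u <= 0.0 or u >= 1.0:
        return 0.0  # the template contributes nothing outside its span
    pos = u * (len(TEMPLATE) - 1)
    i = int(pos)
    frac = pos - i
    return TEMPLATE[i] + frac * (TEMPLATE[i + 1] - TEMPLATE[i])


def distorted(t, start, duration, gain):
    """Template warped in time to [start, start + duration] and scaled by gain."""
    return gain * template_at((t - start) / duration)


def predicted_contour(t, components, baseline_hz=100.0):
    """Sum the individual distorted contours on top of an assumed baseline.

    components is a list of (start, duration, gain) tuples, one per contour.
    """
    return baseline_hz + sum(distorted(t, s, d, g) for s, d, g in components)
```

For example, `predicted_contour(0.5, [(0.0, 1.0, 20.0)])` evaluates a single component at the peak of its template, so the result is the baseline plus the full gain.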
25 Claims
1. A method for determining an acoustical contour for a speech interval having a predetermined duration, said acoustical contour being functionally related to a speech waveform processed by a computerized speech processing application, said method comprising the steps of:
dividing said duration of said speech interval into a plurality of critical intervals;
determining a plurality of anchor times within said speech interval duration, said anchor times being functionally related to said critical intervals;
for each of said anchor times, finding a corresponding anchor value from a look-up table;
representing each of said anchor values as an ordinate in a Cartesian coordinate system having as an abscissa said corresponding anchor time;
fitting a curve to said Cartesian representations of said anchor values; and
multiplying said fitted curve by at least one predetermined numerical constant related to a linguistic factor to create a product curve, said product curve being representative of said acoustical contour;
wherein said acoustical contour is provided as an input to said speech processing application.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 25
14. A system for determining an acoustical contour for a speech interval having a predetermined duration, wherein said acoustical contour is functionally related to a speech waveform processed by a computerized speech processing application, said system comprising:
processing means for dividing said duration of said speech interval into a plurality of critical intervals;
processing means for determining a plurality of anchor times within said speech interval duration, said anchor times being functionally related to said critical intervals;
means for finding an anchor value corresponding to each of said anchor times, said anchor values being stored in a storage means, for representing each of said anchor values as an ordinate in a Cartesian coordinate system having as an abscissa said corresponding anchor time, and for fitting a curve to said Cartesian representations of said anchor values; and
means for multiplying said fitted curve by at least one predetermined numerical constant related to a linguistic factor to create a product curve, said product curve being representative of said acoustical contour;
wherein said acoustical contour is provided as an input to said speech processing application.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
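The steps recited in claim 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of the specification: the equal-width critical intervals, the placement of anchors, the table values, and the choice of piecewise-linear interpolation as the "fitted curve" are all hypothetical, since the claim does not fix any of them. All function and variable names are likewise invented for the example.

```python
from bisect import bisect_right

# Hypothetical look-up table: one anchor value per anchor position.
ANCHOR_TABLE = [0.0, 0.4, 1.0, 0.6, 0.1]


def anchor_times(duration, n_critical=2, anchors_per_interval=2):
    """Divide the duration into equal critical intervals (an assumption)
    and place evenly spaced anchor times inside each one."""
    width = duration / n_critical
    times = []
    for k in range(n_critical):
        start = k * width
        for j in range(anchors_per_interval):
            times.append(start + j * width / anchors_per_interval)
    times.append(duration)  # final anchor at the end of the interval
    return times


def fit_piecewise_linear(times, values):
    """Return a curve through the (anchor time, anchor value) points.
    Piecewise-linear interpolation stands in for the unspecified fit."""
    def curve(t):
        if t <= times[0]:
            return values[0]
        if t >= times[-1]:
            return values[-1]
        i = bisect_right(times, t) - 1
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return curve


def acoustical_contour(duration, linguistic_constants, n_samples=9):
    """Sample the product curve: the fitted curve multiplied by each
    predetermined constant related to a linguistic factor."""
    times = anchor_times(duration)
    values = [ANCHOR_TABLE[i] for i in range(len(times))]  # table look-up
    curve = fit_piecewise_linear(times, values)
    scale = 1.0
    for c in linguistic_constants:
        scale *= c
    step = duration / (n_samples - 1)
    return [(i * step, scale * curve(i * step)) for i in range(n_samples)]
```

Calling `acoustical_contour(0.4, [2.0], n_samples=5)` returns five `(time, value)` pairs that could be handed to a downstream speech processing application. With the default arguments the number of anchors matches the five entries of the hypothetical table.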
Specification