Systems and methods for pitch smoothing for text-to-speech synthesis
First Claim
1. A method for speech synthesis, comprising:
generating a sequence of phonetic units representative of a target utterance;
determining a pitch contour for the target utterance, the pitch contour comprising a plurality of linear pitch contour segments, wherein each linear pitch contour segment has start and end times at anchor points of the pitch contour;
filtering the pitch contour to determine pitch values of a smooth pitch contour at the anchor points; and
determining the smooth pitch contour between adjacent anchor points by linearly interpolating between the pitch values of the smooth pitch contour at the anchor points.
Abstract
TTS synthesis systems are provided which implement computationally fast and efficient pitch contour smoothing methods for determining smooth pitch contours for non-smooth pitch contours, which closely track the non-smooth pitch contours. For example, a TTS method includes generating a sequence of phonetic units representative of a target utterance, determining a pitch contour for the target utterance, the pitch contour comprising a plurality of linear pitch contour segments, wherein each linear pitch contour segment has start and end times at anchor points of the pitch contour, filtering the pitch contour to determine pitch values of a smooth pitch contour at the anchor points, and determining the smooth pitch contour between adjacent anchor points by linearly interpolating between the pitch values of the smooth pitch contour at the anchor points.
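The steps named in the abstract can be illustrated with a minimal sketch. This excerpt does not say which filter is used, so the sketch assumes a simple moving-average (rectangular window) filter, evaluated exactly at each anchor point by integrating the piecewise-linear contour. All function names, the `window` parameter, and the choice to hold the contour constant beyond its first and last anchors are hypothetical and are not taken from the patent.

```python
from bisect import bisect_right


def contour_value(times, pitches, t):
    """Evaluate a piecewise-linear contour at time t.

    Assumption: the contour is held constant outside the first/last anchor.
    """
    if t <= times[0]:
        return pitches[0]
    if t >= times[-1]:
        return pitches[-1]
    i = bisect_right(times, t) - 1  # index of the segment containing t
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return pitches[i] + frac * (pitches[i + 1] - pitches[i])


def window_mean(times, pitches, lo, hi):
    """Exact mean of the piecewise-linear contour over the interval [lo, hi]."""
    # Break the window at every anchor strictly inside it, so that each
    # sub-interval is covered by a single linear piece.
    knots = [lo] + [t for t in times if lo < t < hi] + [hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        # The trapezoid rule is exact on a linear piece.
        fa = contour_value(times, pitches, a)
        fb = contour_value(times, pitches, b)
        area += 0.5 * (fa + fb) * (b - a)
    return area / (hi - lo)


def smooth_at_anchors(times, pitches, window):
    """Filter the contour, returning smoothed pitch values at the anchors only."""
    if window <= 0:
        raise ValueError("window must be positive")
    half = window / 2.0
    return [window_mean(times, pitches, t - half, t + half) for t in times]


def smooth_contour(times, smoothed, t):
    """Smooth contour between anchors: linear interpolation of smoothed values."""
    return contour_value(times, smoothed, t)


# Hypothetical zigzag contour: anchor times in seconds, pitch in Hz.
times = [0.0, 0.1, 0.2, 0.3, 0.4]
pitches = [100.0, 140.0, 100.0, 140.0, 100.0]
smoothed = smooth_at_anchors(times, pitches, window=0.1)
# The 140 Hz peak at t = 0.1 s is pulled down to about 130 Hz, and the
# 100 Hz valley at t = 0.2 s is raised to about 110 Hz.
midpoint = smooth_contour(times, smoothed, 0.15)  # about 120 Hz
```

Under these assumptions the filter is evaluated only at the anchor points, and every other point of the smooth contour comes from linear interpolation, so the cost grows with the number of anchors rather than with the number of output samples.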
23 Claims
1. A method for speech synthesis, comprising:
generating a sequence of phonetic units representative of a target utterance;
determining a pitch contour for the target utterance, the pitch contour comprising a plurality of linear pitch contour segments, wherein each linear pitch contour segment has start and end times at anchor points of the pitch contour;
filtering the pitch contour to determine pitch values of a smooth pitch contour at the anchor points; and
determining the smooth pitch contour between adjacent anchor points by linearly interpolating between the pitch values of the smooth pitch contour at the anchor points.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform speech synthesis, the method steps comprising:
generating a sequence of phonetic units representative of a target utterance;
determining a pitch contour for the target utterance, the pitch contour comprising a plurality of linear pitch contour segments, wherein each linear pitch contour segment has start and end times at anchor points of the pitch contour;
filtering the pitch contour to determine pitch values of a smooth pitch contour at the anchor points; and
determining the smooth pitch contour between adjacent anchor points by linearly interpolating between the pitch values of the smooth pitch contour at the anchor points.
16. A text-to-speech synthesis system, comprising:
a text processing system for processing textual data and phonetically transcribing the textual data into a sequence of phonetic units representative of a target utterance to be synthesized;
a prosody processing system for determining a pitch contour for the target utterance comprising a plurality of linear pitch contour segments having start and end times at anchor points of the pitch contour, and for determining a smooth pitch contour by filtering the pitch contour to determine pitch values of the smooth pitch contour at the anchor points, and linearly interpolating between the pitch values of the smooth pitch contour at the anchor points; and
a signal synthesizing system for generating an acoustic waveform representation of the target utterance using the smooth pitch contour for the target utterance.
Dependent claims: 17, 18, 19, 20, 21, 22, 23
Specification