Low data rate speech encoding employing syllable pitch patterns
First Claim
1. A speech encoding apparatus comprising:
input means for receiving speech including one or more words of human language;
analysis means connected to said input means for analyzing said received speech, generating a sequence of phonological linguistic unit indicia corresponding to said received speech, grouping said phonological linguistic unit indicia into syllables, and generating pitch track data corresponding to said received speech;
pitch pattern memory means storing a plurality of predetermined pitch patterns;
pitch pattern recognizer means connected to said analysis means and to said pitch pattern memory means for selecting a pitch pattern from said plurality of predetermined pitch patterns for each syllable grouping of phonological linguistic unit indicia as generated by said analysis means, said pitch pattern being selected in dependence upon said pitch track data corresponding to each syllable grouping of phonological linguistic unit indicia; and
transmission means connected to said analysis means and said pitch pattern recognizer means for transmitting said phonological linguistic unit indicia and pitch pattern indicia corresponding to said selected pitch patterns.
Abstract
The present invention is a speech encoding technique useful in low data rate speech. Spoken input is analyzed to determine its basic phonological linguistic units and syllables. The pitch track for each syllable is compared with each of a predetermined set of pitch patterns. A pitch pattern forming the best match to the actual pitch track is selected for each syllable. Phonological linguistic unit indicia and pitch pattern indicia are transmitted to a speech synthesis apparatus. This synthesis apparatus matches the pitch pattern indicia to syllable groupings of the phonological linguistic unit indicia. During speech synthesis, sounds are produced corresponding to the phonological linguistic unit indicia with their primary pitch controlled by the pitch pattern indicia of the corresponding syllable. This achieves some measure of approximation to the primary pitch of the original spoken input at a low data rate. In the preferred embodiment, each pitch pattern includes an initial pitch slope, which may be zero indicating no change in pitch, a final pitch slope and a turning point between these two slopes.
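The per-syllable matching step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `PitchPattern` class, the four-entry `PATTERNS` table, the frame-based pitch tracks, and the squared-error distance with the mean removed (so only the shape of the contour is compared). The patent only states that the best-matching predetermined pattern is selected and that each pattern has an initial slope, a final slope, and a turning point.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PitchPattern:
    """Hypothetical pattern: two straight segments joined at a turning point."""
    initial_slope: float   # pitch change per frame before the turn (0 means flat)
    final_slope: float     # pitch change per frame after the turn
    turning_point: float   # position of the turn as a fraction of the syllable, 0..1

    def contour(self, n_frames: int) -> List[float]:
        """Render the pattern as n_frames relative pitch values starting at 0."""
        turn = self.turning_point * (n_frames - 1)
        values = []
        for i in range(n_frames):
            if i <= turn:
                values.append(self.initial_slope * i)
            else:
                values.append(self.initial_slope * turn
                              + self.final_slope * (i - turn))
        return values


def _centered(values: Sequence[float]) -> List[float]:
    """Subtract the mean so that only the contour shape is compared."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def select_pattern(pitch_track: Sequence[float],
                   patterns: Sequence[PitchPattern]) -> int:
    """Return the index of the pattern closest to a non-empty syllable pitch track.

    Assumed distance: sum of squared differences between mean-removed contours.
    """
    target = _centered(pitch_track)

    def error(pattern: PitchPattern) -> float:
        candidate = _centered(pattern.contour(len(pitch_track)))
        return sum((t - c) ** 2 for t, c in zip(target, candidate))

    return min(range(len(patterns)), key=lambda i: error(patterns[i]))


# Illustrative pattern table (assumed values, not the patent's pattern set).
PATTERNS = [
    PitchPattern(0.0, 0.0, 0.5),    # flat
    PitchPattern(0.0, 2.0, 0.5),    # flat, then rising
    PitchPattern(0.0, -2.0, 0.5),   # flat, then falling
    PitchPattern(2.0, -2.0, 0.5),   # rising, then falling
]

# One pitch track (in Hz, one value per frame) for each syllable.
syllable_tracks = [
    [120, 120, 120, 120, 120],
    [100, 102, 104, 102, 100],
    [110, 110, 110, 108, 106],
]

# Only these small indices would be sent alongside the phonological unit indicia.
pattern_indicia = [select_pattern(t, PATTERNS) for t in syllable_tracks]
print(pattern_indicia)  # [0, 3, 2]
```

In this sketch the receiver would call `PATTERNS[index].contour(n_frames)` to regenerate an approximate pitch contour for each syllable, which is how a single index per syllable stands in for the full pitch track.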
180 Citations
10 Claims
Specification