Speech synthesis
Abstract
Coded text is converted to phonetic data that drive a synthesis filter. Accent data are also derived and used to form a pitch contour for a variable-pitch excitation source. When the beginning of a paragraph is recognised, the pitch contour is raised above the pitch used later in the paragraph, and this initial elevation falls after each of the subgroups into which phrases are divided. Accents within a phrase are assigned pitch values that are high for the first accent and less high for the last; the remaining accents alternate between a higher and a lower value, both below those of the first and last accents. Accents on repeated words may be suppressed.
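
As a rough illustration of the excitation-plus-filter arrangement the abstract describes, the sketch below generates a pulse train whose rate follows a pitch contour and passes it through a single two-pole resonator. The function names, the 8000 Hz sample rate and the use of one resonator as the "synthesis filter" are assumptions made for this example only; the source does not specify the filter structure.

```python
import math

def pulse_train(pitch_hz, sample_rate=8000):
    """Emit one unit pulse each time the accumulated pitch phase wraps.

    pitch_hz holds one pitch value (in Hz) per output sample, so the
    pulse rate follows the pitch contour. (Hypothetical helper.)
    """
    phase = 0.0
    out = []
    for f0 in pitch_hz:
        phase += f0 / sample_rate
        if phase >= 1.0:
            phase -= 1.0
            out.append(1.0)
        else:
            out.append(0.0)
    return out

def resonator(signal, centre_hz, bandwidth_hz, sample_rate=8000):
    """Two-pole resonator standing in for the synthesis filter (assumption)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

# A constant 1000 Hz contour lasting 80 samples, filtered at 500 Hz.
excitation = pulse_train([1000.0] * 80)
speech_like = resonator(excitation, centre_hz=500.0, bandwidth_hz=100.0)
```

In a fuller system the resonator parameters would change over time under control of the phonetic data; here they are fixed to keep the sketch short.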
13 Claims
1. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words and to identify phrase groups of words delimited by punctuation marks;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein each phrase group comprises one or more subgroups and the deriving means are arranged in operation in response to paragraph division within the text to produce a pitch contour which, for a given textual content, is, for each of a plurality of subgroups at the commencement of a paragraph, higher than for a subgroup at an intermediate part of a paragraph by a factor which falls from a value greater than unity at the commencement of the paragraph to a value of unity at said intermediate part, the factor falling stepwise at the boundary between each one of said plurality of subgroups and the subgroup which follows it.
Dependent claims: 2, 13.

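The stepwise fall of the paragraph-initial factor recited in claim 1 can be pictured with a short sketch. The function name, the starting factor of 1.3, the choice of three raised subgroups and the equal step size are all assumptions for illustration; the claim requires only that the factor start above unity, fall stepwise at subgroup boundaries, and reach unity.

```python
def paragraph_pitch_factors(num_subgroups, boosted=3, start_factor=1.3):
    """Return one pitch multiplier per subgroup of a paragraph.

    The first `boosted` subgroups get a factor above 1.0 that drops by an
    equal step at each subgroup boundary; later subgroups get exactly 1.0.
    All numeric defaults are hypothetical.
    """
    if boosted < 1 or start_factor <= 1.0:
        raise ValueError("need boosted >= 1 and start_factor > 1.0")
    step = (start_factor - 1.0) / boosted
    factors = []
    for i in range(num_subgroups):
        if i < boosted:
            factors.append(start_factor - i * step)
        else:
            factors.append(1.0)
    return factors

# Six subgroups: the first three are raised, the rest are at unity.
factors = paragraph_pitch_factors(6)
```

Each subgroup's pitch contour would then be multiplied by its factor, so the raising is constant within a subgroup and changes only at the boundaries.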
3. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words and to identify phrase groups of words delimited by punctuation marks;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch;
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein each phrase group comprises one or more subgroups and the deriving means are arranged in operation in response to paragraph division within the text to produce a pitch contour which, for a given textual content, is, for each of a plurality of subgroups at the commencement of a paragraph, higher than for a subgroup at an intermediate part of a paragraph by a factor which falls from a value greater than unity at the commencement of the paragraph to a value of unity at said intermediate part, the factor falling stepwise at the boundary between each one of said plurality of subgroups and the subgroup which follows it; and
(e) means assigning each word to a first class having a relatively high contextual significance or a second class having a relatively lower contextual significance and the boundaries between subgroups are defined as occurring after any word of the first class which is followed by a word of the second class.

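The boundary rule in element (e) of claim 3 can be sketched as follows. How words are assigned to the two classes is not stated in the claim; the small function-word list below is a hypothetical stand-in for the second (lower-significance) class, and every other word is treated as first class.

```python
# Hypothetical second-class (low contextual significance) words.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and",
                  "is", "was", "by", "for", "with", "at"}

def is_first_class(word):
    """Treat any word not in FUNCTION_WORDS as first class (assumption)."""
    return word.lower() not in FUNCTION_WORDS

def split_subgroups(words):
    """Split a phrase group after each first-class word that is
    immediately followed by a second-class word."""
    subgroups, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        nxt = words[i + 1] if i + 1 < len(words) else None
        if nxt is not None and is_first_class(word) and not is_first_class(nxt):
            subgroups.append(current)
            current = []
    if current:
        subgroups.append(current)
    return subgroups

# "sat" (first class) is followed by "on" (second class), so a boundary
# falls between them: [["the", "cat", "sat"], ["on", "the", "mat"]]
groups = split_subgroups("the cat sat on the mat".split())
```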
4. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words and to identify phrase groups of words delimited by punctuation marks;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein the deriving means are arranged in operation to assign pitch-representative values to the accents within each phrase group, the values comprising:
(i) a first value assigned to the first accent in the group;
(ii) a second value, lower than the first, assigned to the last accent in the group; and
(iii) further values, lower than the first and second values, assigned to the remaining accents in the group such that the majority of those further values form a sequence in which the difference between successive values is alternately positive and negative;
and to derive a pitch contour from those values; and wherein the further values consist of a third value and a fourth value lower than the third, the last of the remaining accents is assigned the fourth value, and of the other remaining accents the first and odd numbered ones are assigned the third value and the even numbered ones are assigned the fourth value.
Dependent claims: 5.

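The value assignment in claim 4 can be sketched as below, reading element (ii) as the value for the last accent in the group, as the Abstract describes ("high for the first accent, less high for the last"). The numeric levels 4, 3, 2 and 1 are arbitrary placeholders, and the handling of a group with a single accent is an assumption, since the claim does not address that case.

```python
def accent_values(n, first=4, second=3, third=2, fourth=1):
    """Return pitch-representative values for n accents in a phrase group.

    first > second > third > fourth; the actual levels are hypothetical.
    """
    if n == 0:
        return []
    if n == 1:
        return [first]  # assumption: a lone accent takes the first value
    remaining = n - 2   # accents between the first and the last
    middle = []
    for k in range(1, remaining + 1):   # k is 1-based, as in the claim
        if k == remaining:
            middle.append(fourth)       # last remaining accent
        elif k % 2 == 1:
            middle.append(third)        # first and odd-numbered
        else:
            middle.append(fourth)       # even-numbered
    return [first] + middle + [second]

# Six accents: [4, 2, 1, 2, 1, 3]
# Five accents: [4, 2, 1, 1, 3] (the last remaining accent takes the
# fourth value even though it is odd-numbered)
values = accent_values(6)
```

The five-accent case shows why the claim speaks of only "the majority" of the further values alternating: the rule for the last remaining accent can break the alternation at the end.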
6. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words and to identify phrase groups of words delimited by punctuation marks;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein the deriving means are arranged in operation to assign pitch-representative values to the accents within each phrase group, the values comprising:
(i) a first value assigned to the first accent in the group;
(ii) a second value, lower than the first, assigned to the last accent in the group; and
(iii) further values, lower than the first and second values, assigned to the remaining accents in the group such that the majority of those further values form a sequence in which the difference between successive values is alternately positive and negative;
and to derive a pitch contour from those values; and wherein each phrase group comprises one or more subgroups and the deriving means is arranged in operation in response to paragraph division within the text to produce a pitch contour which, for a given textual content, is, for each of a plurality of subgroups at the commencement of a paragraph, higher than for a subgroup at an intermediate part of a paragraph by a factor which falls from a value greater than unity at the commencement of the paragraph to a value of unity at said intermediate part, the factor falling stepwise at the boundary between each one of said plurality of subgroups and the subgroup which follows it.
Dependent claims: 7.

8. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words and to identify phrase groups of words delimited by punctuation marks;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein the deriving means are arranged in operation to assign pitch-representative values to the accents within each phrase group, the values comprising:
(i) a first value assigned to the first accent in the group;
(ii) a second value, lower than the first, assigned to the last accent in the group; and
(iii) further values, lower than the first and second values, assigned to the remaining accents in the group such that the majority of those further values form a sequence in which the difference between successive values is alternately positive and negative;
and to derive a pitch contour from those values; and wherein the deriving means is arranged in operation to derive the pitch contour from the values by (a) linear interpolation between the values and (b) filtering of the resulting contour.

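The two-stage contour derivation in claim 8 (linear interpolation, then filtering) might look like the sketch below. The claim does not say what kind of filter is applied; a three-point moving average is used here purely as an assumed example of a smoothing filter, and the frame-index time base and function names are likewise hypothetical.

```python
def interpolate_contour(times, values, frame_count):
    """Linearly interpolate accent targets onto frames 0..frame_count-1.

    times must be strictly increasing frame indices; frames outside the
    targets hold the nearest target value.
    """
    contour = []
    seg = 0
    for t in range(frame_count):
        if t <= times[0]:
            contour.append(float(values[0]))
        elif t >= times[-1]:
            contour.append(float(values[-1]))
        else:
            while times[seg + 1] < t:
                seg += 1
            t0, t1 = times[seg], times[seg + 1]
            v0, v1 = values[seg], values[seg + 1]
            contour.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return contour

def smooth(contour, width=3):
    """Moving-average filter (an assumed choice of smoothing filter)."""
    half = width // 2
    out = []
    for i in range(len(contour)):
        window = contour[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

# Three accent targets at frames 0, 2 and 4.
raw = interpolate_contour([0, 2, 4], [4, 1, 3], 5)   # [4.0, 2.5, 1.0, 2.0, 3.0]
pitch_contour = smooth(raw)
```

Interpolation alone leaves sharp corners at each accent; the filtering stage rounds them, which in this example lifts the dip at frame 2 above its raw value of 1.0.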
9. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein the deriving means are arranged in operation to suppress accents on words which, in accordance with a predetermined criterion, resemble words previously processed, wherein the predetermined criterion is one of identity of words.

10. A speech synthesiser comprising:
(a) means for deriving, from coded text input thereto, phonetic data indicative of the properties of a synthesis filter and accent data indicating the occurrence of accents on words;
(b) means for deriving from the accent data a pitch contour;
(c) an excitation generator responsive to the pitch contour to produce an excitation signal of varying pitch; and
(d) filter means responsive to the phonetic data to filter the excitation signal to produce synthetic speech;
wherein the deriving means are arranged in operation to suppress accents on words which, in accordance with a predetermined criterion, resemble words previously processed, wherein the predetermined criterion is that the stem of the word is the same as that of the earlier word.
Dependent claims: 11, 12.

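The accent suppression of claims 9 and 10 can be sketched with one routine and two resemblance criteria: word identity (claim 9) and a shared stem (claim 10). The suffix-stripping stemmer is a deliberately crude, hypothetical stand-in, and the assumption that "previously processed" means earlier in the same call is made only for this example; the claims do not define how far back the comparison reaches.

```python
def crude_stem(word):
    """Strip a few common English suffixes (hypothetical, very rough)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def suppress_repeated_accents(words, accented, key=str.lower):
    """Clear the accent flag of any word whose key matches an earlier word.

    key=str.lower gives the identity criterion; key=crude_stem gives the
    same-stem criterion.
    """
    seen = set()
    result = []
    for word, has_accent in zip(words, accented):
        k = key(word)
        result.append(has_accent and k not in seen)
        seen.add(k)
    return result

words = ["walk", "the", "dog", "walking", "dog"]
accented = [True, False, True, True, True]

# Identity criterion: only the second "dog" loses its accent.
by_identity = suppress_repeated_accents(words, accented)
# Stem criterion: "walking" also loses its accent, since it shares a stem with "walk".
by_stem = suppress_repeated_accents(words, accented, key=crude_stem)
```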
Specification