Phonetic unit duration adjustment for text-to-speech system
Abstract
Input text is converted to a sequence of representations of syllables or other phonetic units and stored portions of data are retrieved to generate waveforms corresponding to the syllables. In order to determine durations for the syllables, a constant duration is defined corresponding to a regular beat period and adjusted in accordance with the nature of the syllable and/or its context within the sequence.
26 Claims
1. A speech synthesis method comprising:
supplying a sequence of representations of phonetic units;
retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
determining durations for the phonetic units; and
processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining step is operable to define a constant duration for said phonetic unit, said constant duration corresponding to a regular beat period and selectively in dependence on the intrinsic duration of the phonetic unit and/or its context within the sequence, to carry out a constant duration regulation calculation.
Dependent claims: 2, 3, 4, 5
identifying major phrases in said sequence;
wherein the determining step further adjusts said durations for the phonetic units in dependence upon the number of phonetic units falling within a major phrase.
3. A speech synthesis method as in claim 1 in which the phonetic units are syllables.
4. A speech synthesis method as in claim 1 including:
storing items of data representing waveforms corresponding to phonetic sub-units, the retrieving step retrieving for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and further storing for each sub-unit statistical duration data including a maximum value and a minimum value;
wherein the determining step computes for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and adjusts the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values.
5. A speech synthesis method as in claim 4 in which the sub-units are phonemes.
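The regulation described in claims 4 and 5 can be illustrated with a minimal Python sketch. It assumes durations in milliseconds and a hypothetical `PhonemeStats` record for the stored statistics; neither the names nor the units come from the claims.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PhonemeStats:
    """Stored duration statistics for one sub-unit (hypothetical layout, ms)."""
    minimum: float
    maximum: float


def regulate_duration(beat_period_ms: float, phonemes: Sequence[PhonemeStats]) -> float:
    """Clamp the constant (beat-period) duration to the syllable's limits."""
    lower = sum(p.minimum for p in phonemes)  # sum of the minimum values
    upper = sum(p.maximum for p in phonemes)  # sum of the maximum values
    return min(max(beat_period_ms, lower), upper)


# Illustrative three-phoneme syllable: limits are 130 ms and 320 ms.
syllable = [PhonemeStats(40.0, 90.0), PhonemeStats(60.0, 150.0), PhonemeStats(30.0, 80.0)]
print(regulate_duration(200.0, syllable))  # 200.0 (beat period fits)
print(regulate_duration(100.0, syllable))  # 130.0 (raised to the minimum sum)
print(regulate_duration(400.0, syllable))  # 320.0 (lowered to the maximum sum)
```

The beat period is kept unchanged whenever it lies inside the limits, so the regular rhythm is disturbed only for syllables that cannot physically fit it.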
6. A speech synthesis method comprising:
supplying a sequence of representations of phonetic units;
retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
determining durations for the phonetic units;
processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining step is operable to define a constant duration corresponding to a regular beat period and to adjust that duration in dependence on the intrinsic duration of the phonetic unit and/or its context within the sequence, storing items of data representing waveforms corresponding to phonetic sub-units, the retrieving step retrieving for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and further storing for each sub-unit statistical duration data including a maximum value and a minimum value;
wherein the determining step computes for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and adjusts the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values;
wherein said determining step adjusts the said constant duration value such that it does not fall below a modified minimum value which exceeds the sum of the minimum values to an extent determined by the context of the phonetic unit.
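The modified minimum of claim 6 might be sketched as follows. The claim does not say how the context determines the extent of the increase, so the `context_margin_ms` parameter is purely an assumption, as is the choice to cap the raised floor at the sum of the maximum values.

```python
from typing import Sequence


def regulate_with_context_floor(
    beat_period_ms: float,
    min_durations_ms: Sequence[float],
    max_durations_ms: Sequence[float],
    context_margin_ms: float,
) -> float:
    """Clamp the constant duration using a context-raised lower limit."""
    if context_margin_ms < 0:
        raise ValueError("the modified minimum must not be below the sum of the minima")
    lower = sum(min_durations_ms)
    upper = sum(max_durations_ms)
    # Assumption: the raised floor never exceeds the upper limit.
    modified_lower = min(lower + context_margin_ms, upper)
    return min(max(beat_period_ms, modified_lower), upper)


mins, maxs = [40.0, 60.0, 30.0], [90.0, 150.0, 80.0]
print(regulate_with_context_floor(150.0, mins, maxs, 0.0))   # 150.0
print(regulate_with_context_floor(150.0, mins, maxs, 50.0))  # 180.0
```

A larger margin could, for example, be supplied for a syllable in a context where shortening is less acceptable; which contexts qualify is not stated in the claim.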
7. A speech synthesis method comprising:
supplying a sequence of representations of phonetic units;
retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
determining durations for the phonetic units;
processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining step is operable to define a constant duration corresponding to a regular beat period and to adjust that duration in dependence on the intrinsic duration of the phonetic unit and/or its context within the sequence, storing items of data representing waveforms corresponding to phonetic sub-units, the retrieving step retrieving for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and further storing for each sub-unit statistical duration data including a maximum value and a minimum value;
wherein the determining step computes for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and adjusts the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values;
wherein the statistical duration data include for each sub-unit a central value, and each sub-unit of a phonetic unit is assigned a duration which is a fraction of the adjusted constant value for that phonetic unit in proportion to the ratio of the central value for that sub-unit to the sum of the central values for the constituent sub-units of that phonetic unit.
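The proportional assignment in the final clause of claim 7 amounts to giving sub-unit i the duration D * c_i / sum(c), where D is the adjusted constant value and c_i the central value. A minimal Python sketch follows; the function name and the millisecond values are illustrative assumptions.

```python
from typing import List, Sequence


def split_duration(unit_duration_ms: float, central_values_ms: Sequence[float]) -> List[float]:
    """Share a phonetic unit's duration among its sub-units by central value."""
    total = sum(central_values_ms)
    if total <= 0:
        raise ValueError("central values must sum to a positive number")
    return [unit_duration_ms * c / total for c in central_values_ms]


# A 250 ms syllable whose phonemes have central values of 60, 100 and 40 ms.
print(split_duration(250.0, [60.0, 100.0, 40.0]))  # [75.0, 125.0, 50.0]
```

The assigned sub-unit durations always add up to the duration of the whole unit, so the regulated beat is preserved while each phoneme keeps its relative share.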
8. A speech synthesis method comprising:
supplying a sequence of representations of phonetic units;
retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
determining durations for the phonetic units; and
processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining step is operable to:
a) determine bounds for said duration, said bounds depending on the intrinsic duration of the phonetic unit and/or its context within the sequence; and
b) assign a constant duration corresponding to a regular beat period to said phonetic unit provided said constant duration does not transgress said bounds.
Dependent claims: 9
10. A speech synthesis method comprising:
supplying a sequence of representations of phonetic units;
retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
determining durations for the phonetic units; and
processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining step is operable to:
a) determine bounds for said duration, said bounds depending on the intrinsic duration of the phonetic unit and/or its context within the sequence; and
b) assign a constant duration corresponding to a regular beat period to said phonetic unit provided said constant duration does not transgress said bounds, the retrieving step retrieving for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and the determining step computing for each phonetic unit the sum of minimum duration values and the sum of maximum duration values for the constituent sub-unit(s) thereof and correcting the said constant duration if the computed constant duration falls below the sum of the minimum values or exceeds the sum of the maximum values.
Dependent claims: 11, 12, 13
the statistical duration data include for each sub-unit a central value, and including assigning to each sub-unit of a phonetic unit a duration which is a fraction of the adjusted constant value for that phonetic unit in proportion to the ratio of the central value for that sub-unit to the sum of the central values for the constituent sub-units of that phonetic unit.
14. A speech synthesiser comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units; and
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to define a constant duration for said phonetic unit, said constant duration corresponding to a regular beat period and selectively in dependence on the intrinsic duration of the phonetic unit and/or its context within the sequence, to carry out a constant duration regulation calculation.
Dependent claims: 15, 16, 17, 18
means for identifying major phrases in said sequence;
wherein the determining means further adjusts said durations for the phonetic units in dependence upon the number of phonetic units falling within a major phrase.
16. A speech synthesiser as in claim 14 in which the phonetic units are syllables.
17. A speech synthesiser as in claim 14 including:
a store containing items of data representing waveforms corresponding to phonetic sub-units, the retrieving means being operable to retrieve, for each phonetic unit one or more portions of data each corresponding to a sub-unit thereof, and a further store containing for each sub-unit statistical duration data including a maximum value and a minimum value, wherein the determining means is operable to compute for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and to adjust the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values.
18. A speech synthesiser as in claim 17 in which the sub-units are phonemes.
19. A speech synthesiser comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units;
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to define a constant duration corresponding to a regular beat period and to adjust that duration in dependence on the nature of the phonetic unit and/or its context within the sequence;
a store containing items of data representing waveforms corresponding to phonetic sub-units, the retrieving means being operable to retrieve, for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and a further store containing for each sub-unit statistical duration data including a maximum value and a minimum value, wherein the determining means is operable to compute for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and to adjust the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values; and
wherein the determining means is operable to adjust the said constant duration value such that it does not fall below a modified minimum value which exceeds the sum of the minimum values to an extent determined by the context of the phonetic unit.
20. A speech synthesiser comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units;
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to define a constant duration corresponding to a regular beat period and to adjust that duration in dependence on the nature of the phonetic unit and/or its context within the sequence;
a store containing items of data representing waveforms corresponding to phonetic sub-units, the retrieving means being operable to retrieve, for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and a further store containing for each sub-unit statistical duration data including a maximum value and a minimum value, wherein the determining means is operable to compute for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and to adjust the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values; and
wherein the statistical duration data include for each sub-unit a central value, and means to assign to each sub-unit of a phonetic unit a duration which is a fraction of the adjusted constant value for that phonetic unit in proportion to the ratio of the central value for that sub-unit to the sum of the central values for the constituent sub-units of that phonetic unit.
21. A speech synthesizer comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units; and
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to:
a) determine bounds for said duration, said bounds depending on the intrinsic duration of the phonetic unit and/or its context within the sequence; and
b) assign a constant duration corresponding to a regular beat period to said phonetic unit provided said constant duration does not transgress said bounds.
Dependent claims: 22
23. A speech synthesizer comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units; and
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to:
a) determine bounds for said duration, said bounds depending on the intrinsic duration of the phonetic unit and/or its context within the sequence; and
b) assign a constant duration corresponding to a regular beat period to said phonetic unit provided said constant duration does not transgress said bounds, a store containing items of data representing waveforms corresponding to phonetic sub-units, the retrieving means being operable to retrieve, for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and a further store containing for each sub-unit statistical duration data including a maximum value and a minimum value, wherein the determining means is operable to compute for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and to correct the said constant duration if the computed constant duration falls below the sum of minimum values or exceeds the sum of the maximum values.
Dependent claims: 24, 25, 26
the statistical duration data include for each sub-unit a central value, and including means to assign to each sub-unit of a phonetic unit a duration which is a fraction of the adjusted constant value for that phonetic unit in proportion to the ratio of the central value for that sub-unit to the sum of the central values for the constituent sub-units of that phonetic unit.
Specification