Speech synthesizing system and redundancy-reduced waveform database therefor
First Claim
1. A database for use in a system for synthesizing a speech by concatenating a subset of a plurality of predetermined voice segments, the database comprising:
a first table for associating each of said plurality of predetermined voice segments with pitch waveform IDs (identifiers) of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
a second table for associating each pitch waveform ID with pitch waveform data identified by said each pitch waveform ID, wherein said second table is obtained by dividing each of said plurality of predetermined voice segments into pitch waveforms;
classifying all of the pitch waveforms into groups of very similar pitch waveforms; and
selecting one of said very similar pitch waveforms in each of said groups for said second table, and wherein said very similar pitch waveforms in each respective one of said groups in said first table each have a same respective pitch waveform ID.
Abstract
A speech synthesizing system using a redundancy-reduced waveform database is disclosed. The waveform of each voice segment in a sample set necessary and sufficient for speech synthesis is divided into pitch waveforms, which are classified into groups of closely similar pitch waveforms. One pitch waveform of each group is selected as the representative of the group and is given a pitch waveform ID. The waveform database comprises at least (i) a pitch waveform pointer table, each record of which comprises the voice segment ID of a voice segment and the pitch waveform IDs of the pitch waveforms which, when combined in the listed order, constitute the waveform identified by that voice segment ID, and (ii) a pitch waveform table of pitch waveform IDs and the corresponding pitch waveforms. This reduces the size of the waveform database. For each pitch waveform that the database lacks, one of the pitch waveform IDs adjacent to the lacking pitch waveform ID in the pitch waveform pointer table is used, without deforming the pitch waveform.
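The fallback described in the last sentence of the abstract can be sketched as follows. The abstract does not define what "adjacent" means in the pointer table, so this sketch assumes it means the nearest neighbouring entry in the same row of IDs; the function name `fill_missing_ids` and the use of `None` for a lacking ID are likewise assumptions made only for illustration.

```python
from typing import List, Optional


def fill_missing_ids(record: List[Optional[int]]) -> List[int]:
    """Replace each missing entry (None) with the nearest known neighbouring ID.

    Assumption: "adjacent" is read as the closest position in the same row;
    at equal distance the left neighbour is preferred.
    """
    filled: List[int] = []
    for i, wid in enumerate(record):
        if wid is not None:
            filled.append(wid)
            continue
        # Search outward from position i in the ORIGINAL record.
        for dist in range(1, len(record)):
            left, right = i - dist, i + dist
            if left >= 0 and record[left] is not None:
                filled.append(record[left])
                break
            if right < len(record) and record[right] is not None:
                filled.append(record[right])
                break
        else:
            raise ValueError("record has no known pitch waveform ID")
    return filled
```

The borrowed ID points at an existing stored waveform, so the waveform data itself is reused unchanged, which is consistent with the abstract's "without deforming the pitch waveform".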
15 Claims
1. A database for use in a system for synthesizing a speech by concatenating a subset of a plurality of predetermined voice segments, the database comprising:
a first table for associating each of said plurality of predetermined voice segments with pitch waveform IDs (identifiers) of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
a second table for associating each pitch waveform ID with pitch waveform data identified by said each pitch waveform ID, wherein said second table is obtained by dividing each of said plurality of predetermined voice segments into pitch waveforms;
classifying all of the pitch waveforms into groups of very similar pitch waveforms; and
selecting one of said very similar pitch waveforms in each of said groups for said second table, and wherein said very similar pitch waveforms in each respective one of said groups in said first table each have a same respective pitch waveform ID.
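As a minimal illustration of the two tables recited in claim 1, the sketch below models the first table as a mapping from voice segment ID to an ordered list of pitch waveform IDs, and the second table as a mapping from pitch waveform ID to sample data. The names `pointer_table`, `waveform_table` and `segment_waveform`, the segment IDs and all sample values are invented for illustration. Plain end-to-end concatenation is also an assumption; the claim only says the pitch waveforms are "combined in the listed order".

```python
from typing import Dict, List

# First table: voice segment ID -> ordered pitch waveform IDs.
# Very similar pitch waveforms share one ID, so IDs repeat across segments.
pointer_table: Dict[str, List[int]] = {
    "ka": [0, 1, 1, 2],
    "ki": [0, 3, 2],
}

# Second table: pitch waveform ID -> data of the one representative
# pitch waveform kept for each group (toy two-sample waveforms).
waveform_table: Dict[int, List[float]] = {
    0: [0.0, 0.5],
    1: [0.4, -0.1],
    2: [-0.3, 0.0],
    3: [0.2, 0.2],
}


def segment_waveform(segment_id: str) -> List[float]:
    """Rebuild a voice segment by combining its pitch waveforms in listed order."""
    samples: List[float] = []
    for wid in pointer_table[segment_id]:
        samples.extend(waveform_table[wid])
    return samples
```

In this toy data the two segments reference seven pitch waveforms in total while only four are stored, which is the kind of redundancy reduction the title refers to.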
3. A database for use in a system for synthesizing a speech by concatenating some of a plurality of predetermined voice segments each defined by a phoneme-chained pattern and a pitch band, the database comprising:
first table means for associating each of said plurality of predetermined voice segments which is identified by one of predetermined pitch band IDs and one of predetermined phoneme-chained pattern IDs with pitch waveform IDs of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
second table means for permitting each of said pitch waveform IDs and said one of predetermined pitch band IDs to be used to find pitch waveform data associated with said each of said pitch waveform IDs, wherein said second table means is obtained by dividing each of said plurality of predetermined voice segments into pitch waveforms;
classifying all of the pitch waveforms by phoneme and pitch band into groups of very similar pitch waveforms; and
selecting one of said very similar pitch waveforms in each of said groups for said second table means, and wherein said very similar pitch waveforms in each respective one of said groups in said first table means each have a same respective pitch waveform ID.
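The keying recited in claim 3 differs from claim 1 in that a voice segment is identified by a phoneme-chained pattern ID together with a pitch band ID, and waveform data is found from a pitch waveform ID together with the pitch band ID. A rough sketch under those assumptions is shown below; the tuple keys, the pattern ID `"CV_ka"`, the band numbers and the sample values are all hypothetical.

```python
from typing import Dict, List, Tuple

# First table means: (phoneme-chained pattern ID, pitch band ID)
# -> ordered pitch waveform IDs.
pointer_table: Dict[Tuple[str, int], List[int]] = {
    ("CV_ka", 0): [0, 1],
    ("CV_ka", 1): [0, 0, 1],
}

# Second table means: (pitch band ID, pitch waveform ID) -> waveform data.
# The same waveform ID number may denote different data in different bands.
waveform_table: Dict[Tuple[int, int], List[float]] = {
    (0, 0): [0.1, 0.2],
    (0, 1): [0.3, 0.0],
    (1, 0): [0.5],
    (1, 1): [-0.5],
}


def lookup(pattern_id: str, band_id: int) -> List[List[float]]:
    """Return the pitch waveforms of one voice segment, in listed order."""
    ids = pointer_table[(pattern_id, band_id)]
    return [waveform_table[(band_id, wid)] for wid in ids]
```

Scoping waveform IDs by pitch band is one way to read "classifying ... by phoneme and pitch band"; the claim itself does not prescribe this particular key layout.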
7. A database for use in a system for synthesizing a speech by concatenating some of predetermined voice segments, the database including:
a first table for associating each of said predetermined voice segments with waveform IDs of pitch and voiceless sound waveforms which, when combined in the listed order of said waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
a second table for associating each voiceless sound waveform ID with voiceless sound waveform data identified by said each voiceless sound waveform ID, wherein voice segments containing closely similar voiceless sound waveforms have an identical waveform ID assigned to said closely similar voiceless sound waveforms in said first table, and wherein said second table is obtained by collecting said voiceless sound waveforms from said predetermined voice segments;
classifying all of said voiceless sound waveforms into groups of closely similar voiceless sound waveforms; and
selecting one of said closely similar voiceless sound waveforms in each of said groups for said second table.
8. A method of making a database for use in a system for synthesizing a speech by concatenating predetermined voice segments, the method comprising the steps of:
dividing each of said predetermined voice segments into pitch waveforms;
classifying all of the pitch waveforms into groups of very similar pitch waveforms;
selecting one of said very similar pitch waveforms in each of said groups;
assigning a pitch waveform ID to said selected pitch waveform of each of said groups;
creating a first table which, for each of said groups, has a record comprising said pitch waveform ID and data of said selected pitch waveform; and
creating a second table whose record IDs comprise the IDs of said predetermined voice segments, each record of said second table containing pitch waveform IDs which, when combined in the listed order of said pitch waveform IDs, constitute a waveform identified by said record ID.
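The steps of claim 8 can be sketched roughly as below. Several things here are assumptions rather than the patent's method: the input is taken to be already divided into pitch waveforms (the dividing step needs pitch marking, which is not shown), "very similar" is approximated by a Euclidean distance threshold, groups are formed greedily, and the first member of each group is kept as the selected waveform. The names `distance`, `build_database` and `threshold` are hypothetical. Note that claim 8 calls the waveform table the "first table" and the per-segment ID table the "second table", the reverse of the numbering in claim 1.

```python
from typing import Dict, List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; the shorter waveform is zero-padded (an assumption)."""
    n = max(len(a), len(b))
    pa = list(a) + [0.0] * (n - len(a))
    pb = list(b) + [0.0] * (n - len(b))
    return sum((x - y) ** 2 for x, y in zip(pa, pb)) ** 0.5


def build_database(
    segments: Dict[str, List[List[float]]], threshold: float
) -> Tuple[Dict[int, List[float]], Dict[str, List[int]]]:
    """Return (waveform table, per-segment table of pitch waveform IDs)."""
    waveform_table: Dict[int, List[float]] = {}  # "first table" in claim 8
    pointer_table: Dict[str, List[int]] = {}     # "second table" in claim 8
    for seg_id, pitch_waveforms in segments.items():
        ids: List[int] = []
        for pw in pitch_waveforms:
            # Classify: reuse the ID of the first close-enough representative.
            for wid, representative in waveform_table.items():
                if distance(pw, representative) <= threshold:
                    ids.append(wid)
                    break
            else:
                # No similar group yet: select this waveform, assign a new ID.
                wid = len(waveform_table)
                waveform_table[wid] = list(pw)
                ids.append(wid)
        pointer_table[seg_id] = ids
    return waveform_table, pointer_table
```

Greedy first-fit grouping depends on input order and is only one of many possible classification schemes; the claim does not specify how similarity is measured or how the representative is chosen.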
14. A system for synthesizing a speech by concatenating some of predetermined voice segments, comprising:
means for determining IDs of necessary ones of said predetermined voice segments necessary for said speech;
means for associating each of said determined IDs with pitch waveform IDs the pitch waveforms of which, when combined in the listed order of said pitch waveform IDs, constitute a waveform identified by said each of said determined IDs;
means for obtaining pitch waveforms associated with said pitch waveform IDs, including a pitch waveform table created by dividing each of said predetermined voice segments into pitch waveforms;
classifying all of the pitch waveforms into groups of very similar pitch waveforms; and
selecting one of said very similar pitch waveforms in each of said groups;
means for combining said obtained pitch waveforms to form said necessary voice segments; and
means for combining said necessary voice segments to yield said speech.
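The chain of means in claim 14 can be read as a small pipeline: determine segment IDs, map each to pitch waveform IDs, fetch the waveforms, combine them into segments, and combine the segments into speech. The sketch below follows that reading under loud assumptions: the front end `determine_segment_ids` is a hypothetical greedy longest-match over a text string (the claim does not say how IDs are determined), and both combining steps are plain concatenation.

```python
from typing import Dict, List


def determine_segment_ids(text: str, known: Dict[str, List[int]]) -> List[str]:
    """Hypothetical front end: greedy longest match of text against segment IDs."""
    ids: List[str] = []
    max_len = max(len(k) for k in known)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if candidate in known:
                ids.append(candidate)
                i += n
                break
        else:
            raise KeyError(f"no voice segment covers {text[i:]!r}")
    return ids


def synthesize(
    text: str,
    pointer_table: Dict[str, List[int]],
    waveform_table: Dict[int, List[float]],
) -> List[float]:
    """Concatenate pitch waveforms into segments, then segments into speech."""
    speech: List[float] = []
    for seg_id in determine_segment_ids(text, pointer_table):
        segment: List[float] = []
        for wid in pointer_table[seg_id]:       # associate ID -> waveform IDs
            segment.extend(waveform_table[wid])  # obtain and combine waveforms
        speech.extend(segment)                   # combine segments
    return speech
```

For example, with `pointer_table = {"ka": [0, 1], "ki": [0, 2]}` and one-sample waveforms for IDs 0 to 2, `synthesize("kaki", ...)` looks up waveform 0 twice but stores it once.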
15. A system for synthesizing a speech by concatenating some of predetermined voice segments each defined by a phoneme-chained pattern and a pitch band, comprising:
means for determining an ID and a pitch band of each of necessary ones of said predetermined voice segments necessary for said speech;
means for associating a combination of said determined ID and said determined pitch band with pitch waveform IDs the pitch waveforms of which, when combined in the listed order of said pitch waveform IDs, constitute a waveform identified by said determined ID and said determined pitch band;
means for obtaining pitch waveforms associated with said pitch waveform IDs and said determined pitch band, including a set of pitch waveforms obtained by dividing each of said predetermined voice segments into pitch waveforms;
classifying all of said divided pitch waveforms by phoneme and pitch band into groups of very similar pitch waveforms; and
selecting one of said very similar pitch waveforms in each of said groups for said set;
means for combining said obtained pitch waveforms to form said necessary voice segments; and
means for combining said necessary voice segments to yield said speech.
Specification