Speech synthesizing system and redundancy-reduced waveform database therefor

US 6,125,346 A
Filed: 12/05/1997
Issued: 09/26/2000
Est. Priority Date: 12/10/1996
Status: Expired due to Term

2 Assignments

0 Petitions

Abstract

A speech synthesizing system using a redundancy-reduced waveform database is disclosed. Each waveform of a sample set of voice segments necessary and sufficient for speech synthesis is divided into pitch waveforms, which are classified into groups of pitch waveforms closely similar to one another. One of the pitch waveforms of each group is selected as a representative of the group and is given a pitch waveform ID. The waveform database comprises at least (a) a pitch waveform pointer table, each record of which comprises a voice segment ID and the pitch waveform IDs whose pitch waveforms, when combined in the listed order, constitute the waveform identified by that voice segment ID, and (b) a pitch waveform table of pitch waveform IDs and corresponding pitch waveforms. This enables the waveform database size to be reduced. For each pitch waveform that the database lacks, one of the pitch waveform IDs adjacent to the missing pitch waveform ID in the pitch waveform pointer table is used without deforming the pitch waveform.
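
The fallback in the last sentence can be read as a nearest-neighbour fill over one pointer-table record. The following is a minimal Python sketch of that reading, not the patented procedure: the function name `fill_missing_ids`, the use of `None` to mark a waveform the database lacks, and the rule that ties go to the preceding entry are all assumptions made for illustration. The abstract only states that an adjacent ID is reused without deforming the waveform.

```python
from typing import List, Optional


def fill_missing_ids(record: List[Optional[int]]) -> List[int]:
    """Replace each missing pitch waveform ID with an adjacent one.

    `None` marks a pitch waveform the database lacks (hypothetical
    convention). The nearest present ID in the record is reused as-is;
    when two neighbours are equally near, the preceding one wins.
    """
    present = [i for i, wid in enumerate(record) if wid is not None]
    if not present:
        raise ValueError("record contains no usable pitch waveform ID")
    filled: List[int] = []
    for i, wid in enumerate(record):
        if wid is None:
            # Pick the closest position that holds a real ID.
            nearest = min(present, key=lambda j: (abs(j - i), j))
            wid = record[nearest]
        filled.append(wid)
    return filled
```

For example, `fill_missing_ids([7, None, None, 9])` reuses ID 7 for the second slot and ID 9 for the third, so no new waveform has to be stored.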

192 Citations

15 Claims

1. A database for use in a system for synthesizing a speech by concatenating a subset of a plurality of predetermined voice segments, the database comprising:
- a first table for associating each of said plurality of predetermined voice segments with pitch waveform IDs (identifiers) of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
  
  a second table for associating each pitch waveform ID with pitch waveform data identified by said each pitch waveform ID, wherein said second table is obtained by dividing each of said plurality of predetermined voice segments into pitch waveforms;
  
  classifying all of the pitch waveforms into groups of very similar pitch waveforms; and
  
  selecting one of said very similar pitch waveforms in each of said groups for said second table and wherein said very similar pitch waveforms in each respective one of said groups in said first table each have a same respective pitch waveform ID.

2. A database as defined in claim 1, wherein all of the pitch waveform data in the database have a same phase characteristic.
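
The two tables of claim 1 map naturally onto two dictionaries. The sketch below is only an illustration under stated assumptions: the segment IDs, waveform IDs, and sample values are invented, the names `pointer_table`, `waveform_table`, and `segment_waveform` are hypothetical, and plain list concatenation stands in for whatever combining step a real synthesizer would use.

```python
from typing import Dict, List

# First table: voice segment ID -> pitch waveform IDs in listed order.
# Segment names and IDs are made up for illustration.
pointer_table: Dict[str, List[int]] = {
    "ka": [0, 1, 1, 2],
    "ki": [0, 3, 3, 2],
}

# Second table: pitch waveform ID -> the one representative waveform
# kept for its group of very similar pitch waveforms.
waveform_table: Dict[int, List[float]] = {
    0: [0.0, 0.2],
    1: [0.5, -0.5],
    2: [0.1, 0.0],
    3: [0.4, -0.3],
}


def segment_waveform(segment_id: str) -> List[float]:
    """Combine the pitch waveforms of one segment in listed order."""
    samples: List[float] = []
    for wid in pointer_table[segment_id]:
        samples.extend(waveform_table[wid])
    return samples
```

In this toy data the two segments reference eight pitch waveforms in total but only four are stored, which is the redundancy reduction the claim is aimed at.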

3. A database for use in a system for synthesizing a speech by concatenating some of a plurality of predetermined voice segments each defined by a phoneme-chained pattern and a pitch band, the database comprising:
- first table means for associating each of said plurality of predetermined voice segments which is identified by one of predetermined pitch band IDs and one of predetermined phoneme-chained pattern IDs with pitch waveform IDs of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
  
  second table means for permitting each of said pitch waveform IDs and said one of predetermined pitch band IDs to be used to find pitch waveform data associated with said each of said pitch waveform IDs, wherein said second table means is obtained by dividing each of said plurality of predetermined voice segments into pitch waveforms;
  
  classifying all of the pitch waveforms by phoneme and pitch band into groups of very similar pitch waveforms; and
  
  selecting one of said very similar pitch waveforms in each of said groups for said second table means and wherein said very similar pitch waveforms in each respective one of said groups in said first table means each have a same respective pitch waveform ID.

4. A database as defined in claim 3, wherein said first table means comprises tables by phoneme-chained patterns, each record of each of said table comprising one of said predetermined pitch band IDs and pitch waveform IDs of pitch waveforms which, when combined in the listed order of said pitch waveform IDs, constitute a waveform characterized by a phoneme-chained pattern associated with said each of said table and by said one of said predetermined pitch band IDs.

5. A database as defined in claim 3, wherein:
- said second table means comprises table groups by phonemes constituting phoneme-chained patterns identified by phoneme-chained pattern IDs;

  each of said table groups comprises tables identified by said predetermined pitch band IDs; and

  each record of each of said tables comprises one of pitch waveform IDs of pitch waveforms of a phoneme-chained pattern and a pitch band associated with said each of said tables and a pitch waveform associated with said one of said pitch waveform IDs.

6. A database as defined in claim 3, wherein all of the pitch waveform data in the database have a same phase characteristic.
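
One way to picture the layouts of claims 4 and 5 is as nested dictionaries. This is a rough sketch, not the patent's storage format: the pattern ID `"ama"`, the band numbers, the sample values, and in particular the encoding of each pointer entry as a `(phoneme, pitch waveform ID)` pair are assumptions introduced so the lookup can choose the right phoneme table group.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: each pointer entry names the phoneme table
# group and the pitch waveform ID inside it.
PitchRef = Tuple[str, int]

# Claim 4 layout: one table per phoneme-chained pattern,
# one record per pitch band ID.
pointer_tables: Dict[str, Dict[int, List[PitchRef]]] = {
    "ama": {
        1: [("a", 0), ("a", 0), ("m", 0), ("a", 1)],
        2: [("a", 0), ("m", 0), ("a", 0)],
    },
}

# Claim 5 layout: phoneme -> pitch band ID -> pitch waveform ID -> samples.
waveform_tables: Dict[str, Dict[int, Dict[int, List[float]]]] = {
    "a": {1: {0: [0.3, -0.3], 1: [0.2, -0.1]}, 2: {0: [0.6, -0.6]}},
    "m": {1: {0: [0.05, 0.0]}, 2: {0: [0.04, 0.0]}},
}


def lookup_segment(pattern_id: str, band_id: int) -> List[float]:
    """Fetch and combine the pitch waveforms for one pattern and band."""
    samples: List[float] = []
    for phoneme, wid in pointer_tables[pattern_id][band_id]:
        samples.extend(waveform_tables[phoneme][band_id][wid])
    return samples
```

Keying the waveform tables by pitch band means the same pitch waveform ID can refer to different data in different bands, which matches the claim's requirement that the ID and the pitch band ID together find the data.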

7. A database for use in a system for synthesizing a speech by concatenating some of predetermined voice segments, the database including:
- a first table for associating each of said predetermined voice segments with waveform IDs of pitch and voiceless sound waveforms which, when combined in the listed order of said waveform IDs, constitute a waveform of said each of said predetermined voice segments; and
  
  a second table for associating each voiceless sound waveform ID with voiceless sound waveform data identified by said each voiceless sound waveform ID, wherein voice segments containing closely similar voiceless sound waveforms have an identical waveform ID assigned to said closely similar voiceless sound waveforms in said first table, and wherein said second table is obtained by collecting said voiceless sound waveforms from said predetermined voice segments;
  
  classifying all of said voiceless sound waveforms into groups of closely similar voiceless sound waveforms; and
  
  selecting one of said closely similar voiceless sound waveforms in each of said groups for said second table.

8. A method of making a database for use in a system for synthesizing a speech by concatenating predetermined voice segments, the method comprising the steps of:
- dividing each of said predetermined voice segments into pitch waveforms;
  
  classifying all of the pitch waveforms into groups of very similar pitch waveforms;
  
  selecting one of said very similar pitch waveforms in each of said groups;
  
  assigning a pitch waveform ID to said selected pitch waveform of each of said groups;
  
  creating a first table which, for each of said groups, has a record comprising said pitch waveform ID and data of said selected pitch waveform; and
  
  creating a second table whose record IDs comprise the IDs of said predetermined voice segments, each record of said second table containing pitch waveform IDs which, when combined in the listed order of said pitch waveform IDs, constitutes a waveform identified by said record ID.

9. A method as defined in claim 8, wherein said step of classifying all of the pitch waveforms comprises the step of classifying all of the pitch waveforms by spectrum parameter of each of said pitch waveforms.

10. A method as defined in claim 8, wherein said step of selecting one of said very similar pitch waveforms in each of said groups comprises the step of selecting a pitch waveform of the largest power in each of said groups.

11. A method as defined in claim 8, wherein said step of selecting one of said very similar pitch waveforms in each of said groups is achieved such that all of the selected pitch waveforms have the same phase characteristic.

12. A method as defined in claim 8, wherein said step of creating a first table comprises using the data of only the respective selected pitch waveforms in the records for the respective groups, thereby excluding from the database pitch waveforms very similar to the selected pitch waveforms and grouped therewith.

13. A method as defined in claim 12, wherein said step of assigning a pitch waveform ID comprises assigning said pitch waveform ID only to the one selected pitch waveform of each of said groups.
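
The steps of claim 8 can be sketched in a few lines of Python. Several parts are assumptions and not taken from the claims: the input is taken to be already divided into equal-length pitch waveforms, grouping is a greedy pass that compares each waveform with the first member of each group, similarity is Euclidean distance on raw samples under a hypothetical `threshold` (claim 9 classifies by spectrum parameter instead), and the name `build_database` is invented. Choosing the largest-power member as the representative follows claim 10. Note that claim 8 calls the waveform table the first table and the pointer table the second, the reverse of claim 1.

```python
from typing import Dict, List, Tuple

Waveform = List[float]


def power(w: Waveform) -> float:
    """Sum of squared samples of one pitch waveform."""
    return sum(x * x for x in w)


def distance(a: Waveform, b: Waveform) -> float:
    """Euclidean distance; assumes both waveforms have equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_database(
    segments: Dict[str, List[Waveform]], threshold: float
) -> Tuple[Dict[str, List[int]], Dict[int, Waveform]]:
    """Return (pointer table, pitch waveform table) for the segments."""
    groups: List[List[Waveform]] = []
    pointer_table: Dict[str, List[int]] = {}
    for seg_id, pitch_waveforms in segments.items():
        ids: List[int] = []
        for w in pitch_waveforms:
            # Classify: join the first group whose seed is close enough.
            for gid, members in enumerate(groups):
                if distance(w, members[0]) <= threshold:
                    members.append(w)
                    break
            else:
                gid = len(groups)  # no match: start a new group
                groups.append([w])
            ids.append(gid)
        pointer_table[seg_id] = ids
    # Select one representative per group (largest power, as in claim 10);
    # the group index doubles as the pitch waveform ID.
    waveform_table = {gid: max(m, key=power) for gid, m in enumerate(groups)}
    return pointer_table, waveform_table
```

Only the representatives are stored, so every other member of a group is dropped from the database and is reached through the shared ID, as claims 12 and 13 describe.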

14. A system for synthesizing a speech by concatenating some of predetermined voice segments, comprising:
- means for determining IDs of necessary ones of said predetermined voice segments necessary for said speech;
  
  means for associating each of said determined ID with pitch waveform IDs the pitch waveforms of which, when combined in the listed order of said pitch waveform IDs, constitute a waveform identified by said each of said determined IDs;
  
  means for obtaining pitch waveforms associated with said pitch waveform IDs, including a pitch waveform table created by dividing each of said predetermined voice segments into pitch waveforms;
  
  classifying all of the pitch waveforms into groups of very similar pitch waveforms; and
  
  selecting one of said very similar pitch waveforms in each of said groups;
  
  means for combining said obtained pitch waveforms to form said necessary voice segments; and
  
  means for combining said necessary voice segments to yield said speech.

15. A system for synthesizing a speech by concatenating some of predetermined voice segments each defined by a phoneme-chained pattern and a pitch band, comprising:
- means for determining an ID and a pitch band of each of necessary ones of said predetermined voice segments necessary for said speech,

  means for associating a combination of said determined ID and said determined pitch band with pitch waveform IDs the pitch waveforms of which, when combined in the listed order of said pitch waveform IDs, constitute a waveform identified by said determined ID and said determined pitch band;
  
  means for obtaining pitch waveforms associated with said pitch waveform IDs and said determined pitch band, including a set of pitch waveforms obtained by dividing each of said predetermined voice segments into pitch waveforms;
  
  classifying all of said divided pitch waveforms by phoneme and pitch band into groups of very similar pitch waveforms; and
  
  selecting one of said very similar pitch waveforms in each of said groups for said set;
  
  means for combining said obtained pitch waveforms to form said necessary voice segments; and
  
  means for combining said necessary voice segments to yield said speech.

Current Assignee: Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Arai, Yasuhiko; Nishimura, Hirofumi; Minowa, Toshimitsu
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Azad, Abul K.
Application Number: US08/985,899
Time in Patent Office: 1,026 Days
Field of Search: 704/205, 704/207, 704/258, 704/268, 704/267, 707/100
US Class Current: 704/258
CPC Class Codes: G10L 13/07 Concatenation rules
