Speech synthesis using multi-mode coding with a speech segment dictionary

US 7,092,878 B1
Filed: 08/01/2000
Issued: 08/15/2006
Est. Priority Date: 08/03/1999
Status: Expired due to Fees

Abstract

Speech segment data are encoded in accordance with their respective optimum encoding schemes. The speech segment data thus encoded are registered in a speech segment dictionary along with information specifying the encoding methods used in the encoding.
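The idea of registering each encoded segment together with information identifying its encoding method can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Entry` record, the `CODECS` table, and the two toy codecs (`pcm16`, `pcm8`) are all assumptions introduced here.

```python
import struct
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    method: str     # identifies the encoding scheme used for this segment
    payload: bytes  # the encoded speech segment

def enc_pcm16(samples):
    # Lossless: store each sample as a little-endian 16-bit integer.
    return struct.pack(f"<{len(samples)}h", *samples)

def dec_pcm16(payload):
    return list(struct.unpack(f"<{len(payload) // 2}h", payload))

def enc_pcm8(samples):
    # Lossy: keep only the top 8 bits of each 16-bit sample.
    return struct.pack(f"{len(samples)}b", *(s >> 8 for s in samples))

def dec_pcm8(payload):
    return [b << 8 for b in struct.unpack(f"{len(payload)}b", payload)]

# Hypothetical codec table: method name -> (encoder, decoder).
CODECS = {"pcm16": (enc_pcm16, dec_pcm16), "pcm8": (enc_pcm8, dec_pcm8)}

def register(dictionary, key, samples, method):
    """Encode a segment and store it along with its method identifier."""
    encode, _ = CODECS[method]
    dictionary[key] = Entry(method, encode(samples))

def lookup(dictionary, key):
    """Decode a segment using the method recorded in its entry."""
    entry = dictionary[key]
    _, decode = CODECS[entry.method]
    return decode(entry.payload)
```

Because the method identifier travels with each entry, a synthesizer reading the dictionary can decode every segment without knowing in advance which scheme was chosen for it.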

10 Claims

1. A speech information processing method of generating a speech segment dictionary for holding a plurality of speech segments, comprising:
- a first encoding step of encoding a speech segment;
  
  a calculation step of calculating an encoding distortion produced at said first encoding step;
  
  a storage step of storing the encoded speech segment encoded in said first encoding step in the speech segment dictionary, in a case where the encoding distortion produced at said first encoding step is less than a predetermined value;
  
  a second encoding step of encoding the speech segment, in a case where the encoding distortion produced at said first encoding step is not less than the predetermined threshold value; and
  
  a storing step of storing the encoded speech segment encoded in said second encoding step in the speech segment dictionary.
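The two-stage flow recited in claim 1 can be illustrated with a short Python sketch. It is not taken from the patent: the coarse quantizer, the lossless fallback, the mean-squared-error distortion measure, and the names `build_entry`, `COARSE`, and `RAW` are assumptions chosen for illustration only.

```python
def mse(original, decoded):
    """Mean squared error, used here as the (assumed) distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(original, decoded)) / len(original)

STEP = 64  # hypothetical quantizer step size

def coarse_encode(samples):
    return [round(s / STEP) for s in samples]

def coarse_decode(codes):
    return [c * STEP for c in codes]

def raw_encode(samples):
    return list(samples)  # lossless fallback

def raw_decode(codes):
    return list(codes)

# Each scheme is a (name, encoder, decoder) triple.
COARSE = ("coarse", coarse_encode, coarse_decode)
RAW = ("raw", raw_encode, raw_decode)

def build_entry(samples, first, second, threshold):
    """Return (method name, encoded data) for one speech segment."""
    name, encode, decode = first
    coded = encode(samples)                       # first encoding step
    distortion = mse(samples, decode(coded))      # calculation step
    if distortion < threshold:
        return name, coded                        # store the first encoding
    name, encode, _ = second
    return name, encode(samples)                  # second encoding step, then store

segment = [10, 70, 130]
entry_loose = build_entry(segment, COARSE, RAW, threshold=100.0)
entry_strict = build_entry(segment, COARSE, RAW, threshold=10.0)
```

With the loose threshold the coarse encoding is kept; with the strict threshold the same segment falls through to the second scheme.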

2. A speech information processing method of generating a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
- a construction step of constructing quantization code books using speech segments stored in a speech database;
  
  an encoding step of encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database; and
  
  a storage step of storing in the speech segment dictionary, the encoded speech segments that were encoded in said encoding step.

3. A speech information processing method of generating a speech segment dictionary for holding a plurality of speech segments, comprising:
- a selection step of selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
  
  an encoding step of encoding the speech segment by using the selected encoding method; and
  
  a storage step of storing the encoded speech segment in a speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.

4. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of speech segments, comprising:
- selecting means for selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
  
  encoding means for encoding the speech segment by using the selected encoding method;
  
  calculation means for calculating an encoding distortion produced by said encoding means;
  
  selection means for selecting an encoding method of the plurality of encoding methods in which the encoding distortion is smallest; and
  
  storage means for storing the encoded speech segment encoded using the encoding method selected by said selection means, in the speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.
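Selecting the encoding method with the smallest distortion, as recited in claim 4, might look like the sketch below. It is an illustration under stated assumptions rather than the claimed apparatus: samples are assumed to be floats in [-1, 1], distortion is assumed to be mean squared error, μ = 255 is an assumed constant, and only two candidate codecs (8-bit μ-law and 8-bit uniform scalar quantization) are shown. Linear predictive coding is omitted for brevity.

```python
import math

MU = 255.0  # common mu-law constant; assumed here, not stated in the claim

def mulaw_encode(x):
    """Compand x in [-1, 1] with the mu-law curve, then quantize to 8 bits."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))

def mulaw_decode(code):
    """Invert the 8-bit quantization, then expand with the inverse curve."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def uniform_encode(x):
    """Uniform 8-bit scalar quantization of x in [-1, 1]."""
    return int(round((x + 1.0) / 2.0 * 255))

def uniform_decode(code):
    return code / 255 * 2.0 - 1.0

# Hypothetical candidate table: method name -> (encoder, decoder).
CODECS = {
    "mulaw8": (mulaw_encode, mulaw_decode),
    "uniform8": (uniform_encode, uniform_decode),
}

def mse(original, decoded):
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)

def select_method(segment, codecs=CODECS):
    """Encode with every candidate; return (name, distortion, codes) of the best."""
    best = None
    for name, (encode, decode) in codecs.items():
        codes = [encode(x) for x in segment]
        distortion = mse(segment, [decode(c) for c in codes])
        if best is None or distortion < best[1]:
            best = (name, distortion, codes)
    return best

quiet = [0.001, -0.002, 0.003, -0.001]
loud = [0.9, -0.8, 0.95, -0.85]
quiet_choice = select_method(quiet)[0]
loud_choice = select_method(loud)[0]
```

In this toy setup the low-amplitude segment favors μ-law, whose quantization steps are finest near zero, while the high-amplitude segment favors the uniform quantizer.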

5. A speech information processing method of synthesizing speech by using a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
- a construction step of constructing quantization code books using speech segments stored in a speech database;
  
  an encoding step of encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database;
  
  a storage step of storing in the speech segment dictionary, the encoded speech segments that were encoded in said encoding step; and
  
  a decoding step of decoding the encoded speech segments by using the quantization code books constructed in said construction step.
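The construct, encode, store, and decode sequence of claim 5 can be sketched with a scalar quantization code book trained on the same segments it later encodes. The training procedure below (a few Lloyd-style iterations in one dimension) is an assumption made for illustration; the claim does not specify how the code book is constructed, and the function names are hypothetical.

```python
def nearest(book, sample):
    """Index of the code book entry closest to the sample."""
    return min(range(len(book)), key=lambda i: abs(book[i] - sample))

def build_codebook(segments, size, iterations=10):
    """Construct a scalar code book from the segments in the speech database."""
    samples = sorted(s for segment in segments for s in segment)
    # Initialize with evenly spaced picks from the sorted samples.
    book = [samples[(2 * i + 1) * len(samples) // (2 * size)] for i in range(size)]
    for _ in range(iterations):
        buckets = [[] for _ in book]
        for s in samples:
            buckets[nearest(book, s)].append(s)
        # Move each entry to the mean of its bucket; keep it if the bucket is empty.
        book = [sum(b) / len(b) if b else c for b, c in zip(buckets, book)]
    return book

def encode(segment, book):
    return [nearest(book, s) for s in segment]

def decode(codes, book):
    return [book[i] for i in codes]

# Toy "speech database" with two segments.
database = {"a": [0, 1, 2], "i": [100, 101, 102]}
book = build_codebook(database.values(), size=2)
dictionary = {key: encode(seg, book) for key, seg in database.items()}
restored = decode(dictionary["i"], book)
```

Since the code book is derived from the stored segments themselves, decoding needs the same code book that was built in the construction step, which is why the sketch keeps `book` alongside `dictionary`.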

6. A speech information processing method of synthesizing speech by using a speech segment dictionary for holding a plurality of speech segments, comprising:
- a selection step of selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
  
  an encoding step of encoding the speech segment by using the selected encoding method; and
  
  a storage step of storing the encoded speech segment in a speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.

7. A speech information processing apparatus for synthesizing speech by using a speech segment dictionary for holding a plurality of speech segments, comprising:
- decoding means for decoding the speech segment by using a decoding step of decoding the speech segment by using a plurality of decoding methods for decoding the speech segment;
  
  calculation means for calculating a decoding distortion produced by said decoding means;
  
  selection means for selecting a decoding method of the plurality of decoding methods in which the decoding distortion is smallest; and
  
  speech synthesizing means for synthesizing speech on the basis of the decoded speech segment decoded by the decoding method selected by said selection means, wherein the selected decoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.

8. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of speech segments, comprising:
- first encoding means for encoding a speech segment;
  
  calculating means for calculating an encoding distortion produced by said first encoding means;
  
  storage means for storing the encoded speech segment encoded by said first encoding means in the speech segment dictionary, in a case where the encoding distortion produced by said first encoding means is less than a predetermined value;
  
  second encoding means for encoding the speech segment, in a case where the encoding distortion produced by said first encoding means is not less than the predetermined threshold value; and
  
  storage means for storing the encoded speech segment encoded by said second encoding means in the speech segment dictionary.

9. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
- construction means for constructing quantization code books using one or more speech segments stored in a speech database;
  
  encoding means for encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database; and
  
  storage means for storing in the speech segment dictionary, the encoded speech segments that were encoded by said encoding means.

10. A speech information processing apparatus for synthesizing speech by using a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
- construction means for constructing quantization code books using speech segments stored in a speech database;
  
  encoding means for encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database;
  
  storage means for storing in the speech segment dictionary, the encoded speech segments that were encoded by said encoding means; and
  
  decoding means for decoding the encoded speech segments by using the quantization code books constructed by said construction means.

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Yamada, Masayuki
Primary Examiner(s): Chawan, Vijay B.
Application Number: US09/630,356
Time in Patent Office: 2,205 Days
Field of Search: 704/200-226, 704/230, 704/270
US Class Current: 704/230
CPC Class Codes: G10L 13/06 Elementary speech units use...