Speech synthesis using multi-mode coding with a speech segment dictionary
Abstract
Speech segment data are encoded in accordance with their respective optimum encoding schemes. The speech segment data thus encoded are registered in a speech segment dictionary along with information specifying the encoding methods used in the encoding.
10 Claims
-
1. A speech information processing method of generating a speech segment dictionary for holding a plurality of speech segments, comprising:
-
a first encoding step of encoding a speech segment;
a calculation step of calculating an encoding distortion produced at said first encoding step;
a storage step of storing the encoded speech segment encoded in said first encoding step in the speech segment dictionary, in a case where the encoding distortion produced at said first encoding step is less than a predetermined value;
a second encoding step of encoding the speech segment, in a case where the encoding distortion produced at said first encoding step is not less than the predetermined threshold value; and
a storing step of storing the encoded speech segment encoded in said second encoding step in the speech segment dictionary.
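The control flow of claim 1 can be sketched as follows. This is only an illustrative sketch: the claim does not name the two encoders, so the choice of 8-bit μ-law as the first encoder, 16-bit linear PCM as the second, mean squared error as the distortion measure, and the names `register_segment` and `threshold` are all assumptions made for the example.

```python
import math

MU = 255.0  # assumed companding constant for the first (hypothetical) encoder

def mu_law_encode(samples):
    """Compand samples in [-1, 1] and quantize them to 8-bit codes (0..255)."""
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))
        y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
        codes.append(int(round((y + 1.0) / 2.0 * 255)))
    return codes

def mu_law_decode(codes):
    """Invert mu_law_encode up to quantization error."""
    out = []
    for c in codes:
        y = c / 255 * 2.0 - 1.0
        out.append(math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y))
    return out

def pcm16_encode(samples):
    """Second (hypothetical) encoder: 16-bit linear PCM."""
    return [int(round(max(-1.0, min(1.0, x)) * 32767)) for x in samples]

def mse(a, b):
    """Mean squared error, used here as the encoding distortion."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def register_segment(dictionary, name, samples, threshold=1e-6):
    """Store the first encoding if its distortion is below the threshold,
    otherwise encode again with the second encoder and store that."""
    codes = mu_law_encode(samples)
    distortion = mse(samples, mu_law_decode(codes))
    if distortion < threshold:
        dictionary[name] = ("mu-law", codes)
    else:
        dictionary[name] = ("pcm16", pcm16_encode(samples))
    return dictionary[name][0]
```

Under these assumptions, a low-amplitude segment is kept in the compact first encoding, while a segment whose first-pass distortion reaches the threshold is re-encoded with the second encoder before being stored.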
-
-
2. A speech information processing method of generating a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
-
a construction step of constructing quantization code books using speech segments stored in a speech database;
an encoding step of encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database; and
a storage step of storing in the speech segment dictionary, the encoded speech segments that were encoded in said encoding step.
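The point of claim 2 is that the quantization code books are trained on the same speech segments that are then encoded with them. A minimal sketch of that idea follows, assuming a scalar code book trained with a few 1-D Lloyd iterations; the training procedure, the code book size, and the names `build_codebook`, `encode`, and `build_dictionary` are illustrative assumptions, not taken from the patent.

```python
def nearest(codebook, x):
    """Index of the code book level closest to sample x."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))

def build_codebook(database, size=4, iterations=10):
    """Construction step: train a scalar code book on every sample in the database."""
    samples = [x for segment in database.values() for x in segment]
    lo, hi = min(samples), max(samples)
    # Start from levels spread uniformly over the observed range.
    codebook = [lo + (hi - lo) * (i + 0.5) / size for i in range(size)]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for x in samples:
            cells[nearest(codebook, x)].append(x)
        # Move each level to the mean of its cell; keep empty cells unchanged.
        codebook = [sum(c) / len(c) if c else level
                    for c, level in zip(cells, codebook)]
    return codebook

def encode(codebook, segment):
    """Encoding step: replace each sample by the index of its nearest level."""
    return [nearest(codebook, x) for x in segment]

def build_dictionary(database, size=4):
    """Storage step: keep the encoded form of every database segment."""
    codebook = build_codebook(database, size)
    dictionary = {name: encode(codebook, seg) for name, seg in database.items()}
    return codebook, dictionary
```

Because the code book is fitted to the database being encoded, the quantization levels concentrate where the stored segments actually have samples.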
-
-
3. A speech information processing method of generating a speech segment dictionary for holding a plurality of speech segments, comprising:
-
a selection step of selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
an encoding step of encoding the speech segment by using the selected encoding method; and
a storage step of storing the encoded speech segment in a speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.
-
-
4. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of speech segments, comprising:
-
selecting means for selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
encoding means for encoding the speech segment by using the selected encoding method;
calculation means for calculating an encoding distortion produced by said encoding means;
selection means for selecting an encoding method of the plurality of encoding methods in which the encoding distortion is smallest; and
storage means for storing the encoded speech segment encoded using the encoding method selected by said selection means, in the speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.
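The selection by smallest encoding distortion in claim 4, together with the method information that the abstract says is registered alongside each segment, might look like the sketch below. It compares only two stand-in 8-bit encoders (μ-law companding and uniform scalar quantization); linear predictive coding is omitted for brevity, and the table `ENCODERS`, the function `select_and_store`, and the use of mean squared error are assumptions made for the example.

```python
import math

MU = 255.0

def _compress(x):
    """mu-law compander on [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def _expand(y):
    """Inverse of _compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def _to_code(v):
    """Uniformly quantize a value in [-1, 1] to an 8-bit code."""
    return int(round((max(-1.0, min(1.0, v)) + 1.0) / 2.0 * 255))

def _from_code(c):
    """Map an 8-bit code back to [-1, 1]."""
    return c / 255 * 2.0 - 1.0

# Hypothetical table of candidate methods: name -> (encoder, decoder).
ENCODERS = {
    "mu-law": (lambda s: [_to_code(_compress(x)) for x in s],
               lambda c: [_expand(_from_code(k)) for k in c]),
    "uniform8": (lambda s: [_to_code(x) for x in s],
                 lambda c: [_from_code(k) for k in c]),
}

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def select_and_store(dictionary, name, samples):
    """Encode with every candidate, keep the one with the smallest distortion,
    and store the codes together with the name of the method used."""
    results = {}
    for method, (enc, dec) in ENCODERS.items():
        codes = enc(samples)
        results[method] = (mse(samples, dec(codes)), codes)
    best = min(results, key=lambda m: results[m][0])
    dictionary[name] = {"method": best, "codes": results[best][1]}
    return best
```

With these two candidates, low-amplitude segments tend to favor μ-law (fine steps near zero) and high-amplitude segments tend to favor the uniform quantizer, so the stored method tag differs from segment to segment.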
-
-
5. A speech information processing method of synthesizing speech by using a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
-
a construction step of constructing quantization code books using speech segments stored in a speech database;
an encoding step of encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database;
a storage step of storing in the speech segment dictionary, the encoded speech segments that were encoded in said encoding step; and
a decoding step of decoding the encoded speech segments by using the quantization code books constructed in said construction step.
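The decoding step of claim 5 reuses the code books built in the construction step, so decoding reduces to a table lookup. The sketch below assumes a scalar code book and a dictionary that maps segment names to index lists; joining decoded segments by plain concatenation in `synthesize` is a simplifying assumption, since the claim does not say how decoded segments are combined.

```python
def decode_segment(codebook, indices):
    """Decoding step: look each stored index up in the code book."""
    return [codebook[i] for i in indices]

def synthesize(codebook, dictionary, names):
    """Decode the requested segments in order and join them end to end."""
    waveform = []
    for name in names:
        waveform.extend(decode_segment(codebook, dictionary[name]))
    return waveform

# Example with a hypothetical three-level code book and two stored segments.
codebook = [-0.5, 0.0, 0.5]
dictionary = {"k": [0, 1], "a": [2, 2, 1]}
waveform = synthesize(codebook, dictionary, ["k", "a"])
```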
-
-
6. A speech information processing method of synthesizing speech by using a speech segment dictionary for holding a plurality of speech segments, comprising:
-
a selection step of selecting an encoding method of encoding a speech segment from a plurality of encoding methods;
an encoding step of encoding the speech segment by using the selected encoding method; and
a storage step of storing the encoded speech segment in a speech segment dictionary, wherein the selected encoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.
-
-
7. A speech information processing apparatus for synthesizing speech by using a speech segment dictionary for holding a plurality of speech segments, comprising:
-
decoding means for decoding the speech segment by using a decoding step of decoding the speech segment by using a plurality of decoding methods for decoding the speech segment;
calculation means for calculating a decoding distortion produced by said decoding means;
selection means for selecting a decoding method of the plurality of decoding methods in which the decoding distortion is smallest; and
speech synthesizing means for synthesizing speech on the basis of the decoded speech segment decoded by the decoding method selected by said selection means, wherein the selected decoding method uses a μ-law scheme, scalar quantization, and linear predictive coding.
-
-
8. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of speech segments, comprising:
-
first encoding means for encoding a speech segment;
calculating means for calculating an encoding distortion produced by said first encoding means;
storage means for storing the encoded speech segment encoded by said first encoding means in the speech segment dictionary, in a case where the encoding distortion produced by said first encoding means is less than a predetermined value;
second encoding means for encoding the speech segment, in a case where the encoding distortion produced by said first encoding means is not less than the predetermined threshold value; and
storage means for storing the encoded speech segment encoded by said second encoding means in the speech segment dictionary.
-
-
9. A speech information processing apparatus for generating a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
-
construction means for constructing quantization code books using one or more speech segments stored in a speech database;
encoding means for encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database; and
storage means for storing in the speech segment dictionary, the encoded speech segments that were encoded by said encoding means.
-
-
10. A speech information processing apparatus for synthesizing speech by using a speech segment dictionary for holding a plurality of encoded speech segments, comprising:
-
construction means for constructing quantization code books using speech segments stored in a speech database;
encoding means for encoding the speech segments stored in the speech database using the quantization code books that were constructed using the speech segments stored in the speech database;
storage means for storing in the speech segment dictionary, the encoded speech segments that were encoded by said encoding means; and
decoding means for decoding the encoded speech segments by using the quantization code books constructed by said construction means.
-
Specification