Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancing method, and speech recording method
First Claim
1. A speech enhancement apparatus that corrects and outputs unclear portions of input speech data, the speech enhancement apparatus comprising:
- a waveform-feature-quantity calculating unit that calculates a waveform feature quantity of the speech data for each phoneme, the speech data being input along with phoneme boundary data that splits the speech data into phonemes;
a correction determining unit that determines a necessity of correction of the speech data for each phoneme, based on the waveform feature quantity calculated by the waveform-feature-quantity calculating unit; and
a waveform correcting unit that corrects the speech data, the necessity of correction thereof is determined by the correction determining unit, for each phoneme by using waveform data that is prior stored in a phonemewise-waveform-data storage unit.
1 Assignment
0 Petitions
Accused Products
Abstract
To automatically detect and automatically correct in a reproduced speech, defective portions related to plosives such as existence or absence of plosive portions, phoneme lengths of aspirated portions that continue after the plosive portions or defective portions related to amplitude variations of fricatives. Speech wherein consonants and unvoiced vowels are unclear and discordant is input into a speech enhancement apparatus according to the present invention. In the speech enhancement apparatus, the speech is split into phonemes and each phoneme is classified into any one of an unvoiced plosive, a voiced plosive, an unvoiced fricative, a voiced fricative, an affricate, and an unvoiced vowel. Each phoneme is corrected according to a determination of necessity of correction of each phoneme to obtain an output of the speech wherein the consonants and the unvoiced vowels are clear and not discordant.
-
Citations
28 Claims
-
1. A speech enhancement apparatus that corrects and outputs unclear portions of input speech data, the speech enhancement apparatus comprising:
-
a waveform-feature-quantity calculating unit that calculates a waveform feature quantity of the speech data for each phoneme, the speech data being input along with phoneme boundary data that splits the speech data into phonemes; a correction determining unit that determines a necessity of correction of the speech data for each phoneme, based on the waveform feature quantity calculated by the waveform-feature-quantity calculating unit; and a waveform correcting unit that corrects the speech data, the necessity of correction thereof is determined by the correction determining unit, for each phoneme by using waveform data that is prior stored in a phonemewise-waveform-data storage unit. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
-
-
13. A speech recording apparatus that records input speech data in a phonemewise-waveform-data storage unit, the speech recording apparatus comprising:
-
a phoneme-identification-data output unit that assigns phoneme identification data to the speech data, based on the input speech data and a phoneme string that is output by carrying out a language process on text data of the speech data, determines boundaries of the phoneme identification data, and outputs boundary data of the phoneme identification data as the phoneme boundary data; a waveform-feature-quantity calculating unit that calculates a waveform feature quantity of the speech data for each phoneme, the speech data being input along with the boundary data of the phoneme identification data output by the phoneme-identification-data output unit; a condition sufficiency determining unit that determines whether the speech data satisfies predetermined conditions for each phoneme, based on the waveform feature quantity calculated by the waveform-feature-quantity calculating unit; and a phonemewise-waveform-data recording unit that records in the phonemewise-waveform-data storage unit, the speech data of each phoneme that is determined to be satisfied the predetermined conditions, based on a determination by the condition sufficiency determining unit.
-
-
14. A computer-readable recording medium that stores therein a speech enhancing program that causes a computer to correct and output unclear portions of input speech data, the speech enhancing program causing the computer to execute:
-
calculating a waveform feature quantity of the speech data for each phoneme, the speech data being input along with phoneme boundary data that splits the speech data into phonemes; determining a necessity of correction of the speech data for each phoneme, based on the waveform feature quantity calculated in calculating the waveform-feature-quantity; and correcting the speech data, the necessity of correction thereof is determined in the determining, for each phoneme by using waveform data that is prior stored in a phonemewise-waveform-data storage unit. - View Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25)
-
-
26. A computer-readable recording medium that stores therein a speech recording program that causes a computer to record input speech data in a phonemewise-waveform-data storage unit, the speech recording program causing the computer to execute:
-
assigning phoneme identification data to the speech data, based on the input speech data and a phoneme string that is output by carrying out a language process on text data of the speech data, determining boundaries of the phoneme identification data, and outputting boundary data of the phoneme identification data as the phoneme boundary data; calculating a waveform feature quantity of the speech data for each phoneme, the speech data being input along with the boundary data of the phoneme identification data output from the outputting; determining whether the speech data satisfies predetermined conditions for each phoneme, based on the waveform feature quantity calculated in calculating; and recording in the phonemewise-waveform-data storage unit, the speech data of each phoneme that is determined to be satisfied the predetermined conditions, based on a determination in determining.
-
-
27. A speech enhancing method that corrects and outputs unclear portions of input speech data, the speech enhancing method comprising:
-
calculating a waveform feature quantity of the speech data for each phoneme, the speech data being input along with phoneme boundary data that splits the speech data into phonemes; determining a necessity of correction of the speech data for each phoneme, based on the waveform feature quantity calculated in calculating; and correcting the speech data, the necessity of correction thereof is determined in determining, for each phoneme by using waveform data that is prior stored in a phonemewise-waveform-data storage unit.
-
-
28. A speech recording method that corrects and outputs unclear portions of input speech data, the speech recording method comprising:
-
assigning phoneme identification data to the speech data, based on the input speech data and a phoneme string that is output by carrying out a language process on text data of the speech data, determining boundaries of the phoneme identification data, and outputting boundary data of the phoneme identification data as the phoneme boundary data; calculating a waveform feature quantity of the speech data for each phoneme, the speech data being input along with the boundary data of the phoneme identification data output from the outputting; determining whether the speech data satisfies predetermined conditions for each phoneme, based on the waveform feature quantity calculated in calculating; and recording in the phonemewise-waveform-data storage unit, the speech data of each phoneme that is determined to be satisfied the predetermined conditions, based on a determination in the determining.
-
Specification