SPEECH SEPARATING APPARATUS, SPEECH SYNTHESIZING APPARATUS, AND VOICE QUALITY CONVERSION APPARATUS
Abstract
A speech separating apparatus includes: a PARCOR calculating unit (102) that extracts vocal tract information from an input speech signal; a filter smoothing unit (103) that smooths, in a first time constant, the vocal tract information extracted by the PARCOR calculating unit (102); an inverse filtering unit (104) that calculates the filter coefficients of a filter whose frequency amplitude response is the inverse of that of the vocal tract information smoothed by the filter smoothing unit (103), and filters the input speech signal using the filter having the calculated coefficients; and a voicing source modeling unit (105) that cuts out, from the input speech signal filtered by the inverse filtering unit (104), waveforms each included in a second time constant shorter than the first time constant, and calculates voicing source information from each waveform that is cut out.
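The chain described in the abstract (PARCOR extraction, smoothing, inverse filtering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the specification: all function names, the default analysis order, and the frame-wise moving average used as the smoother are hypothetical choices made here, and the voicing source modeling step (105) is left out.

```python
# Illustrative sketch only. Function names, the analysis order, and the
# moving-average smoother are assumptions, not taken from the patent text.


def autocorr(frame, order):
    """Autocorrelation r[0..order] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson(r, order):
    """Levinson-Durbin recursion.

    Returns (a, k). a[0..order], with a[0] == 1, defines the inverse
    filter A(z) = sum_j a[j] z^-j. k holds the reflection (PARCOR-style)
    coefficients; sign conventions for PARCOR differ between texts.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop updating
            k.append(0.0)
            continue
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        km = -acc / err
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + km * prev[m - j]
        a[m] = km
        err *= 1.0 - km * km
        k.append(km)
    return a, k


def parcor_to_lpc(k):
    """Step-up recursion: reflection coefficients -> A(z) coefficients."""
    a = [1.0]
    for m, km in enumerate(k, start=1):
        prev = a + [0.0]
        a = [prev[j] + km * prev[m - j] for j in range(m + 1)]
    return a


def smooth_tracks(frames_k, width):
    """Moving average of each coefficient track over `width` frames.

    `width` plays the role of the first (longer) time constant here.
    """
    half = width // 2
    out = []
    for t in range(len(frames_k)):
        lo, hi = max(0, t - half), min(len(frames_k), t + half + 1)
        out.append([sum(f[i] for f in frames_k[lo:hi]) / (hi - lo)
                    for i in range(len(frames_k[0]))])
    return out


def inverse_filter(x, a):
    """FIR filtering with A(z): e[n] = sum_j a[j] * x[n - j]."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]


def separate(frames, order=10, width=5):
    """Per-frame analysis, smoothing, then inverse filtering.

    Filter state is not carried across frame boundaries in this sketch.
    """
    tracks = [levinson(autocorr(f, order), order)[1] for f in frames]
    smoothed = smooth_tracks(tracks, width)
    residuals = [inverse_filter(f, parcor_to_lpc(k))
                 for f, k in zip(frames, smoothed)]
    return smoothed, residuals
```

Calling `separate(frames)` on a list of equal-length sample frames returns the smoothed coefficient tracks and the inverse-filtered residual for each frame. The residual is the signal from which shorter waveforms would then be taken for voicing source modeling.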
18 Claims
1. A speech separating apparatus that separates an input speech signal into vocal tract information and voicing source information, said speech separating apparatus comprising:
a vocal tract information extracting unit configured to extract vocal tract information from the input speech signal;
a filter smoothing unit configured to smooth, in a first time constant, the vocal tract information extracted by said vocal tract information extracting unit;
an inverse filtering unit configured to calculate a filter having an inverse characteristic to a frequency response of the vocal tract information smoothed by said filter smoothing unit, and to filter the input speech signal by using the calculated filter; and
a voicing source modeling unit configured to take, from the input speech signal filtered by said inverse filtering unit, a waveform included in a second time constant shorter than the first time constant, and to calculate, for each waveform that is taken, voicing source information from the each waveform.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A speech synthesizing apparatus that generates synthesized speech by using vocal tract information and voicing source information included in an input speech signal, said speech synthesizing apparatus comprising:
a vocal tract information extracting unit configured to extract vocal tract information from the input speech signal;
a filter smoothing unit configured to smooth, in a first time constant, the vocal tract information extracted by said vocal tract information extracting unit;
an inverse filtering unit configured to calculate a filter having an inverse characteristic to a frequency response of the vocal tract information smoothed by said filter smoothing unit, and to filter the input speech signal by using the calculated filter;
a voicing source modeling unit configured to take, from the input speech signal filtered by said inverse filtering unit, a waveform included in a second time constant shorter than the first time constant, and to calculate, for each waveform that is taken, parameterized voicing source information from the each waveform; and
a synthesis unit configured to generate synthesized speech by generating a voicing source waveform by using a voicing source information parameter outputted from said voicing source modeling unit, and filtering the generated voicing source waveform by using the vocal tract information smoothed by said filter smoothing unit.
Dependent claims: 12, 13, 14
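The synthesis unit recited above generates a source waveform and then filters it with the vocal tract information. A rough sketch of that last step follows, assuming the vocal tract information has already been converted to coefficients a[0..p] of A(z) with a[0] = 1, so that synthesis is all-pole filtering by 1/A(z). The impulse train is only a hypothetical stand-in for the parameterized voicing source, whose actual parameters are not given in this section. Both function names are illustrative.

```python
# Illustrative sketch only. The impulse-train source and both function
# names are assumptions, not the source model of the patent.


def impulse_train(length, period, gain=1.0):
    """Placeholder voicing source: one pulse every `period` samples."""
    return [gain if n % period == 0 else 0.0 for n in range(length)]


def synthesis_filter(source, a):
    """All-pole filtering by 1/A(z), with A(z) = sum_j a[j] z^-j, a[0] == 1.

    y[n] = source[n] - sum_{j>=1} a[j] * y[n - j]
    """
    y = []
    for n, s in enumerate(source):
        acc = s
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y


# Usage: a single pulse through 1 / (1 - 0.9 z^-1) decays geometrically.
speech = synthesis_filter(impulse_train(4, 10), [1.0, -0.9])
```

This filter is the exact inverse of FIR filtering with the same A(z), which is why a residual obtained by inverse filtering can be turned back into speech with the same coefficients.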
15. A voice quality conversion apparatus that converts a voice quality of an input speech signal, said voice quality conversion apparatus comprising:
a vocal tract information extracting unit configured to extract vocal tract information from the input speech signal;
a filter smoothing unit configured to smooth, in a first time constant, the vocal tract information extracted by said vocal tract information extracting unit;
an inverse filtering unit configured to calculate a filter having an inverse characteristic to a frequency response of the vocal tract information smoothed by said filter smoothing unit, and to filter the input speech signal by using the calculated filter;
a voicing source modeling unit configured to take, from the input speech signal filtered by said inverse filtering unit, a waveform included in a second time constant shorter than the first time constant, and to calculate, for each waveform that is taken, parameterized voicing source information from the each waveform;
a target speech information holding unit configured to hold vocal tract information and the parameterized voicing source information on a target voice quality;
a conversion ratio input unit configured to input a conversion ratio for converting the input speech signal into the target voice quality;
a filter transformation unit configured to convert, at the conversion ratio inputted by said conversion ratio input unit, the vocal tract information smoothed by said filter smoothing unit into the vocal tract information on the target voice quality, which is held by said target speech information holding unit;
a voicing source transformation unit configured to convert, at the conversion ratio inputted by said conversion ratio input unit, the voicing source information parameterized by said voicing source modeling unit into the voicing source information on the target voice quality, which is held by said target speech information holding unit; and
a synthesis unit configured to generate synthesized speech by generating a voicing source waveform by using the parameterized voicing source information transformed by said voicing source transformation unit, and filtering the generated voicing source waveform by using the vocal tract information transformed by said filter transformation unit.
Dependent claims: 16
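One simple way to read "convert at the conversion ratio" is linear interpolation between the input parameters and the target parameters. The sketch below assumes exactly that; the claim text does not fix the interpolation rule, and the function name `morph` and the flat list representation of the parameters are hypothetical.

```python
# Illustrative sketch only. Linear interpolation is an assumed reading of
# "convert at the conversion ratio"; the name `morph` is hypothetical.


def morph(source_params, target_params, ratio):
    """Blend two parameter vectors: ratio 0 keeps the source, 1 gives the target."""
    if len(source_params) != len(target_params):
        raise ValueError("parameter vectors must have the same length")
    return [(1.0 - ratio) * s + ratio * t
            for s, t in zip(source_params, target_params)]


# Usage: move one quarter of the way from the input voice to the target.
converted = morph([0.0, 0.5], [1.0, -0.5], 0.25)
```

If the parameters are reflection coefficients with magnitude below 1 on both sides, any ratio between 0 and 1 yields values that also stay below 1 in magnitude, so the interpolated synthesis filter remains stable. Ratios outside that range carry no such guarantee.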
17. A method of separating an input speech signal into vocal tract information and voicing source information, said method comprising:
extracting vocal tract information from the input speech signal;
smoothing, in a first time constant, the vocal tract information extracted in said extracting;
calculating a filter having an inverse characteristic to a frequency response of the vocal tract information smoothed in said smoothing, and filtering the input speech signal by using the calculated filter; and
taking, from the input speech signal filtered in said calculating, a waveform included in a second time constant shorter than the first time constant, and calculating, for each waveform that is taken, voicing source information from the each waveform.
18. A program for separating an input speech signal into vocal tract information and voicing source information, said program causing a computer to execute:
extracting vocal tract information from the input speech signal;
smoothing, in a first time constant, the vocal tract information extracted in the extracting;
calculating a filter having an inverse characteristic to a frequency response of the vocal tract information smoothed in the smoothing, and filtering the input speech signal by using the calculated filter; and
taking, from the input speech signal filtered in the calculating, a waveform included in a second time constant shorter than the first time constant, and calculating, for each waveform that is taken, voicing source information from the each waveform.
Specification