SPEECH ANALYSIS DEVICE, SPEECH ANALYSIS AND SYNTHESIS DEVICE, CORRECTION RULE INFORMATION GENERATION DEVICE, SPEECH ANALYSIS SYSTEM, SPEECH ANALYSIS METHOD, CORRECTION RULE INFORMATION GENERATION METHOD, AND PROGRAM
Abstract
A speech analysis device which accurately analyzes an aperiodic component included in speech in a practical environment where there is background noise includes: a frequency band division unit which divides, into bandpass signals each associated with a corresponding one of frequency bands, an input signal representing a mixed sound of background noise and speech; a noise interval identification unit which identifies a noise interval and a speech interval of the input signal; an SNR calculation unit which calculates an SN ratio; a correlation function calculation unit which calculates an autocorrelation function of each bandpass signal; a correction amount determination unit which determines a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and an aperiodic component ratio calculation unit which calculates, for each frequency band, an aperiodic component ratio of the aperiodic component, based on the determined correction amount and the calculated autocorrelation function.
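The processing chain summarized in the abstract can be sketched for a single frequency band as follows. This is a minimal illustration, not the patented implementation. The rule table `RULE`, the piecewise-linear lookup, and the mapping "aperiodic ratio = 1 minus the corrected autocorrelation" are assumptions made for the example, as are all function names. Band division and interval identification are assumed to have been done already, so the inputs are one band's samples from the noise interval and from the speech interval.

```python
import math
import random

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def sn_ratio_db(speech_interval, noise_interval):
    """SN ratio of one band in dB: speech-interval power over noise-interval power."""
    return 10.0 * math.log10(mean_power(speech_interval) / mean_power(noise_interval))

def autocorr_at(x, lag):
    """Normalized autocorrelation of x at one lag (close to 1.0 for a periodic signal)."""
    a, b = x[:len(x) - lag], x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0

# Hypothetical correction rule: (SN ratio in dB, correction amount) pairs.
# The values are illustrative only; they are not taken from the source.
RULE = [(0.0, 0.5), (10.0, 0.1), (20.0, 0.01), (40.0, 0.0)]

def correction_amount(snr_db, rule=RULE):
    """Piecewise-linear lookup of the correction amount for a given SN ratio."""
    if snr_db <= rule[0][0]:
        return rule[0][1]
    for (x0, y0), (x1, y1) in zip(rule, rule[1:]):
        if snr_db <= x1:
            return y0 + (y1 - y0) * (snr_db - x0) / (x1 - x0)
    return rule[-1][1]

def aperiodic_ratio(autocorr_value, correction):
    """Assumed mapping: 1 minus the corrected autocorrelation, clamped to [0, 1]."""
    corrected = autocorr_value + correction
    return min(1.0, max(0.0, 1.0 - corrected))

# Usage: a periodic "speech" band (period of 50 samples) buried in Gaussian noise.
rng = random.Random(0)
period = 50
noise_interval = [rng.gauss(0.0, 0.5) for _ in range(2000)]
speech_interval = [math.sin(2.0 * math.pi * i / period) + rng.gauss(0.0, 0.5)
                   for i in range(2000)]

snr = sn_ratio_db(speech_interval, noise_interval)
r = autocorr_at(speech_interval, period)
ratio = aperiodic_ratio(r, correction_amount(snr))
print("SN ratio (dB):", snr)
print("autocorrelation at pitch lag:", r)
print("aperiodic component ratio:", ratio)
```

Background noise lowers the measured autocorrelation, which would overstate the aperiodic component; in this sketch the SN-ratio-dependent correction raises the autocorrelation again before the ratio is derived.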
15 Claims
1. A speech analysis device which analyzes an aperiodic component included in speech from an input signal representing a mixed sound of background noise and the speech, said speech analysis device comprising:
a frequency band division unit configured to divide the input signal into bandpass signals each associated with a corresponding one of frequency bands;
a noise interval identification unit configured to identify a noise interval in which the input signal represents only the background noise and a speech interval in which the input signal represents the background noise and the speech;
an SNR calculation unit configured to calculate an SN ratio which is a ratio between power of each of the bandpass signals divided from the input signal in the speech interval and power of each of the bandpass signals divided from the input signal in the noise interval;
a correlation function calculation unit configured to calculate an autocorrelation function of each of the bandpass signals divided from the input signal in the speech interval;
a correction amount determination unit configured to determine a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and
an aperiodic component ratio calculation unit configured to calculate, for each of the frequency bands, an aperiodic component ratio of the aperiodic component included in the speech, based on the determined correction amount and the calculated autocorrelation function.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A speech analysis and synthesis device which analyzes an aperiodic component included in first speech from a first input signal representing a mixed sound of background noise and the first speech, and synthesizes the analyzed aperiodic component into second speech represented by a second input signal, said speech analysis and synthesis device comprising:
a frequency band division unit configured to divide the first input signal into bandpass signals each associated with a corresponding one of frequency bands;
a noise interval identification unit configured to identify a noise interval in which the first input signal represents only the background noise and a speech interval in which the first input signal represents the background noise and the first speech;
an SNR calculation unit configured to calculate an SN ratio which is a ratio between power of each of the bandpass signals divided from the first input signal in the speech interval and power of each of the bandpass signals divided from the first input signal in the noise interval;
a correlation function calculation unit configured to calculate an autocorrelation function of each of the bandpass signals divided from the first input signal in the speech interval;
a correction amount determination unit configured to determine a correction amount for an aperiodic component ratio, based on the calculated SN ratio;
an aperiodic component ratio calculation unit configured to calculate, for each of the frequency bands, an aperiodic component ratio of the aperiodic component included in the first speech, based on the determined correction amount and the calculated autocorrelation function;
an aperiodic component spectrum calculation unit configured to calculate an aperiodic component spectrum indicating a frequency distribution of the aperiodic component, based on the aperiodic component ratio calculated for each of the frequency bands;
a vocal tract characteristics analysis unit configured to analyze vocal tract characteristics for the second speech;
an inverse filtering unit configured to extract a voicing-source waveform of the second speech by performing inverse filtering on the second speech using characteristics inverse to the analyzed vocal tract characteristics;
a voicing-source modeling unit configured to model the extracted voicing-source waveform; and
a synthesis unit configured to synthesize speech based on the analyzed vocal tract characteristics, the modeled voicing-source characteristics, and the calculated aperiodic component spectrum.
10. A correction rule information generation device comprising:
a frequency band division unit configured to divide, into same bandpass signals each associated with a corresponding one of divided bands, an input signal representing speech and an other input signal representing noise, respectively, the divided bands being frequency bands;
an SNR calculation unit configured to calculate, for each of the divided bands, an SN ratio which is a ratio between power of the speech and power of the noise in each of different time intervals, based on each of the bandpass signals obtained through the division;
a correlation function calculation unit configured to calculate, for each of the divided bands, an autocorrelation value of the speech and an autocorrelation value of the noise in each of the different time intervals, based on each of the bandpass signals obtained through the division; and
a correction rule information generating unit configured to generate, for each of the divided bands, correction rule information, based on the calculated SN ratio, the autocorrelation value of the speech, and the autocorrelation value of the noise, the correction rule information indicating a correspondence of a difference between the autocorrelation value of the speech and the autocorrelation value of the noise to the SN ratio.
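The generation of correction rule information recited in claim 10 can be illustrated for one divided band as below. This is a hedged sketch, not the claimed device: the fixed frame length standing in for the "different time intervals", the single lag at which the autocorrelation value is taken, and the representation of the rule as a sorted list of (SN ratio, difference) pairs are all assumptions, as are the function names.

```python
import math
import random

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def autocorr_at(x, lag):
    """Normalized autocorrelation of x at one lag."""
    a, b = x[:len(x) - lag], x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0

def generate_correction_rule(speech_band, noise_band, frame_len, lag):
    """Return (SN ratio in dB, autocorrelation difference) pairs, one per frame.

    speech_band and noise_band are the same divided band of the speech signal
    and of the noise signal. Each frame plays the role of one time interval.
    """
    pairs = []
    usable = min(len(speech_band), len(noise_band))
    for start in range(0, usable - frame_len + 1, frame_len):
        s = speech_band[start:start + frame_len]
        n = noise_band[start:start + frame_len]
        ps, pn = mean_power(s), mean_power(n)
        if ps == 0.0 or pn == 0.0:
            continue  # the SN ratio is undefined for a silent frame
        snr_db = 10.0 * math.log10(ps / pn)
        diff = autocorr_at(s, lag) - autocorr_at(n, lag)
        pairs.append((snr_db, diff))
    pairs.sort()  # order by SN ratio so the list can be used as a lookup table
    return pairs

# Usage: a periodic band whose level changes per frame, against Gaussian noise.
rng = random.Random(0)
frame_len, period = 500, 50
levels = [4.0, 0.2, 1.0, 0.5, 2.0]
speech_band = [levels[i // frame_len] * math.sin(2.0 * math.pi * i / period)
               for i in range(frame_len * len(levels))]
noise_band = [rng.gauss(0.0, 0.3) for _ in range(frame_len * len(levels))]

rule = generate_correction_rule(speech_band, noise_band, frame_len, period)
for snr_db, diff in rule:
    print(round(snr_db, 1), round(diff, 3))
```

A device that consumes this rule would look up or interpolate the entry nearest to its measured SN ratio; how the pairs are fitted or smoothed is left open here.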
11. A speech analysis system comprising:
a speech analysis device which analyzes an aperiodic component included in speech from an input signal representing a mixed sound of background noise and the speech; and
a correction rule information generating device,
wherein said speech analysis device includes:
a frequency band division unit configured to divide the input signal into bandpass signals each associated with a corresponding one of frequency bands;
a noise interval identification unit configured to identify a noise interval in which the input signal represents only the background noise and a speech interval in which the input signal represents the background noise and the speech;
an SNR calculation unit configured to calculate an SN ratio which is a ratio between power of each of the bandpass signals divided from the input signal in the speech interval and power of each of the bandpass signals divided from the input signal in the noise interval;
a correlation function calculation unit configured to calculate an autocorrelation function of each of the bandpass signals divided from the input signal in the speech interval;
a correction amount determination unit configured to determine a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and
an aperiodic component ratio calculation unit configured to calculate, for each of the frequency bands, an aperiodic component ratio of the aperiodic component included in the speech, based on the determined correction amount and the calculated autocorrelation function,
said correction rule information generating device includes:
a frequency band division unit configured to divide, into same bandpass signals each associated with a corresponding one of divided bands, an input signal representing speech and an other input signal representing noise, respectively, the divided bands being frequency bands;
an SNR calculation unit configured to calculate, for each of the divided bands, an SN ratio which is a ratio between power of the speech and power of the noise in each of different time intervals, based on each of the bandpass signals obtained through the division;
a correlation function calculation unit configured to calculate, for each of the divided bands, an autocorrelation value of the speech and an autocorrelation value of the noise in each of the different time intervals, based on each of the bandpass signals obtained through the division; and
a correction rule information generating unit configured to generate, for each of the divided bands, correction rule information, based on the calculated SN ratio, the autocorrelation value of the speech, and the autocorrelation value of the noise, the correction rule information indicating a correspondence of a difference between the autocorrelation value of the speech and the autocorrelation value of the noise to the SN ratio, and
said speech analysis device refers to a correction amount corresponding to the calculated SN ratio according to the correction rule information generated by said correction rule information generating device, and determines the correction amount referred to as the correction amount for the aperiodic component ratio.
12. A speech analysis method of analyzing an aperiodic component included in speech from an input signal representing a mixed sound of background noise and the speech, said speech analysis method comprising:
dividing the input signal into bandpass signals each associated with a corresponding one of frequency bands;
identifying a noise interval in which the input signal represents only the background noise and a speech interval in which the input signal represents the background noise and the speech;
calculating an SN ratio which is a ratio between power of each of the bandpass signals divided from the input signal in the speech interval and power of each of the bandpass signals divided from the input signal in the noise interval;
calculating an autocorrelation function of each of the bandpass signals divided from the input signal in the speech interval;
determining a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and
calculating, for each of the frequency bands, an aperiodic component ratio of the aperiodic component included in the speech, based on the determined correction amount and the calculated autocorrelation function.
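The identifying step of the method above separates the input into a noise interval and a speech interval, but the claim does not say how. One simple possibility, shown here purely as an assumption, is to compare each frame's power with that of the quietest frame. The frame length, the factor of 4, and the function names are hypothetical choices for this sketch.

```python
import math
import random

def frame_powers(x, frame_len):
    """Average power of each non-overlapping frame of x."""
    return [sum(v * v for v in x[i:i + frame_len]) / frame_len
            for i in range(0, len(x) - frame_len + 1, frame_len)]

def label_intervals(x, frame_len=100, factor=4.0):
    """Label each frame 'noise' or 'speech'.

    Assumed rule: a frame whose power stays below `factor` times the power of
    the quietest frame is treated as background noise only.
    """
    powers = frame_powers(x, frame_len)
    threshold = factor * min(powers)
    return ["noise" if p < threshold else "speech" for p in powers]

# Usage: 1000 samples of background noise followed by 1000 samples of noisy "speech".
rng = random.Random(1)
noise_only = [rng.gauss(0.0, 0.1) for _ in range(1000)]
noisy_speech = [math.sin(2.0 * math.pi * i / 50) + rng.gauss(0.0, 0.1)
                for i in range(1000)]
labels = label_intervals(noise_only + noisy_speech)
print(labels)
```

The frames labelled as noise supply the denominator of the SN ratio in the next step, and the frames labelled as speech supply the numerator and the autocorrelation function.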
13. A correction rule information generating method comprising:
dividing, into same bandpass signals each associated with a corresponding one of divided bands, an input signal representing speech and an other input signal representing noise, respectively, the divided bands being frequency bands;
calculating, for each of the divided bands, an SN ratio which is a ratio between power of the speech and power of the noise in each of different time intervals, based on each of the bandpass signals obtained in said dividing;
calculating, for each of the divided bands, an autocorrelation value of the speech and an autocorrelation value of the noise in each of the different time intervals, based on each of the bandpass signals obtained in said dividing; and
generating, for each of the divided bands, correction rule information, based on the calculated SN ratio, the autocorrelation value of the speech, and the autocorrelation value of the noise, the correction rule information indicating a correspondence of a difference between the autocorrelation value of the speech and the autocorrelation value of the noise to the SN ratio.
14. A computer-executable program for analyzing an aperiodic component included in speech from an input signal representing a mixed sound of background noise and the speech, said computer-executable program causing a computer to execute:
dividing the input signal into bandpass signals each associated with a corresponding one of frequency bands;
identifying a noise interval in which the input signal represents only the background noise and a speech interval in which the input signal represents the background noise and the speech;
calculating an SN ratio which is a ratio between power of each of the bandpass signals divided from the input signal in the speech interval and power of each of the bandpass signals divided from the input signal in the noise interval;
calculating an autocorrelation function of each of the bandpass signals divided from the input signal in the speech interval;
determining a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and
calculating, for each of the frequency bands, an aperiodic component ratio of the aperiodic component included in the speech, based on the determined correction amount and the calculated autocorrelation function.
15. A program recorded on a computer-readable medium, said program causing a computer to execute:
dividing, into same bandpass signals each associated with a corresponding one of divided bands, an input signal representing speech and an other input signal representing noise, respectively, the divided bands being frequency bands;
calculating, for each of the divided bands, an SN ratio which is a ratio between power of the speech and power of the noise in each of different time intervals, based on each of the bandpass signals obtained in said dividing;
calculating, for each of the divided bands, an autocorrelation value of the speech and an autocorrelation value of the noise in each of the different time intervals, based on each of the bandpass signals obtained through the division; and
generating, for each of the divided bands, correction rule information, based on the calculated SN ratio, the autocorrelation value of the speech, and the autocorrelation value of the noise, the correction rule information indicating a correspondence of a difference between the autocorrelation value of the speech and the autocorrelation value of the noise to the SN ratio.
Specification