Voice processing device, voice processing method, and voice processing program
First Claim
1. A voice processing device comprising:
a separation unit, implemented via a processor, configured to separate voice signals of a plurality of channels into an incoming component in each incoming direction;
a storage device configured to store a predetermined statistic and a voice recognition model for each incoming direction;
a dereverberation unit, implemented via the processor, configured to generate a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation unit;
a selection unit, implemented via the processor, configured to select a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation unit;
an updating unit, implemented via the processor, configured to update the voice recognition model on the basis of the statistic selected by the selection unit; and
a voice recognition unit, implemented via the processor, configured to recognize a voice of the incoming component using the updated voice recognition model,
wherein the dereverberation unit is configured to:
calculate a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
estimate the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
calculate the Wiener gain to reduce a difference between power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
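The first two dereverberation steps recited above (forming the Wiener gain as a ratio of squared wavelet coefficients, then scaling the voiced-section coefficient by that gain) can be illustrated with a minimal sketch. It assumes the wavelet coefficients have already been computed and are supplied as plain lists of matching length. The function names `wiener_gain` and `dereverberate`, and the small constant `eps` that guards against division by zero, are hypothetical and do not come from the claim. The third step, adjusting the gain to reduce the power difference, is not shown.

```python
from typing import List, Sequence


def wiener_gain(
    voiced: Sequence[float],
    voiceless: Sequence[float],
    eps: float = 1e-12,  # assumption: guard against 0/0, not part of the claim
) -> List[float]:
    """Per-coefficient gain: voiced^2 / (voiced^2 + voiceless^2)."""
    gains = []
    for w_voiced, w_voiceless in zip(voiced, voiceless):
        p_voiced = w_voiced * w_voiced           # squared coefficient, voiced section
        p_voiceless = w_voiceless * w_voiceless  # squared coefficient, voiceless section
        gains.append(p_voiced / (p_voiced + p_voiceless + eps))
    return gains


def dereverberate(voiced: Sequence[float], gains: Sequence[float]) -> List[float]:
    """Scale each voiced-section coefficient by its gain."""
    return [g * w for g, w in zip(gains, voiced)]


# Illustrative values only, not data from the patent.
voiced_coeffs = [3.0, 1.0, 0.0]
voiceless_coeffs = [4.0, 1.0, 2.0]
gains = wiener_gain(voiced_coeffs, voiceless_coeffs)
estimate = dereverberate(voiced_coeffs, gains)
```

Because the gain always lies between 0 and 1, a coefficient is attenuated more strongly when the voiceless-section coefficient is large relative to the voiced-section one.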
Abstract
A separation unit separates voice signals of a plurality of channels into an incoming component for each incoming direction. A selection unit selects a statistic corresponding to the incoming direction of the separated incoming component from a storage unit, which stores a predetermined statistic and a voice recognition model for each incoming direction. An updating unit updates the voice recognition model on the basis of the selected statistic, and a voice recognition unit recognizes a voice of the separated incoming component using the voice recognition model.
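The selection and updating steps described in the abstract can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: directions are keyed in whole degrees, the statistic "corresponding to" a direction is taken to be the one stored for the nearest direction, and the model update is a simple additive shift of the model means. The abstract does not specify any of these details, and the names `RecognitionModel`, `select_statistic`, and `update_model` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecognitionModel:
    means: List[float]  # hypothetical stand-in for acoustic model parameters


def select_statistic(
    stats_by_direction: Dict[int, List[float]], direction_deg: float
) -> List[float]:
    """Return the statistic stored for the direction nearest to direction_deg."""

    def angular_distance(stored_deg: int) -> float:
        diff = abs(stored_deg - direction_deg) % 360
        return min(diff, 360 - diff)  # wrap around, so 350 degrees is near 0

    nearest = min(stats_by_direction, key=angular_distance)
    return stats_by_direction[nearest]


def update_model(model: RecognitionModel, statistic: List[float]) -> RecognitionModel:
    """Assumed update rule: shift each model mean by the selected statistic."""
    return RecognitionModel([m + s for m, s in zip(model.means, statistic)])


# Illustrative stored statistics, one per incoming direction (degrees).
stored = {0: [0.1, 0.0], 90: [0.5, -0.2], 180: [-0.3, 0.4]}
base_model = RecognitionModel(means=[1.0, 2.0])

selected = select_statistic(stored, direction_deg=80.0)
adapted = update_model(base_model, selected)
```

Returning a new `RecognitionModel` rather than mutating the base model keeps the stored model reusable for components arriving from other directions.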
6 Claims
1. A voice processing device comprising:
a separation unit, implemented via a processor, configured to separate voice signals of a plurality of channels into an incoming component in each incoming direction;
a storage device configured to store a predetermined statistic and a voice recognition model for each incoming direction;
a dereverberation unit, implemented via the processor, configured to generate a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation unit;
a selection unit, implemented via the processor, configured to select a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation unit;
an updating unit, implemented via the processor, configured to update the voice recognition model on the basis of the statistic selected by the selection unit; and
a voice recognition unit, implemented via the processor, configured to recognize a voice of the incoming component using the updated voice recognition model,
wherein the dereverberation unit is configured to:
calculate a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
estimate the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
calculate the Wiener gain to reduce a difference between power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
Dependent claims: 2, 3, 4
5. A voice processing method in a voice processing device comprising:
a separation process, implemented via a processor, of separating voice signals of a plurality of channels into an incoming component in each incoming direction;
a dereverberation process, implemented via the processor, of generating a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation process;
a selection process, implemented via the processor, of selecting a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation process;
a storage process, implemented via a storage device, of storing a predetermined statistic and a voice recognition model for each incoming direction;
an updating process, implemented via the processor, of updating the voice recognition model on the basis of the statistic selected in the selection process; and
a voice recognition process, implemented via the processor, of recognizing a voice of the incoming component using the updated voice recognition model,
wherein the dereverberation process includes:
calculating a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
estimating the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
calculating the Wiener gain to reduce a difference between the power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
6. A non-transitory computer-readable storage medium storing a voice processing program which causes a computer to execute a process, the process comprising:
a separation process of separating voice signals of a plurality of channels into an incoming component in each incoming direction;
a dereverberation process of generating a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation unit;
a selection process of selecting a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation unit;
a storage process of storing a predetermined statistic and a voice recognition model for each incoming direction;
an updating process of updating the voice recognition model on the basis of the statistic selected in the selection process; and
a voice recognition process of recognizing a voice of the incoming component using the updated voice recognition model,
wherein the dereverberation process includes:
calculating a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
estimating the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
calculating the Wiener gain to reduce a difference between power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
Specification