Voice processing device, voice processing method, and voice processing program

US 10,283,115 B2
Filed: 06/15/2017
Issued: 05/07/2019
Est. Priority Date: 08/25/2016
Status: Active Grant

Abstract

A separation unit separates voice signals of a plurality of channels into an incoming component for each incoming direction. A selection unit selects, from a storage unit that stores a predetermined statistic and a voice recognition model for each incoming direction, the statistic corresponding to the incoming direction of the separated incoming component. An updating unit updates the voice recognition model on the basis of the selected statistic, and a voice recognition unit recognizes a voice of the separated incoming component using the updated voice recognition model.

6 Claims

1. A voice processing device comprising:
- a separation unit, implemented via a processor, configured to separate voice signals of a plurality of channels into an incoming component in each incoming direction;
  
  a storage device configured to store a predetermined statistic and a voice recognition model for each incoming direction;
  
  a dereverberation unit, implemented via the processor, configured to generate a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation unit;
  
  a selection unit, implemented via the processor, configured to select a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation unit;
  
  an updating unit, implemented via the processor, configured to update the voice recognition model on the basis of the statistic selected by the selection unit; and
  
  a voice recognition unit, implemented via the processor, configured to recognize a voice of the incoming component using the updated voice recognition model, wherein the dereverberation unit is configured to:
  
  calculate a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
  
  estimate the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
  
  calculate the Wiener gain to reduce a difference between power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.

2. The voice processing device according to claim 1, wherein the statistic is the same type of parameter as at least some parameters of the voice recognition model, and the voice processing device further comprises a generation unit, implemented via the processor, configured to store in the storage device a statistic calculated so that likelihood for the incoming component increases.

3. The voice processing device according to claim 2, wherein the voice recognition model is a model which represents an output probability of an acoustic feature amount as a linear combination of a plurality of Gaussian functions, the statistic is a mixture weight, a mean, and variance of a Gaussian function, and the updating unit updates a mean and variance of a Gaussian function of the voice recognition model to increase likelihood for the incoming component.

4. The voice processing device according to claim 1, wherein the separation unit separates a direct sound component from a sound source from a reflected sound component as the incoming component, and the voice recognition unit recognizes a voice of the direct sound component.
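
As an illustration of claims 2 to 4, the following is a minimal Python sketch of selecting a stored statistic by incoming direction and using it to update a Gaussian mixture model. It is not the patented implementation: the dictionary layout, the function names (`gmm_log_likelihood`, `select_statistic`, `update_model`), the nearest-direction lookup, and the plain overwrite of means and variances are all assumptions made for illustration, since the claims do not say how the statistic is matched or applied.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Mean per-frame log-likelihood of frames x (T, D) under a
    diagonal-covariance GMM with K components (weights: (K,),
    means and variances: (K, D))."""
    x = np.asarray(x, dtype=float)[:, None, :]                         # (T, 1, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=2)      # (T, K)
    log_comp = np.log(weights) + log_norm + log_exp                    # (T, K)
    # Numerically stable log-sum-exp over the K components.
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(np.mean(frame_ll))

def select_statistic(stats_by_direction, direction_deg):
    """Return the stored statistic whose direction (degrees) is closest
    to direction_deg on the circle. Nearest-direction lookup is an assumption."""
    def circular_distance(d):
        diff = abs(d - direction_deg) % 360.0
        return min(diff, 360.0 - diff)
    return stats_by_direction[min(stats_by_direction, key=circular_distance)]

def update_model(model, stat):
    """Return a new model whose means and variances come from the selected
    statistic; mixture weights are kept (hypothetical update rule)."""
    return {
        "weights": model["weights"].copy(),
        "means": stat["means"].copy(),
        "variances": stat["variances"].copy(),
    }

# Synthetic demonstration data (not from the patent).
rng = np.random.default_rng(0)
model = {"weights": np.array([0.5, 0.5]),
         "means": np.zeros((2, 2)),
         "variances": np.ones((2, 2))}
stats_by_direction = {
    0.0: {"weights": np.array([0.5, 0.5]),
          "means": np.full((2, 2), 0.5), "variances": np.ones((2, 2))},
    90.0: {"weights": np.array([0.5, 0.5]),
           "means": np.full((2, 2), 5.0), "variances": np.ones((2, 2))},
}
frames = rng.normal(5.0, 1.0, size=(200, 2))   # features arriving from about 90 degrees
updated = update_model(model, select_statistic(stats_by_direction, 80.0))
ll_before = gmm_log_likelihood(frames, **model)
ll_after = gmm_log_likelihood(frames, **updated)
```

In this toy setup the updated model scores the frames higher than the original one, which is the behavior the claims describe as increasing the likelihood for the incoming component.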

5. A voice processing method in a voice processing device comprising:
- a separation process, implemented via a processor, of separating voice signals of a plurality of channels into an incoming component in each incoming direction;
  
  a dereverberation process, implemented via the processor, of generating a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation process;
  
  a selection process, implemented via the processor, of selecting a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation process;
  
  a storage process, implemented via a storage device, of storing a predetermined statistic and a voice recognition model for each incoming direction;
  
  an updating process, implemented via the processor, of updating the voice recognition model on the basis of the statistic selected in the selection process; and
  
  a voice recognition process, implemented via the processor, of recognizing a voice of the incoming component using the updated voice recognition model, wherein the dereverberation process includes:
  
  calculating a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
  
  estimating the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
  
  calculating the Wiener gain to reduce a difference between the power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
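
The three dereverberation steps recited in claims 1 and 5 can be written compactly as follows. The notation is assumed for illustration and does not appear in the claim text: W_v(k) and W_u(k) are the wavelet coefficients of the incoming component in the voiced and voiceless sections, x_v and x_u are the corresponding time-domain signals, \hat{e} is the estimated dereverberation component, and P(.) denotes signal power. The third line reads the claim's "reduce a difference" as a minimization, which is one possible interpretation and is stronger than what the claim requires.

```latex
\begin{align}
  \kappa(k)  &= \frac{W_v(k)^2}{W_v(k)^2 + W_u(k)^2}
             && \text{(Wiener gain)} \\
  \hat{E}(k) &= \kappa(k)\, W_v(k)
             && \text{(estimated dereverberated coefficient)} \\
  \kappa     &\leftarrow \arg\min_{\kappa}
                \left|\, P(\hat{e}) - P(x_v - x_u) \,\right|
             && \text{(gain adjustment, assumed reading)}
\end{align}
```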

6. A non-transitory computer-readable storage medium storing a voice processing program which causes a computer to execute a process, the process comprising:
- a separation process of separating voice signals of a plurality of channels into an incoming component in each incoming direction;
  
  a dereverberation process of generating a dereverberation component where a reverberation component is suppressed based on Wiener Filtering from the incoming component separated by the separation unit;
  
  a selection process of selecting a statistic corresponding to an incoming direction of the dereverberation component generated by the dereverberation unit;
  
  a storage process of storing a predetermined statistic and a voice recognition model for each incoming direction;
  
  an updating process of updating the voice recognition model on the basis of the statistic selected in the selection process; and
  
  a voice recognition process of recognizing a voice of the incoming component using the updated voice recognition model, wherein the dereverberation process includes:
  
  calculating a ratio of a squared value of a wavelet coefficient of the incoming component in a voiced section to a sum of the squared value of the wavelet coefficient of the incoming component in the voiced section and a squared value of a wavelet coefficient of the incoming component in a voiceless section, as a Wiener gain;
  
  estimating the dereverberation component on the basis of a wavelet coefficient obtained by multiplying the wavelet coefficient of the incoming component in the voiced section by the Wiener gain; and
  
  calculating the Wiener gain to reduce a difference between power of the estimated dereverberation component and power of the incoming component obtained by removing the incoming component in the voiceless section from the incoming component in the voiced section.
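
The wavelet-domain Wiener filtering recited in the independent claims can be sketched numerically as below. This is a hedged illustration, not the patented method: the single-level Haar transform, the scalar `delta` that weights the voiceless term, the grid search over `delta`, and all function names are assumptions introduced here. The claims only require that the gain be calculated so as to reduce the power difference; they do not name a wavelet or a search procedure.

```python
import numpy as np

def haar_dwt(x):
    """Single-level orthonormal Haar transform of an even-length signal."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return np.concatenate([approx, detail])

def haar_idwt(c):
    """Inverse of haar_dwt."""
    half = len(c) // 2
    approx, detail = c[:half], c[half:]
    x = np.empty(2 * half)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def wiener_gain(w_voiced, w_voiceless, delta=1.0, eps=1e-12):
    """Ratio of squared voiced coefficients to the sum of squared voiced and
    (delta-weighted) squared voiceless coefficients. delta is hypothetical."""
    p_voiced = w_voiced ** 2
    p_voiceless = delta * w_voiceless ** 2
    return p_voiced / (p_voiced + p_voiceless + eps)

def dereverberate(x_voiced, x_voiceless, deltas=None):
    """Pick the delta whose filtered output power is closest to the power of
    (x_voiced - x_voiceless); return the estimate and the chosen delta."""
    if deltas is None:
        deltas = np.linspace(0.1, 3.0, 30)
    x_voiced = np.asarray(x_voiced, dtype=float)
    x_voiceless = np.asarray(x_voiceless, dtype=float)
    w_v, w_u = haar_dwt(x_voiced), haar_dwt(x_voiceless)
    target_power = np.mean((x_voiced - x_voiceless) ** 2)
    best = None
    for delta in deltas:
        estimate = haar_idwt(wiener_gain(w_v, w_u, delta) * w_v)
        error = abs(np.mean(estimate ** 2) - target_power)
        if best is None or error < best[0]:
            best = (error, float(delta), estimate)
    return best[2], best[1]

# Synthetic signals standing in for the voiced and voiceless sections.
rng = np.random.default_rng(0)
x_voiced = rng.standard_normal(64)
x_voiceless = 0.3 * rng.standard_normal(64)
estimate, chosen_delta = dereverberate(x_voiced, x_voiceless)
```

Because the Haar transform used here is orthonormal and the gain lies between 0 and 1, the filtered output never has more power than the voiced input, so the filter can only attenuate.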

Current Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Original Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Inventors: Gomez, Randy; Nakadai, Kazuhiro
Primary Examiner(s): Desir, Pierre Louis
Assistant Examiner(s): Kim, Jonathan C
Application Number: US15/623,807
Publication Number: US 20180061398A1
Time in Patent Office: 691 Days
Field of Search: None
CPC Class Codes:
G01S 3/8006   Multi-channel systems speci...
G10L 15/063   Training
G10L 15/20   Speech recognition techniqu...
G10L 15/22   Procedures used during a sp...
G10L 2015/0635   updating or merging of old ...
G10L 2021/02082   the noise being echo, rever...
G10L 21/0232   Processing in the frequency...
G10L 21/0272   Voice signal separating
G10L 21/028   using properties of sound s...
