Method and system of estimating clean speech parameters from noisy speech parameters

US 10,319,377 B2
Filed: 02/28/2017
Issued: 06/11/2019
Est. Priority Date: 03/15/2016
Status: Active Grant

Abstract

A method and system are provided for estimating clean speech parameters from noisy speech parameters. The method comprises acquiring speech signals, estimating noise from the acquired speech signals, computing speech features from the acquired speech signals, estimating model parameters from the computed speech features, and estimating clean parameters from the estimated noise and the estimated model parameters.

13 Claims

1. A method of estimating clean speech parameters from noisy speech parameters, said method comprising processor implemented steps of:
- acquiring speech signals using a speech acquisition module (202);
- estimating noise from said acquired speech signals using a noise estimation module (204), wherein estimation of noise is performed through non-speech frames of the acquired speech signals;
- computing noisy speech features from said estimated noise using a feature extraction module (206), wherein Mel-Frequency Cepstral Coefficients are used as said noisy speech features;
- estimating noisy model parameters from said computed noisy speech features using a parameter estimation module (208); and
- estimating clean parameters from said estimated noise and said estimated noisy model parameters using a clean parameter estimation module (210), wherein the estimation is performed using Reverse Psychoacoustic Compensation (RPC), wherein the RPC is performed iteratively by:
  - initializing a clean model mean to a certain value of a model mean;
  - performing Psychoacoustic Compensation on the initialized clean model mean to obtain a compensated clean model mean;
  - determining if value of the compensated clean model mean is within a certain range of a model mean value;
  - terminating the iteration in response to determining that the value of the compensated clean model mean is within a certain range of the model mean value; and
  - updating clean parameters of the clean model mean in response to determining that the value of the compensated clean model mean is outside a certain range of the model mean value.
- Dependent claims:
  - 2. The method as claimed in claim 1, wherein said speech acquisition module (202) further converts said acquired speech signals from analog to digital waveforms.
  - 3. The method as claimed in claim 1, wherein said estimation of noise using the noise estimation module is performed during training phase.
  - 4. The method as claimed in claim 1, wherein said estimated noise and said estimated noisy model parameters are first converted into their spectral domain representations.
  - 5. The method as claimed in claim 1, wherein the estimated clean parameters are in their spectral domain representation.
  - 6. The method as claimed in claim 5, wherein the estimated clean parameters are converted from their spectral domain representation to feature domain representation.
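
The iterative loop recited in claim 1 can be illustrated with a minimal sketch. The claims do not define the Psychoacoustic Compensation mapping itself, so the function `forward_compensation` below is a stand-in (the log-add combination of clean and noise log-spectral means), and the update rule, the tolerance `tol`, the iteration cap `max_iter`, and all function names are assumptions made for illustration only. The sketch shows only the control flow: initialize, compensate, compare against the noisy model mean, then terminate or update.

```python
import math


def forward_compensation(clean_mean, noise_mean):
    """Stand-in for Psychoacoustic Compensation (assumption, not the patented mapping).

    Combines clean and noise log-spectral means with the log-add function.
    """
    return [math.log(math.exp(c) + math.exp(n))
            for c, n in zip(clean_mean, noise_mean)]


def reverse_compensation(noisy_mean, noise_mean, tol=1e-6, max_iter=200):
    """Search for a clean mean whose compensated version matches noisy_mean."""
    # Initialize the clean model mean to the noisy model mean.
    clean = list(noisy_mean)
    for _ in range(max_iter):
        compensated = forward_compensation(clean, noise_mean)
        residual = [y - f for y, f in zip(noisy_mean, compensated)]
        # Terminate when the compensated mean is within range of the model mean.
        if max(abs(r) for r in residual) <= tol:
            break
        # Otherwise update the clean mean (hypothetical additive update rule).
        clean = [c + r for c, r in zip(clean, residual)]
    return clean


# Illustrative log-spectral means; the values are made up for the example.
noisy = [2.0, 3.0, 1.5]
noise = [0.0, 1.0, 0.5]
clean = reverse_compensation(noisy, noise)
```

With this stand-in mapping, the additive update converges when each noisy mean lies above the corresponding noise mean; a different compensation function would likely need a different update rule and stopping range.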

7. A system of estimating clean speech parameters from noisy speech parameters, said system comprising:
- a processor;
- a data bus coupled to said processor; and
- a computer-usable medium embodying computer code, said computer-usable medium being coupled to said data bus, said computer program code comprising instructions executable by said processor and configured for:
  - acquiring speech signals;
  - estimating noise from said acquired speech signals, wherein estimation of noise can be performed through non-speech frames of the acquired speech signals;
  - computing noisy speech features from said estimated noise, wherein Mel-Frequency Cepstral Coefficients are used as said noisy speech features;
  - estimating noisy model parameters from said computed noisy speech features;
  - estimating clean parameters from said estimated noise and said estimated noisy model parameters, wherein the estimation is performed using reverse psychoacoustic compensation (RPC), wherein the RPC is performed iteratively by:
    - initializing a clean model mean to a certain value of a model mean;
    - performing Psychoacoustic Compensation on the initialized clean model mean to obtain a compensated clean model mean;
    - determining if value of the compensated clean model mean is within a certain range of a model mean value;
    - terminating the iteration in response to determining that the value of the compensated clean model mean is within a certain range of the model mean value; and
    - updating clean parameters of the clean model mean in response to determining that the value of the compensated clean model mean is outside a certain range of the model mean value.
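
The claims state that noise may be estimated from non-speech frames but do not say how those frames are identified. The sketch below assumes a simple energy threshold as the speech/non-speech decision and uses a scalar average energy as a stand-in for the noise estimate. The function names, the frame length, the hop size, and `threshold_ratio` are all hypothetical choices for illustration.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def estimate_noise_energy(samples, frame_len=4, hop=4, threshold_ratio=0.1):
    """Average the energy of frames classified as non-speech.

    A frame counts as non-speech when its energy is below threshold_ratio
    times the largest frame energy (an assumed, hypothetical rule).
    """
    energies = [frame_energy(f) for f in frame_signal(samples, frame_len, hop)]
    if not energies:
        return None  # signal shorter than one frame
    cutoff = threshold_ratio * max(energies)
    non_speech = [e for e in energies if e < cutoff]
    if not non_speech:
        return None  # no frame fell below the cutoff
    return sum(non_speech) / len(non_speech)


# Toy signal: quiet frame, loud frame, quiet frame (made-up values).
signal = [0.01, -0.01, 0.01, -0.01,
          0.5, -0.6, 0.7, -0.5,
          0.02, -0.02, 0.02, -0.02]
noise_energy = estimate_noise_energy(signal)
```

In this toy signal only the first and third frames fall below the cutoff, so the estimate is the mean of their two energies.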

8. One or more non-transitory machine readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors causes:
- acquiring of speech signals using a speech acquisition module (202);
- estimating of noise from said acquired speech signals using a noise estimation module (204), wherein estimation of noise can be performed through non-speech frames of the acquired speech signals;
- computing of noisy speech features from said acquired speech signals using a feature extraction module (206), wherein Mel-Frequency Cepstral Coefficients are used as said noisy speech features;
- estimating of noisy model parameters from said computed noisy speech features using a parameter estimation module (208);
- estimating of clean parameters from said estimated noise and said estimated noisy model parameters using a clean parameter estimation module (210), wherein the estimation is performed using reverse psychoacoustic compensation (RPC), wherein the RPC is performed iteratively by:
  - initializing a clean model mean to a certain value of a model mean;
  - performing Psychoacoustic Compensation on the initialized clean model mean to obtain a compensated clean model mean;
  - determining if value of the compensated clean model mean is within a certain range of a model mean value;
  - terminating the iteration in response to determining that the value of the compensated clean model mean is within a certain range of the model mean value; and
  - updating clean parameters of the clean model mean in response to determining that the value of the compensated clean model mean is outside a certain range of the model mean value.
- Dependent claims:
  - 9. The one or more non-transitory machine readable information storage mediums of claim 8, wherein said speech acquisition module (202) further converts said acquired speech signals from analog to digital waveforms.
  - 10. The one or more non-transitory machine readable information storage mediums of claim 8, wherein said estimation of noise using the noise estimation module is performed during training phase.
  - 11. The one or more non-transitory machine readable information storage mediums of claim 8, wherein said estimated noise and said estimated noisy model parameters are first converted into their spectral domain representations.
  - 12. The one or more non-transitory machine readable information storage mediums of claim 8, wherein the estimated clean parameters are in their spectral domain representation.
  - 13. The one or more non-transitory machine readable information storage mediums of claim 12, wherein the estimated clean parameters are converted from their spectral domain representation to feature domain representation.
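
The dependent claims refer to converting parameters between a feature domain and a spectral domain without naming the transform. Mel-Frequency Cepstral Coefficients are conventionally obtained by applying a discrete cosine transform to log mel filterbank energies, so one plausible reading is a DCT and its inverse. The sketch below assumes an orthonormal DCT-II with no cepstral truncation, which makes the round trip exact; the function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows


def to_feature_domain(log_spectrum):
    """Log-spectral vector -> cepstral (feature-domain) vector."""
    n = len(log_spectrum)
    c = dct_matrix(n)
    return [sum(c[k][j] * log_spectrum[j] for j in range(n)) for k in range(n)]


def to_spectral_domain(cepstrum):
    """Cepstral vector -> log-spectral vector, using the transpose as inverse."""
    n = len(cepstrum)
    c = dct_matrix(n)
    return [sum(c[k][j] * cepstrum[k] for k in range(n)) for j in range(n)]


# Round trip on made-up log-spectral values.
log_spec = [1.0, 2.0, 0.5, -1.0]
cepstrum = to_feature_domain(log_spec)
recovered = to_spectral_domain(cepstrum)
```

In practice the cepstrum is usually truncated to fewer coefficients than filterbank channels, in which case the conversion back to the spectral domain is only approximate.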

Current Assignee: TATA Consultancy Services Limited (Tata Sons Pvt Ltd.)
Original Assignee: TATA Consultancy Services Limited (Tata Sons Pvt Ltd.)
Inventors: Panda, Ashish; Kopparapu, Sunil Kumar
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Villena, Mark
Application Number: US15/444,759
Publication Number: US 20170270952A1
Time in Patent Office: 833 Days

CPC Class Codes
- G10L 15/02   Feature extraction for spee...
- G10L 15/20   Speech recognition techniqu...
- G10L 17/02   Preprocessing operations, e...
- G10L 17/20   Pattern transformations or ...
- G10L 21/0208   Noise filtering
