Speech recognition apparatus and method in noisy circumstances

  • US 5,749,068 A
  • Filed: 10/17/1996
  • Issued: 05/05/1998
  • Est. Priority Date: 03/25/1996
  • Status: Expired due to Term
First Claim

1. A speech recognition apparatus for recognizing an input speech under noisy circumstances comprising:

  • a noise model memory for storing a noise model;

    a speech model memory for storing a noise-free speech model;

    a reference model memory for storing a plurality of speech models for collation;

    an acoustic analyzer for receiving the input speech, acoustically analyzing a noise-superimposed speech signal of the input speech, and outputting a time-series feature vector of noise-superimposed speech;

    a superimposed-noise estimating unit for estimating a superimposed noise based on the time-series feature vector of noise-superimposed speech by using the noise model stored in the noise model memory and the noise-free speech model stored in the speech model memory, and outputting an estimated superimposed-noise spectrum;

    a spectrum calculator for receiving the input speech, analyzing a spectrum of the noise-superimposed speech signal of the input speech, and outputting a time-series noise-superimposed speech spectrum;

    a noise spectrum eliminator for eliminating a spectrum component of a noise speech in the noise-superimposed speech signal for the time-series noise-superimposed speech spectrum output from the spectrum calculator by using the estimated superimposed-noise spectrum output from the superimposed-noise estimating unit, and outputting a time-series noise-eliminated speech spectrum;

    a feature vector calculator for calculating a first feature vector from the time-series noise-eliminated speech spectrum and outputting a time-series feature vector of noise-eliminated speech; and

    a collating unit for collating the time-series feature vector of noise-eliminated speech with the plurality of speech models for collation stored in the reference model memory, selecting a speech model out of the plurality of speech models for collation, whose likelihood is highest, and outputting the speech model as a recognition result.
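
The last three elements of the claim (noise spectrum eliminator, feature vector calculator, collating unit) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the patented implementation: the elimination step is modeled as plain spectral subtraction with a floor, the feature vector is a log spectrum, each reference model is a single diagonal Gaussian given as a (mean, variance) pair, and the function names `eliminate_noise`, `to_features`, `log_likelihood`, and `collate` are hypothetical. The superimposed-noise estimating unit is not modeled; the estimated noise spectrum is simply passed in.

```python
import math


def eliminate_noise(noisy_spectra, noise_spectrum, floor=0.01):
    """Subtract the estimated noise spectrum from every frame.

    Assumption: simple spectral subtraction. Each bin is floored at
    `floor` times its noisy value so it never becomes negative.
    """
    return [
        [max(s - n, floor * s) for s, n in zip(frame, noise_spectrum)]
        for frame in noisy_spectra
    ]


def to_features(spectra):
    """Turn each spectrum frame into a log-spectrum feature vector (assumed feature)."""
    return [[math.log(s) for s in frame] for frame in spectra]


def log_likelihood(frames, model):
    """Diagonal-Gaussian log-likelihood of all frames under one (mean, variance) model."""
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def collate(frames, reference_models):
    """Return the name of the reference model with the highest likelihood."""
    return max(
        reference_models,
        key=lambda name: log_likelihood(frames, reference_models[name]),
    )
```

In use, a time series of noisy spectra would pass through `eliminate_noise`, then `to_features`, and the resulting feature vectors would go to `collate` together with a dictionary of reference models keyed by label. A fuller system would replace the single Gaussians with sequence models, which this sketch deliberately leaves out.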
