Preprocessing system for speech recognition

  • US 5,054,085 A
  • Filed: 11/19/1990
  • Issued: 10/01/1991
  • Est. Priority Date: 05/18/1983
  • Status: Expired due to Fees
First Claim

1. A system for preprocessing the speech of a speaker to provide a normalized signal for subsequent processing, said system comprising:

  • means for generating speaker specific gain settings, speaker specific spectral settings, speaker specific pitch settings and speaker specific peak normalization settings for the speech of a particular speaker, said settings being generated during an enrollment for said particular speaker wherein words spoken during said enrollment may be a different set relative to words spoken by said speaker after said enrollment;

    means coupled to said generating means for generating said normalized signal using said settings, which normalized signal represents the speech of the speaker which is to be processed;

    wherein said means for generating said speaker specific settings comprises;

    a) gain enrollment means for generating said speaker specific settings of the gain for controlling an overall signal level;

    b) spectral and pitch enrollment means for generating said speaker specific spectral settings and said speaker specific pitch settings;

    c) peak normalization enrollment means for generating said speaker specific peak normalization settings;

    wherein said normalized signal includes a set of parameters, said set of parameters including spectral parameters, temporal parameters, pitch parameters, said normalized signal further including a nasal energy signal, an oral energy signal and a pitch epoch timing signal, and wherein said normalized signal generating means includes data acquisition means for generating from the speech of the speaker said oral energy signal, said nasal energy signal and an oral amplitude signal, wherein said oral amplitude signal is input to;

    (i) spectral analyzer means for generating said spectral parameters;

    (ii) temporal analyzer means for generating said temporal parameters; and

    (iii) pitch analyzer means for generating said pitch parameters and said pitch epoch timing signal and wherein said data acquisition means comprises;

    (a) an oral microphone for converting sound emanating from the speaker's mouth into a first electrical signal;

    (b) a nasal microphone for converting sound emanating from the speaker's nose into a second electrical signal;

    (c) first gain control means coupled to said oral microphone for producing a digitally controlled gain of said first electrical signal;

    (d) second gain control means coupled to said nasal microphone for producing a digitally controlled gain of said second electrical signal;

    (e) first band limiting means coupled to said first gain control means for producing a voiced band oral amplitude signal from said gain controlled first electrical signal;

    (f) second band limiting means coupled to said second gain control means for producing a voiced band nasal amplitude signal from said gain controlled second electrical signal;

    (g) energy computation means coupled to said first and second band limiting means for performing a wide band RMS to DC conversion on the output from each of said first and second band limiting means;

    (h) first filter means coupled to said first band limiting means for producing a low pass Nyquist filtered output from said voiced band oral amplitude signal;

    (i) second filter means coupled to said energy computation means for producing a low pass Nyquist filtered output from each of said DC converted outputs from said energy computation means;

    (j) analog to digital converter means coupled to said first and second filter means for generating a digitalized oral amplitude signal from the output of said first filter means, and a digitalized oral energy signal and a digitalized nasal energy signal from the outputs of said second filter means; and

    wherein said means for generating said speaker specific settings comprises;

    a) gain enrollment means for generating said speaker specific settings of the gain for controlling an overall signal level;

    b) spectral and pitch enrollment means for generating said speaker specific spectral settings and said speaker specific pitch settings;

    (c) peak normalization enrollment means for generating said speaker specific peak normalization settings.
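
For illustration only, the data acquisition chain recited in elements (a) through (j) can be approximated in software. The sketch below is not taken from the patent, which claims hardware means. The function names, the one-pole filter standing in for the band limiting and Nyquist filter stages, the sliding-window RMS standing in for the RMS to DC converter, and the 12-bit converter width are all assumptions chosen to keep the example short.

```python
import math


def apply_gain(signal, gain):
    """Digitally controlled gain stage (elements c and d)."""
    return [gain * x for x in signal]


def one_pole_lowpass(signal, alpha):
    """One-pole low-pass filter; an assumed stand-in for the
    band limiting (e, f) and Nyquist filter (h, i) stages."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def rms_to_dc(signal, window):
    """Sliding-window RMS, a digital analogue of the RMS to DC
    conversion performed by the energy computation means (g)."""
    squares = [x * x for x in signal]
    out, acc = [], 0.0
    for i, sq in enumerate(squares):
        acc += sq
        if i >= window:
            acc -= squares[i - window]  # drop the sample leaving the window
        n = min(i + 1, window)
        out.append(math.sqrt(max(acc, 0.0) / n))
    return out


def quantize(signal, bits=12, full_scale=1.0):
    """Analog to digital conversion (j): clip, then round to signed codes.
    The 12-bit default is an assumption, not a figure from the claim."""
    top = 2 ** (bits - 1) - 1
    return [round(max(-full_scale, min(full_scale, x)) / full_scale * top)
            for x in signal]


def acquire(oral, nasal, oral_gain, nasal_gain, alpha=0.5, window=64):
    """Produce the three digitized signals named in element (j)."""
    oral_band = one_pole_lowpass(apply_gain(oral, oral_gain), alpha)
    nasal_band = one_pole_lowpass(apply_gain(nasal, nasal_gain), alpha)
    oral_energy = one_pole_lowpass(rms_to_dc(oral_band, window), alpha)
    nasal_energy = one_pole_lowpass(rms_to_dc(nasal_band, window), alpha)
    oral_amplitude = one_pole_lowpass(oral_band, alpha)
    return {
        "oral_amplitude": quantize(oral_amplitude),
        "oral_energy": quantize(oral_energy),
        "nasal_energy": quantize(nasal_energy),
    }
```

In this sketch the speaker specific gain settings produced during enrollment would be supplied as `oral_gain` and `nasal_gain`; the spectral, temporal and pitch analyzers of elements (i) through (iii) would then consume the `oral_amplitude` output.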
