High-accuracy, low-distortion time-frequency analysis of signals using rotated-window spectrograms

US 5,845,241 A
Filed: 09/04/1996
Issued: 12/01/1998
Est. Priority Date: 09/04/1996
Status: Expired due to Term

First Claim

1. A speech recognition apparatus, comprising:

a spectral shaping source for generating a plurality of digital signal samples representative of an input speech signal;

a signal processor coupled to said source, comprising:

means for transforming said plurality of said signal samples to pre-processed signals representative of the frequency domain at various angular orientations;

means for generating initial time-frequency distributions of said pre-processed signals using analysis windows; and

means for rotating said time-frequency distributions back by said various angular orientations for generating a plurality of rotated window spectrograms, and

signal modeling apparatus for comparing the plurality of rotated window spectrograms against each of a plurality of word models and identifying the closest match.

Abstract

A speech processing and analysis apparatus and method for generating a time-frequency distribution of a speech signal combines a set of spectrograms with varying window lengths and orientations to provide a parameter-less time-frequency distribution having good joint time and frequency resolution at all angular orientations. The analysis window of a spectrogram is rotated relative to the frequency components of the signal by preprocessing using a Fractional Fourier Transform to form rotated window spectrograms. In particular, to form the rotated window spectrogram, the signal is initially pre-processed using a Fractional Fourier Transform of angle α, the spectrogram time-frequency distribution of the pre-processed signal is then computed using analysis window h(t) and then rotated by angle -α. The geometric mean of a set of rotated window spectrograms, which are indexed by both the analysis window length and the angular orientation of the window relative to the signal's time-frequency features, is then computed to form a combination of rotated window spectrograms.
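
The construction in the abstract can be written compactly. The block below is a sketch in notation assumed here rather than taken from the patent: x(t) is the signal, F^α is the Fractional Fourier Transform of angle α (shown in one common kernel convention, valid when α is not a multiple of π), S_h is the spectrogram computed with analysis window h, R_{-α} denotes rotation of the time-frequency plane by -α, and A and H are the hypothetical index sets of angles and windows being combined.

```latex
\begin{align}
x_\alpha(u) &= \big(\mathcal{F}^{\alpha}x\big)(u) = \int K_\alpha(t,u)\,x(t)\,dt, \\
K_\alpha(t,u) &= \sqrt{1 - j\cot\alpha}\;
  \exp\!\Big(j\pi\big[(t^2+u^2)\cot\alpha - 2tu\csc\alpha\big]\Big), \\
S_h[x_\alpha](t,f) &= \left|\int x_\alpha(\tau)\,h(\tau - t)\,e^{-j2\pi f\tau}\,d\tau\right|^2, \\
\mathrm{RWS}_{\alpha,h}(t,f) &= \big(\mathcal{R}_{-\alpha}\,S_h[x_\alpha]\big)(t,f), \\
C(t,f) &= \Big(\prod_{\alpha\in A}\prod_{h\in H}\mathrm{RWS}_{\alpha,h}(t,f)\Big)^{1/(|A|\,|H|)}.
\end{align}
```

In this convention α = π/2 reduces the kernel to exp(-j2πtu), the ordinary Fourier transform, and α = 0 corresponds to the identity, so the ordinary spectrogram is the α = 0 member of the set.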

30 Claims

1. A speech recognition apparatus, comprising:
- a spectral shaping source for generating a plurality of digital signal samples representative of an input speech signal;
  
  a signal processor coupled to said source, comprising:
  
  means for transforming said plurality of said signal samples to pre-processed signals representative of the frequency domain at various angular orientations;
  
  means for generating initial time-frequency distributions of said pre-processed signals using analysis windows; and
  
  means for rotating said time-frequency distributions back by said various angular orientations for generating a plurality of rotated window spectrograms, and
  
  signal modeling apparatus for comparing the plurality of rotated window spectrograms against each of a plurality of word models and identifying the closest match.
  - 2. The apparatus claimed in claim 1, wherein said processor means, further comprises:
    - means for combining said plurality of rotated window spectrograms to form a parameter-less combined rotated window spectrogram.
  - 3. The apparatus claimed in claim 2, wherein said means for transforming said plurality of said signal samples to pre-processed signals representative of the frequency domain at various angular orientations, further comprises:
    - Fractional Fourier transform means for generating pre-processed signals representative of the frequency domain at various angular orientations.
  - 4. The apparatus claimed in claim 3, wherein said Fractional Fourier transform means for generating pre-processed signals representative of the frequency domain at various angular orientations, further comprises:
    - means for pre-processing using the Fractional Fourier transform of an angle equal to the angular orientation of the window relative to said signal's time-frequency characteristics.
  - 5. The apparatus claimed in claim 4, wherein said means for generating initial time-frequency distributions of said pre-processed signals using analysis windows, further comprises:
    - spectrogram means for generating time-frequency distributions of said pre-processed signals.
  - 6. The apparatus claimed in claim 5, wherein said means for combining said plurality of rotated window spectrograms to form a parameter-less combined rotated window spectrogram, further comprises:
    - means for computing a geometric mean of said plurality of rotated window spectrograms.
  - 7. The apparatus claimed in claim 6, wherein said means for computing a geometric mean of said plurality of rotated window spectrograms, further comprises:
    - means for indexing the analysis window length and angular orientation of said plurality of rotated window spectrograms relative to time and frequency.
  - 8. The apparatus claimed in claim 7, wherein said Fractional Fourier Transform means rotates the time-frequency components of said signal samples until they match 0 and π/2.
  - 9. The apparatus claimed in claim 8, wherein said source for generating a plurality of digital signal samples representative of said signal, further comprises:
    - convertor means for digitally converting said signal.
  - 10. The apparatus claimed in claim 9, wherein said apparatus is utilized for speech processing.
  - 11. The apparatus claimed in claim 10, further comprising:
    - signal modeling means for generating word models in response to said combined rotated window spectrograms.
  - 12. The apparatus claimed in claim 11, wherein said speech processing includes speech recognition.
  - 13. The apparatus claimed in claim 11, wherein said speech processing includes speaker verification.
  - 14. The apparatus claimed in claim 9, wherein said signal is a diagnostic signal.
  - 15. The apparatus claimed in claim 9, wherein said signal is a sonar signal.
  - 16. The apparatus claimed in claim 9, wherein said signal is a radar signal.
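
The combining step of claims 2, 6 and 7 (a geometric mean over spectrograms indexed by window length and angle) can be illustrated with a minimal sketch. It covers only the angle-0 members of the set, so no Fractional Fourier Transform or rotation is performed; the Hann window, the window lengths, the hop size, the frequency grid, and every function name are assumptions made for illustration and do not come from the patent.

```python
import cmath
import math


def hann(length):
    """Symmetric Hann window (an assumed choice of analysis window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def spectrogram(signal, win_len, hop, n_freq):
    """Squared-magnitude short-time Fourier transform on a shared grid.

    Frames are centred on multiples of `hop` and samples outside the signal
    are treated as zero, so every window length yields the same grid.
    Bin k corresponds to k / (2 * n_freq) cycles per sample.
    """
    window = hann(win_len)
    half = win_len // 2
    frames = []
    for centre in range(0, len(signal), hop):
        row = []
        for k in range(n_freq):
            acc = 0j
            for n in range(win_len):
                idx = centre - half + n
                if 0 <= idx < len(signal):
                    phase = -2j * math.pi * k * (idx - centre) / (2 * n_freq)
                    acc += signal[idx] * window[n] * cmath.exp(phase)
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames


def geometric_mean(spectrograms, floor=1e-12):
    """Pointwise geometric mean; `floor` guards against log(0)."""
    count = len(spectrograms)
    n_time, n_freq = len(spectrograms[0]), len(spectrograms[0][0])
    return [
        [
            math.exp(sum(math.log(s[t][k] + floor) for s in spectrograms) / count)
            for k in range(n_freq)
        ]
        for t in range(n_time)
    ]


# A tone at 0.25 cycles per sample, which sits on bin 16 of the 32-bin grid.
signal = [math.cos(2 * math.pi * 0.25 * n) for n in range(128)]
spectrograms = [spectrogram(signal, win_len, hop=8, n_freq=32) for win_len in (16, 32, 64)]
combined = geometric_mean(spectrograms)
peak_bin = max(range(32), key=lambda k: combined[8][k])
```

Because the geometric mean is a product, a time-frequency cell stays large in the combination only if it is large in every member spectrogram, which is why smearing that appears for only some window lengths is suppressed. A fuller sketch would add one spectrogram per angle after Fractional Fourier pre-processing and rotation back.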

17. A speech recognition method, comprising the steps of:
- capturing a plurality of digital signal samples representative of an input speech signal;
  
  generating a plurality of rotated window spectrograms from said signal samples, comprising the steps of:
  
  transforming said plurality of said signal samples to pre-processed signals representative of the frequency domain at various angular orientations using Fractional Fourier Transform means;
  
  generating initial time-frequency distributions of said pre-processed signals using analysis windows; and
  
  rotating said time-frequency distributions back by said various angular orientations, and
  
  generating word models in response to said plurality of rotated window spectrograms.
  - 18. The method claimed in claim 17, wherein said step of generating a plurality of rotated window spectrograms further comprises the step of:
    - combining said plurality of rotated window spectrograms to form a parameter-less combined rotated window spectrogram.
  - 19. The method claimed in claim 18, wherein said step of transforming comprises:
    - pre-processing using the Fractional Fourier Transform of an angle equal to the angular orientation of the window relative to the time-frequency characteristics of said signal samples.
  - 20. The method claimed in claim 19, wherein said step of generating initial time-frequency distributions of said pre-processed signals using analysis windows, further comprises the step of:
    - generating time-frequency distributions of said pre-processed signals using spectrogram means.
  - 21. The method claimed in claim 20, wherein said step of combining said plurality of rotated window spectrograms to form a parameter-less combined rotated window spectrogram, further comprises the step of:
    - computing a geometric mean of said plurality of rotated window spectrograms.
  - 22. The method claimed in claim 21, wherein said step of computing a geometric mean of said plurality of rotated window spectrograms, further comprises the step of:
    - indexing the analysis window length and angular orientation of said plurality of rotated window spectrograms relative to time and frequency.
  - 23. The method claimed in claim 22, wherein said step of transforming, further comprises the step of:
    - rotating the time-frequency components of said signal samples until they match 0 and π/2 angles.
  - 24. The method claimed in claim 23, wherein said step of capturing a plurality of digital signal samples representative of said input speech signal, further comprises the step of:
    - digitally converting said input speech signal.
  - 25. The method claimed in claim 24, wherein said signal is a diagnostic signal.
  - 26. The method claimed in claim 24, wherein said signal is a sonar signal.
  - 27. The method claimed in claim 24, wherein said signal is a radar signal.
  - 28. The method claimed in claim 19, wherein said step of generating word models comprises generating word models in response to said combined rotated window spectrograms.
  - 29. The method claimed in claim 28, further comprising the step of comparing said word models generated in response to said combined rotated window spectrograms against each of a plurality of word models to identify the closest match.
  - 30. The method claimed in claim 29, further including a step of speaker verification using the word model identified as the closest match.
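
The comparison in claim 29 (matching against each of a plurality of word models and identifying the closest) can be illustrated with a toy nearest-template search. This is a sketch under stated assumptions, not the patent's modeling method: the claims do not say how word models are built or compared, so plain template matching with a log-spectral distance is swapped in here, and the labels, the tiny 2x3 arrays, and all function names are hypothetical.

```python
import math


def log_spectral_distance(a, b, floor=1e-12):
    """Mean squared difference of log-magnitudes between two equal-sized
    time-frequency distributions (an assumed distance measure)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for value_a, value_b in zip(row_a, row_b):
            total += (math.log(value_a + floor) - math.log(value_b + floor)) ** 2
            count += 1
    return total / count


def closest_word(observed, word_models):
    """Return the label of the stored model nearest to the observed distribution."""
    return min(
        word_models,
        key=lambda word: log_spectral_distance(observed, word_models[word]),
    )


# Toy time x frequency arrays standing in for combined rotated window spectrograms.
word_models = {
    "yes": [[1.0, 8.0, 1.0], [1.0, 8.0, 1.0]],
    "no": [[8.0, 1.0, 1.0], [1.0, 1.0, 8.0]],
}
observed = [[1.2, 7.0, 0.9], [1.1, 9.0, 1.0]]
best = closest_word(observed, word_models)
```

A practical recognizer would more likely use statistical word models and time alignment; the sketch only shows the compare-against-each and pick-the-minimum structure that the claim recites.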

Current Assignee: Hughes Electronics Corporation (AT&T, Inc.)
Original Assignee: Hughes Electronics Corporation (AT&T, Inc.)
Inventors: Owechko, Yuri
Primary Examiner(s): Dorvil, Richemond
Application Number: US08/707,540
Time in Patent Office: 818 Days
Field of Search: 704/215, 704/200, 704/205, 704/268, 704/276, 704/251, 704/203, 704/231, 704/229, 704/230
US Class Current: 704/203
CPC Class Codes:
G06F 17/141   Discrete Fourier transforms
G10L 25/18   the extracted parameters be...
G10L 25/48   specially adapted for parti...
