Discrimination of components of audio signals based on multiscale spectro-temporal modulations
First Claim
1. A method for discriminating sounds in an audio signal comprising the steps of:
forming an auditory spectrogram from the audio signal, said auditory spectrogram characterizing a physiological response to sound represented by the audio signal;
establishing a plurality of modulation-selective filters tuned to a range of frequency and temporal modulations of said auditory spectrogram;
filtering said auditory spectrogram into a plurality of multidimensional, time-varying cortical response signals, each of said cortical response signals indicative of the frequency modulations of said auditory spectrogram over a corresponding predetermined range of scales and of the temporal modulations of said auditory spectrogram over a corresponding predetermined range of rates;
decomposing said cortical response signals into orthogonal multidimensional component signals;
said cortical response signals existing in a cubic representation of rate, scale, and frequency components prior to the step of decomposition;
said orthogonal multidimensional component signals including multiple scales of time and spectral resolution;
truncating said orthogonal multidimensional component signals; and
classifying said truncated component signals to discriminate therefrom a signal corresponding to a predetermined sound.
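The filtering step can be illustrated with a minimal sketch that turns a spectrogram into a rate, scale, and frequency representation over time. It assumes NumPy is available and uses Gaussian band-pass filters in the 2D Fourier (modulation) domain; that filter shape, the function name `cortical_response`, and its parameters are illustrative assumptions, not the filters defined by the patent.

```python
import numpy as np

def cortical_response(spec, frame_rate, chans_per_octave, rates, scales):
    """Filter a (n_time, n_freq) spectrogram with rate/scale-selective filters.

    Returns an array of shape (len(rates), len(scales), n_time, n_freq).
    The Gaussian filter shape is an assumption made for illustration.
    """
    n_time, n_freq = spec.shape
    # Modulation axes of the 2D Fourier transform of the spectrogram.
    w = np.abs(np.fft.fftfreq(n_time, d=1.0 / frame_rate))        # rate, Hz
    k = np.abs(np.fft.fftfreq(n_freq, d=1.0 / chans_per_octave))  # scale, cycles/octave
    spec_ft = np.fft.fft2(spec)
    out = np.empty((len(rates), len(scales), n_time, n_freq))
    for i, rate in enumerate(rates):
        # Band-pass around the centre rate; bandwidth grows with the centre (assumed).
        h_rate = np.exp(-0.5 * ((w - rate) / (0.5 * rate)) ** 2)
        for j, scale in enumerate(scales):
            h_scale = np.exp(-0.5 * ((k - scale) / (0.5 * scale)) ** 2)
            filt = h_rate[:, None] * h_scale[None, :]
            # Magnitude of the filtered spectrogram for this (rate, scale) pair.
            out[i, j] = np.abs(np.fft.ifft2(spec_ft * filt))
    return out
```

Each `out[:, :, t, :]` slice is then a rate by scale by frequency cube at time frame `t`. This sketch does not separate upward-moving from downward-moving modulations, which a fuller model might do.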
Abstract
An audio signal (172) representative of an acoustic signal is provided to an auditory model (105). The auditory model (105) produces a high-dimensional feature set based on physiological responses, as simulated by the auditory model (105), to the acoustic signal. A multidimensional analyzer (106) orthogonalizes and truncates the feature set based on contributions by components of the orthogonal set to a cortical representation of the acoustic signal. The truncated feature set is then provided to a classifier (108), where a predetermined sound is discriminated from the acoustic signal.
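One way to picture the orthogonalize-and-truncate step is a truncated higher-order SVD of the feature tensor. The sketch below assumes NumPy and uses the higher-order SVD as a stand-in multilinear decomposition; the text here does not name the exact decomposition, and the function names `unfold`, `truncated_hosvd`, and `reconstruct` are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors), keeping ranks[m] leading components on mode m."""
    factors = []
    for mode, r in enumerate(ranks):
        # Left singular vectors of each unfolding give orthonormal axes,
        # ordered by decreasing contribution (singular value).
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = tensor
    for mode, u in enumerate(factors):
        # Project each mode onto its retained axes.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Map a (possibly truncated) core back to the original tensor space."""
    out = core
    for mode, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=(1, mode)), 0, mode)
    return out
```

For a rate by scale by frequency tensor, `ranks` would set how many axes are kept per dimension, and the flattened `core` could serve as the reduced feature vector passed to a classifier.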
20 Claims
1. A method for discriminating sounds in an audio signal comprising the steps of:
forming an auditory spectrogram from the audio signal, said auditory spectrogram characterizing a physiological response to sound represented by the audio signal;
establishing a plurality of modulation-selective filters tuned to a range of frequency and temporal modulations of said auditory spectrogram;
filtering said auditory spectrogram into a plurality of multidimensional, time-varying cortical response signals, each of said cortical response signals indicative of the frequency modulations of said auditory spectrogram over a corresponding predetermined range of scales and of the temporal modulations of said auditory spectrogram over a corresponding predetermined range of rates;
decomposing said cortical response signals into orthogonal multidimensional component signals;
said cortical response signals existing in a cubic representation of rate, scale, and frequency components prior to the step of decomposition;
said orthogonal multidimensional component signals including multiple scales of time and spectral resolution;
truncating said orthogonal multidimensional component signals; and
classifying said truncated component signals to discriminate therefrom a signal corresponding to a predetermined sound.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for discriminating sounds in an acoustic signal comprising the steps of:
providing a known audio signal associated with a known sound having a known sound classification;
forming a training auditory spectrogram from said known audio signal;
establishing a plurality of modulation-selective filters tuned to a range of frequency and temporal modulations of said training auditory spectrogram;
filtering said training auditory spectrogram into a plurality of multidimensional, time-varying training cortical response signals, each of said training cortical response signals indicative of the frequency modulations of said training auditory spectrogram over a corresponding predetermined range of scales and of the temporal modulations of said training auditory spectrogram over a corresponding predetermined range of rates;
decomposing said training cortical response signals into orthogonal multidimensional component training signals;
said training cortical response signals existing in a cubic representation of rate, scale, and frequency components prior to the step of decomposition;
said orthogonal multidimensional component training signals including multiple scales of time and spectral resolution;
determining a signal size corresponding to each of said orthogonal multidimensional component training signals, said signal size setting a size of said corresponding orthogonal multidimensional component training signal to retain for classification;
truncating said orthogonal multidimensional component training signals to said signal size;
classifying said truncated orthogonal multidimensional component training signals;
comparing said classification of said truncated orthogonal multidimensional component training signals with a classification of said known sound;
increasing said signal size and repeating the method at said training signal truncating step if said classification of said truncated orthogonal multidimensional component training signals does not match said classification of said known sound to within a predetermined tolerance;
converting the acoustic signal to an audio signal;
forming an auditory spectrogram from said audio signal, said auditory spectrogram characterizing a physiological response to sound represented by the audio signal;
establishing a plurality of modulation-selective filters tuned to a range of frequency and temporal modulations of said auditory spectrogram;
filtering said auditory spectrogram into a plurality of multidimensional, time-varying cortical response signals, each of said cortical response signals indicative of the frequency modulations of said auditory spectrogram over a corresponding predetermined range of scales and the temporal modulations of said auditory spectrogram over a corresponding predetermined range of rates;
decomposing said cortical response signals into orthogonal multidimensional component signals;
said cortical response signals existing in a cubic representation of rate, scale, and frequency components prior to the step of decomposition;
said orthogonal multidimensional component signals including multiple scales of time and spectral resolution;
truncating said orthogonal multidimensional component signals to said signal size; and
classifying said truncated component signals to discriminate therefrom a signal corresponding to a predetermined sound.
Dependent claims: 11, 12, 13, 14
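The training portion of claim 10 (truncate, classify, compare, then grow the signal size until the classification matches within a tolerance) can be sketched in plain Python. This is a simplified illustration under stated assumptions: a single signal size is shared by all components, the mismatch measure is the error rate on the training set, and `select_signal_size` and the toy `nearest_centroid` classifier are hypothetical names, not part of the patent.

```python
def nearest_centroid(vectors, labels):
    """Toy stand-in classifier: fit class means, then label each vector by its nearest mean."""
    sums, counts = {}, {}
    for v, y in zip(vectors, labels):
        acc = sums.setdefault(y, [0.0] * len(v))
        for i, a in enumerate(v):
            acc[i] += a
        counts[y] = counts.get(y, 0) + 1
    means = {y: [a / counts[y] for a in acc] for y, acc in sums.items()}

    def dist(v, m):
        return sum((a - b) ** 2 for a, b in zip(v, m))

    return [min(means, key=lambda y: dist(v, means[y])) for v in vectors]

def select_signal_size(components, known_labels, classify, start=1, tolerance=0.05):
    """Grow the retained size until the training error rate is within `tolerance`.

    components: one vector per training sound (assumed already decomposed).
    """
    max_size = len(components[0])
    for size in range(start, max_size + 1):
        truncated = [c[:size] for c in components]           # truncating step
        predicted = classify(truncated, known_labels)        # classifying step
        errors = sum(p != y for p, y in zip(predicted, known_labels))  # comparing step
        if errors / len(known_labels) <= tolerance:
            return size
    return max_size  # no size met the tolerance; keep everything

# Usage: the first coordinate is uninformative, the second separates the classes.
components = [[0.0, 1.0], [1.0, 1.2], [0.0, -1.0], [1.0, -1.1]]
labels = ["speech", "speech", "noise", "noise"]
size = select_signal_size(components, labels, nearest_centroid)
```

The size found on training data would then be reused to truncate the components of the unknown signal before classification, as the later steps of the claim describe.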
15. A system to discriminate sounds in an acoustic signal comprising:
an early auditory model execution unit operable to produce at an output thereof an auditory spectrogram of an audio signal provided as an input thereto, said audio signal being a representation of said acoustic signal;
a cortical model execution unit coupled to said output of said auditory model execution unit so as to receive said auditory spectrogram and to produce therefrom at an output thereof a time-varying signal representative of a cortical response to the acoustic signal;
said cortical response signal existing in a cubic representation of rate, scale, and frequency components;
a multi-linear analyzer coupled to said output of said cortical model execution unit and operable to determine a set of multidimensional orthogonal axes from said cortical representations, said multi-linear analyzer further operable to produce a reduced data set relative to said set of multidimensional orthogonal axes; and
a classifier for determining speech from said reduced data set.
Dependent claims: 16, 17, 18, 19, 20
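The coupling of the four units in claim 15 can be pictured as a simple chain in which each unit consumes the previous unit's output. The sketch below is only a structural illustration: the class name `SoundDiscriminator`, its field names, and the stub callables in the usage lines are hypothetical placeholders rather than real models.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SoundDiscriminator:
    """Chains the four units; each field is a placeholder callable (assumed interface)."""
    early_auditory_model: Callable[[Any], Any]   # audio signal -> auditory spectrogram
    cortical_model: Callable[[Any], Any]         # spectrogram -> cortical response
    multilinear_analyzer: Callable[[Any], Any]   # cortical response -> reduced data set
    classifier: Callable[[Any], str]             # reduced data set -> label

    def __call__(self, audio: Any) -> str:
        spectrogram = self.early_auditory_model(audio)
        cortical = self.cortical_model(spectrogram)
        reduced = self.multilinear_analyzer(cortical)
        return self.classifier(reduced)

# Usage with trivial stubs standing in for the real units.
system = SoundDiscriminator(
    early_auditory_model=lambda audio: [abs(s) for s in audio],
    cortical_model=lambda spec: [spec, spec[::-1]],
    multilinear_analyzer=lambda cortical: cortical[0][:2],
    classifier=lambda reduced: "speech" if sum(reduced) > 1.0 else "non-speech",
)
label = system([0.9, -0.8, 0.1])
```

Keeping each unit behind a plain callable makes it easy to swap in a different auditory model or classifier without touching the rest of the chain.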
Specification