Method and apparatus for speech recognition
Abstract
Recognition of human speech is carried out by storing a template for each unit of speech to produce a dictionary of stored words and phrases. A given speech signal is converted to produce a template which is compared to the stored template to find the closest comparison. The word or phrase corresponding to the identified template is produced and displayed to complete the recognition of the speech signal. The speech signal is processed to produce two separate frequency components. The frequency components are processed to produce a DC signal proportional to the frequency of the frequency component. The frequency components are also rectified to produce amplitude signals corresponding to the envelope of the frequency components. The products [F1][F2] and [A1][A2] and the ratios F1/F2 and A1/A2 of the pairs of frequency and amplitude signals are produced to generate a plurality of relational signals which comprise the templates corresponding to each speech signal. In building the dictionary each speech sample is submitted a number of times to produce an average template value together with the variants for each data point.
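The product and ratio signals described in the abstract can be illustrated with a minimal Python sketch. It assumes the two frequency signals (F1, F2) and the two amplitude signals (A1, A2) are already available as equal-length lists of samples. The names `RelationalPoint` and `build_template` are hypothetical, and the zero-denominator guard is an assumption, since the abstract does not say how silent frames are handled.

```python
from typing import List, NamedTuple, Sequence


class RelationalPoint(NamedTuple):
    """One data point of a template (hypothetical container)."""
    freq_product: float
    freq_ratio: float
    amp_product: float
    amp_ratio: float


def build_template(
    f1: Sequence[float],
    f2: Sequence[float],
    a1: Sequence[float],
    a2: Sequence[float],
) -> List[RelationalPoint]:
    """Combine paired frequency and amplitude tracks into product/ratio points."""
    if not (len(f1) == len(f2) == len(a1) == len(a2)):
        raise ValueError("all four tracks must have the same length")
    template = []
    for f_a, f_b, a_a, a_b in zip(f1, f2, a1, a2):
        # Assumption: a zero denominator yields a ratio of 0.0.
        freq_ratio = f_a / f_b if f_b else 0.0
        amp_ratio = a_a / a_b if a_b else 0.0
        template.append(RelationalPoint(f_a * f_b, freq_ratio, a_a * a_b, amp_ratio))
    return template


if __name__ == "__main__":
    for point in build_template([200.0, 250.0], [2000.0, 1800.0], [0.5, 0.4], [0.25, 0.2]):
        print(point)
```

Each template point carries four values, so a comparison stage can weigh frequency and amplitude relations separately if desired.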
26 Claims
1. A method for producing a template representing a speech signal, comprising the steps of:
    extracting a plurality of frequency signal components (Fn) and amplitude signal components (An) from the speech signal,
    producing a frequency product signal which is proportional to the product (Fn)(Fn+1) of said frequency signal,
    producing a frequency ratio signal which is proportional to the ratio (Fn/Fn+1) of said frequency signals,
    producing an amplitude product signal which is proportional to the product (An)(An+1) of said amplitude signals,
    producing an amplitude ratio signal which is proportional to the ratio (An/An+1) of said amplitude signals, and
    storing the frequency product signal, the frequency ratio signal, the amplitude product signal, and the amplitude ratio signal as the template representing the speech signal.

2. A method of producing a template representing a speech signal, comprising the steps of:
    extracting a plurality of frequency and amplitude signal components from the speech signal,
    producing a first relational signal which is the product of two of the frequency signal components,
    producing a second relational signal which is the product of two of the amplitude signal components,
    producing a third relational signal which is the ratio of two of the frequency signal components,
    producing a fourth relational signal which is the ratio of two of the amplitude signal components, and
    storing the relational signals as a template representing the speech signal.

3. Apparatus for producing a template representing a speech signal comprising:
    means for filtering said speech signal to produce a plurality of output signals, each output signal corresponding to a different spectral region of the speech signal,
    means for detecting the zero crossings for each of the output signals from said means for filtering and generating a pulse train with a pulse occurring at each of the zero crossings,
    means for converting the pulse trains into frequency signals proportional to the frequency of one of the output signals,
    means for rectifying each of the output signals from said means for filtering and generating an amplitude signal proportional to the output signal,
    mathematical operational means connected to receive the frequency signals and the amplitude signals for producing one or more relational signals each of which is proportional to a plurality of either the frequency signals or the amplitude signals, and
    means for storing said relational signals as a template representing the speech signal.

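The zero-crossing and rectification stages of claim 3 can be pictured in software with the following Python sketch. It is an illustration under stated assumptions, not the claimed apparatus: the input is taken to be one already-filtered band given as a list of samples, the function names are hypothetical, and averaging the full-wave rectified samples is one assumed way of obtaining an amplitude signal.

```python
import math
from typing import Sequence


def zero_crossing_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Estimate frequency in Hz from the number of sign changes (two per cycle)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        # Each sign change corresponds to one pulse of the pulse train.
        if (prev < 0.0) != (cur < 0.0):
            crossings += 1
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2.0 * duration)


def rectified_amplitude(samples: Sequence[float]) -> float:
    """Full-wave rectify and average (assumed smoothing) to get an amplitude value."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(abs(s) for s in samples) / len(samples)


if __name__ == "__main__":
    rate = 8000.0
    # Synthetic 100 Hz test tone with a small phase offset (illustrative input).
    tone = [math.sin(2.0 * math.pi * 100.0 * n / rate + 0.1) for n in range(8000)]
    print(zero_crossing_frequency(tone, rate))
    print(rectified_amplitude(tone))
```

For a pure tone the averaged rectified value approaches 2/π times the peak amplitude, so it tracks the envelope up to a constant factor.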
4. Apparatus for producing a template representing a speech signal, comprising:
    means for producing a plurality of frequency signals (Fn) and a plurality of amplitude signals (An) derived from the speech signal,
    multiplication means connected to receive said frequency signals and said amplitude signals for producing a frequency product signal (Fn)(Fn+1) which is the product of said frequency signals and for producing an amplitude product signal (An)(An+1) which is the product of said amplitude signals,
    division means connected to receive frequency signals and the amplitude signals while producing a frequency ratio which is the ratio (Fn/Fn+1) of one of the frequency signals to another of the frequency signals and for producing an amplitude ratio signal which is the ratio (An/An+1) of one of the amplitude signals to another of the amplitude signals, and
    means for storing said product signals and said ratio signals as the template representing the speech signal.

6. A method for producing a template which represents a speech signal, comprising the steps of:
    extracting at least two frequency component signals from said speech signal,
    producing a frequency signal (Fn) from each of said frequency components, said frequency signals being proportional to the frequency of the respective frequency component,
    producing an amplitude signal (An) from each of said frequency components, said amplitude signals being proportional to the envelopes of the respective frequency component,
    producing a frequency product signal which is proportional to the product (Fn)(Fn+1) of said frequency signals,
    producing a frequency ratio signal which is proportional to the ratio (Fn/Fn+1) of said frequency signals,
    producing an amplitude product signal which is proportional to the product (An)(An+1) of said amplitude signals,
    producing an amplitude ratio signal which is proportional to the ratio (An/An+1) of said amplitude signals, and
    storing said product and ratio signals to serve as a template which represents said speech signal.

11. A method for producing a template which represents a speech signal, comprising the steps of:
    extracting one or more spectral component signals from said speech signal,
    quantizing said spectral component signals into sequential sample signal pulses, and
    producing a relational signal corresponding to each spectral component signal from the signal pulses wherein the relational signal comprises a sequence of data points, each data point comprising the product of sequential sample signal pulses and the ratio of corresponding sequential signal pulses of said quantized spectral component signal in serial order to the first sequential sample signal pulse thereof, said relational signal comprising said template.

15. A method for producing a generalized template representing a speech pattern signal comprising the steps of:
    (a) extracting a plurality of signal components from said speech pattern signal,
    (b) generating one or more relational signals for said speech pattern signal, said relational signals generated by producing a frequency signal (Fn) and an amplitude signal (An) proportional to each signal component and producing frequency and amplitude product and ratio signals proportional to said frequency and amplitude signals,
    (c) repeating said speech pattern signal,
    (d) repeating steps (a) and (b) for each repeat of the speech pattern signal,
    (e) averaging the relational signals of each of said speech pattern signals with the corresponding relational signals of the other speech pattern signals to produce an average relational signal,
    (f) measuring the deviation of said relational signal of step (b) from said average relational signal to produce a deviation signal for each of said relational signals of step (b), and
    (g) storing each of said average relational signals and the corresponding deviation signals.

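Steps (e) through (g) of claim 15 can be sketched in Python as follows. The sketch assumes each repetition of the speech pattern has already been reduced to one relational signal of the same length. The function name `generalize` is hypothetical, and the mean absolute deviation per data point is an assumed measure, since the claim does not specify how the deviation is computed.

```python
from typing import List, Sequence, Tuple


def generalize(repeats: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return (average, deviation) per data point across repeated utterances."""
    if not repeats:
        raise ValueError("need at least one repetition")
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("all repetitions must have the same number of data points")
    count = len(repeats)
    # Step (e): average corresponding data points across the repetitions.
    average = [sum(r[i] for r in repeats) / count for i in range(length)]
    # Step (f): assumed measure, the mean absolute deviation from that average.
    deviation = [
        sum(abs(r[i] - average[i]) for r in repeats) / count for i in range(length)
    ]
    # Step (g) would store both lists as the generalized template.
    return average, deviation


if __name__ == "__main__":
    avg, dev = generalize([[1.0, 4.0], [3.0, 4.0]])
    print(avg, dev)
```

Storing the deviation next to the average lets a later comparison tolerate larger differences at data points that vary more between repetitions.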
18. A method for producing a template which represents a speech signal, comprising the steps of:
    extracting at least two frequency component signals from said speech signal,
    applying a first of said frequency component signals to the input terminal of a gated counter,
    applying a second of said frequency component signals to the gate terminal of said counter to cause the first frequency component signal to be counted when the gate of the counter is activated by said second frequency component signal, the first frequency component signal count thus produced being proportional to the ratio of the frequency of first frequency component signal to the frequency of the second frequency component signal,
    counting the second frequency component signal to produce a second frequency signal proportional to the frequency of the second frequency component signal,
    producing a first frequency signal proportional to the frequency of said first frequency component signal by multiplying the count of said first frequency component signal by said second frequency signal,
    producing an amplitude signal from each of said frequency component signals, said amplitude signals proportional to the envelopes of the respective frequency component signal, and
    producing one or more relational signals, each of which is proportional to a plurality of signals selected from said frequency signals and said amplitude signals.

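The gated-counter arithmetic of claim 18 can be followed with a small idealized Python simulation. It assumes integer frequencies in Hz, perfectly periodic pulses, and a gate held open for a whole number of periods of the second signal. The parameter `gate_cycles` and all function names are hypothetical and do not come from the claim.

```python
def gated_count(f1_hz: int, f2_hz: int, gate_cycles: int = 1) -> int:
    """Count pulses of the first signal while the second signal holds the gate open."""
    if f1_hz <= 0 or f2_hz <= 0 or gate_cycles <= 0:
        raise ValueError("frequencies and gate_cycles must be positive")
    count = 0
    k = 1
    # Pulse k of the first signal arrives at time k / f1_hz and the gate closes
    # at gate_cycles / f2_hz; cross-multiplying keeps the comparison in integers.
    while k * f2_hz <= gate_cycles * f1_hz:
        count += 1
        k += 1
    return count


def count_second_signal(f2_hz: int, window_s: int = 1) -> int:
    """Count cycles of the second signal over a fixed window (assumed 1 second)."""
    return f2_hz * window_s


def recover_first_frequency(f1_hz: int, f2_hz: int, gate_cycles: int = 1) -> float:
    """Multiply the ratio count by the second frequency signal, as the claim recites."""
    ratio_count = gated_count(f1_hz, f2_hz, gate_cycles)
    return ratio_count * count_second_signal(f2_hz) / gate_cycles


if __name__ == "__main__":
    print(gated_count(2400, 300))              # ratio count for one gate period
    print(recover_first_frequency(2400, 300))  # first frequency recovered from the count
```

Because the count is an integer, the recovered frequency is quantized; a longer gate (a larger `gate_cycles`) trades measurement time for resolution in this model.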
20. A method of identifying a given speech signal by comparison with a dictionary of a plurality of templates of stored speech signals wherein each of the speech signals is converted to a template, comprising the steps of:
    extracting a plurality of signal components from each of the given speech signals,
    producing one or more relational signals, each relational signal comprising a frequency product signal, a frequency ratio signal, an amplitude product signal and an amplitude ratio signal for each of the given speech signals, each relational signal proportional to a plurality of said signal components derived from the corresponding speech signal, said relational signals comprising the templates for the given speech signals,
    storing the relational signal templates produced for the given speech signals included in said dictionary,
    comparing the relational signal template of a given speech signal with each template representing said dictionary of the plurality of stored speech signals to find the closest comparison, and
    identifying the given speech signal with the closest comparison stored speech signal.

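The comparing and identifying steps of claim 20 can be sketched as a nearest-template lookup in Python. The claim does not name a distance measure, so the sum of absolute differences used here is an assumption. Templates are taken to be flat lists of equal length, and the dictionary words and function names are hypothetical.

```python
from typing import Dict, Sequence


def template_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance measure: sum of absolute differences per data point."""
    if len(a) != len(b):
        raise ValueError("templates must have the same number of data points")
    return sum(abs(x - y) for x, y in zip(a, b))


def identify(template: Sequence[float], dictionary: Dict[str, Sequence[float]]) -> str:
    """Return the dictionary entry whose stored template is closest to the input."""
    if not dictionary:
        raise ValueError("dictionary is empty")
    return min(dictionary, key=lambda word: template_distance(template, dictionary[word]))


if __name__ == "__main__":
    # Illustrative stored templates; the values are made up for the example.
    stored = {"yes": [1.0, 2.0, 3.0], "no": [4.0, 0.5, 1.0]}
    print(identify([1.1, 2.2, 2.9], stored))
```

A linear scan is enough for a small dictionary; the same interface would allow a weighted distance that uses stored deviation values, should those be available.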
23. Apparatus for producing a template to represent an audio signal comprising:
    filter means connected to receive said audio signal for passing the high frequency component thereof,
    filter means connected to receive said audio signal for passing the low frequency component thereof,
    means connected to receive said frequency components for producing a frequency signal for each frequency component, each said frequency signal proportional to the predominant frequency of the corresponding frequency component,
    means connected to receive said high frequency component for producing an amplitude signal corresponding to the envelope of said high frequency component,
    means connected to receive said low frequency component for producing an amplitude signal corresponding to the envelope of said low frequency signal,
    multiplication means connected to receive said frequency signals and said amplitude signals for producing a frequency product signal which is proportional to the product of said frequency signals and for producing an amplitude product signal which is proportional to the product of said amplitude signals,
    division means connected to receive said frequency signals and amplitude signals for producing a frequency ratio signal which is proportional to the ratio of one of said frequency signals to another of said frequency signals and for producing an amplitude ratio signal which is proportional to the ratio of one of said amplitude signals to another of said amplitude signals, and
    means for receiving and storing said product and ratio signals as a template to represent said audio signal.

26. A method for producing a template to represent a speech signal, comprising the steps of:
    filtering said speech signal by means of an array of filters each passing a different spectral region of said speech signal,
    producing a frequency signal Fn for each signal produced at the output of said filters, each said frequency signal being proportional to the frequency of the output signal from the respective filter,
    producing an amplitude signal An for each of said signals produced at the output of said filters, each said amplitude signal being proportional to the envelope of the output signal from the respective filter,
    producing one or more relational signals each of which is proportional to a plurality of said frequency signals,
    producing one or more relational signals each of which is proportional to a plurality of said amplitude signals, and
    storing said relational signals to serve as a template representing the speech signal.

Specification