Method and apparatus for speech recognition

US 4,383,135 A
Filed: 01/23/1980
Issued: 05/10/1983
Est. Priority Date: 01/23/1980
Status: Expired due to Term

Abstract

Recognition of human speech is carried out by storing a template for each unit of speech to produce a dictionary of stored words and phrases. A given speech signal is converted to produce a template which is compared to the stored template to find the closest comparison. The word or phrase corresponding to the identified template is produced and displayed to complete the recognition of the speech signal. The speech signal is processed to produce two separate frequency components. The frequency components are processed to produce a DC signal proportional to the frequency of the frequency component. The frequency components are also rectified to produce amplitude signals corresponding to the envelope of the frequency components. The products [F₁][F₂] and [A₁][A₂] and ratios F₁/F₂ and A₁/A₂ of the pairs of frequency and amplitude signals are produced to generate a plurality of relational signals which comprise the templates corresponding to each speech signal. In building the dictionary each speech sample is submitted a number of times to produce an average template value together with the variants for each data point.

26 Claims

1. A method for producing a template representing a speech signal, comprising the steps of:
- extracting a plurality of frequency signal components (F_n) and amplitude signal components (A_n) from the speech signal,
- producing a frequency product signal which is proportional to the product (F_n)(F_{n+1}) of said frequency signal,
- producing a frequency ratio signal which is proportional to the ratio (F_n / F_{n+1}) of said frequency signals,
- producing an amplitude product signal which is proportional to the product (A_n)(A_{n+1}) of said amplitude signals,
- producing an amplitude ratio signal which is proportional to the ratio (A_n / A_{n+1}) of said amplitude signals, and
- storing the frequency product signal, the frequency ratio signal, the amplitude product signal, and the amplitude ratio signal as the template representing the speech signal.
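The four relational signals recited in claim 1 can be illustrated numerically. The following is a minimal sketch only, assuming the frequency and amplitude signals have already been extracted and are available as plain lists of numbers; the function name `relational_signals` and the dictionary keys are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Sequence


def relational_signals(freqs: Sequence[float],
                       amps: Sequence[float]) -> List[Dict[str, float]]:
    """Form product and ratio signals for each adjacent pair (n, n+1).

    Hypothetical helper: the claim does not prescribe a data layout.
    A zero in a denominator position raises ZeroDivisionError here;
    how the patented apparatus treats that case is not stated.
    """
    if len(freqs) != len(amps):
        raise ValueError("expected one amplitude signal per frequency signal")
    template = []
    for n in range(len(freqs) - 1):
        template.append({
            "freq_product": freqs[n] * freqs[n + 1],   # (F_n)(F_{n+1})
            "freq_ratio": freqs[n] / freqs[n + 1],     # F_n / F_{n+1}
            "amp_product": amps[n] * amps[n + 1],      # (A_n)(A_{n+1})
            "amp_ratio": amps[n] / amps[n + 1],        # A_n / A_{n+1}
        })
    return template


# Two components, as in the abstract: a high band and a low band.
template = relational_signals([2000.0, 500.0], [0.5, 0.25])
```

With two components the template holds a single entry; with more components it holds one entry per adjacent pair.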

2. A method of producing a template representing a speech signal, comprising the steps of:
- extracting a plurality of frequency and amplitude signal components from the speech signal,
- producing a first relational signal which is the product of two of the frequency signal components,
- producing a second relational signal which is the product of two of the amplitude signal components,
- producing a third relational signal which is the ratio of two of the frequency signal components,
- producing a fourth relational signal which is the ratio of two of the amplitude signal components, and
- storing the relational signals as a template representing the speech signal.
- Dependent claims (5)
  - 5. The method recited in claim 2 including the step of digitizing said relational signals for storage.

3. Apparatus for producing a template representing a speech signal comprising:
- means for filtering said speech signal to produce a plurality of output signals, each output signal corresponding to a different spectral region of the speech signal,
- means for detecting the zero crossings for each of the output signals from said means for filtering and generating a pulse train with a pulse occurring at each of the zero crossings,
- means for converting the pulse trains into frequency signals proportional to the frequency of one of the output signals,
- means for rectifying each of the output signals from said means for filtering and generating an amplitude signal proportional to the output signal,
- mathematical operational means connected to receive the frequency signals and the amplitude signals for producing one or more relational signals each of which is proportional to a plurality of either the frequency signals or the amplitude signals, and
- means for storing said relational signals as a template representing the speech signal.
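The zero-crossing and rectification elements of claim 3 describe analog circuitry, but the same idea can be sketched in software on a sampled signal. The example below is an illustrative assumption, not the patented circuit: it counts sign changes to estimate frequency (two crossings per cycle) and averages the rectified samples as a crude amplitude measure. The function names and the 8 kHz sample rate are hypothetical.

```python
import math
from typing import Sequence


def zero_crossing_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Estimate frequency in Hz from the number of sign changes.

    Each full cycle of a band-limited signal produces two zero crossings,
    so frequency is crossings / (2 * duration).
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2.0 * duration)


def rectified_amplitude(samples: Sequence[float]) -> float:
    """Full-wave rectify the samples and average them."""
    return sum(abs(s) for s in samples) / len(samples)


# Assumed test input: a 100 Hz unit sine sampled at 8 kHz for one second.
# The small phase offset keeps samples away from exact zeros.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE + 0.1)
        for t in range(SAMPLE_RATE)]

estimated_hz = zero_crossing_frequency(tone, SAMPLE_RATE)
estimated_amp = rectified_amplitude(tone)
```

For a pure sine of peak amplitude 1, the rectified average approaches 2/π, so the amplitude signal is proportional to, rather than equal to, the peak value.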

4. Apparatus for producing a template representing a speech signal, comprising:
- means for producing a plurality of frequency signals (F_n) and a plurality of amplitude signals (A_n) derived from the speech signal,
- multiplication means connected to receive said frequency signals and said amplitude signals for producing a frequency product signal (F_n)(F_{n+1}) which is the product of said frequency signals and for producing an amplitude product signal (A_n)(A_{n+1}) which is the product of said amplitude signals,
- division means connected to receive frequency signals and the amplitude signals while producing a frequency ratio which is the ratio (F_n / F_{n+1}) of one of the frequency signals to another of the frequency signals and for producing an amplitude ratio signal which is the ratio (A_n / A_{n+1}) of one of the amplitude signals to another of the amplitude signals, and
- means for storing said product signals and said ratio signals as the template representing the speech signal.

6. A method for producing a template which represents a speech signal, comprising the steps of:
- extracting at least two frequency component signals from said speech signal,
- producing a frequency signal (F_n) from each of said frequency components, said frequency signals being proportional to the frequency of the respective frequency component,
- producing an amplitude signal (A_n) from each of said frequency components, said amplitude signals being proportional to the envelopes of the respective frequency component,
- producing a frequency product signal which is proportional to the product (F_n)(F_{n+1}) of said frequency signals,
- producing a frequency ratio signal which is proportional to the ratio (F_n / F_{n+1}) of said frequency signals,
- producing an amplitude product signal which is proportional to the product (A_n)(A_{n+1}) of said amplitude signals,
- producing an amplitude ratio signal which is proportional to the ratio (A_n / A_{n+1}) of said amplitude signals, and
- storing said product and ratio signals to serve as a template which represents said speech signal.
- Dependent claims (7, 8, 9, 10)
  - 7. The method recited in claim 6 wherein the step of extracting comprises passing said speech signal through a plurality of band pass filters each passing a different spectral segment of said speech signal and one of said frequency component signals is produced at the output of each of said filters.
  - 8. The method recited in claim 6 wherein the step of producing a frequency signal comprises detecting the zero crossovers of said frequency component signals, producing a pulse for each detected crossover and filtering said pulses to produce said frequency signals.
  - 9. The method recited in claim 6 wherein the step of producing an amplitude signal comprises envelope detecting each of said frequency component signals to produce said amplitude signals.
  - 10. The method recited in claim 6 further including the step of digitizing said product and ratio signals prior to storing said product and ratio signals.

11. A method for producing a template which represents a speech signal, comprising the steps of:
- extracting one or more spectral component signals from said speech signal,
- quantizing said spectral component signals into sequential sample signal pulses, and
- producing a relational signal corresponding to each spectral component signal from the signal pulses wherein the relational signal comprises a sequence of data points, each data point comprising the product of sequential sample signal pulses and the ratio of corresponding sequential signal pulses of said quantized spectral component signal in serial order to the first sequential sample signal pulse thereof, said relational signal comprising said template.
- Dependent claims (12, 13, 14)
  - 12. The method recited in claim 11 wherein the step of extracting comprises filtering said speech signal to produce said spectral component signals.
  - 13. The method recited in claim 11 wherein the step of quantizing said spectral component signals comprises digitizing said spectral component signals to produce digital words representing the sequential signal pulses.
  - 14. The method recited in claim 13 wherein the step of producing a relational signal comprises sequentially dividing each of said digital words making up a spectral component signal by the first digital word of a spectral component signal, said relational signal comprising the resulting sequence of digitized words.
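Claim 14 recites dividing each digital word of a spectral component by that component's first digital word. A minimal sketch of that one step follows, assuming the digitized words are held in a list; the function name `normalize_to_first` is hypothetical, and the zero check is an added assumption because the claim does not say how a zero first word is handled.

```python
from typing import List, Sequence


def normalize_to_first(words: Sequence[float]) -> List[float]:
    """Divide every digital word by the first word of the sequence."""
    if not words:
        raise ValueError("no digital words supplied")
    first = words[0]
    if first == 0:
        # Assumption: reject the sequence instead of guessing a fallback.
        raise ValueError("first digital word is zero")
    return [w / first for w in words]


quiet = normalize_to_first([4, 8, 2, 6])
loud = normalize_to_first([8, 16, 4, 12])  # same shape at twice the level
```

One consequence of this normalization, visible in the two calls above, is that a uniform change in signal level leaves the resulting sequence unchanged.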

15. A method for producing a generalized template representing a speech pattern signal comprising the steps of:
- (a) extracting a plurality of signal components from said speech pattern signal,
- (b) generating one or more relational signals for said speech pattern signal, said relational signals generated by producing a frequency signal (F_n) and an amplitude signal (A_n) proportional to each signal component and producing frequency and amplitude product and ratio signals proportional to said frequency and amplitude signals,
- (c) repeating said speech pattern signal,
- (d) repeating steps (a) and (b) for each repeat of the speech pattern signal,
- (e) averaging the relational signals of each of said speech pattern signals with the corresponding relational signals of the other speech pattern signals to produce an average relational signal,
- (f) measuring the deviation of said relational signal of step (b) from said average relational signal to produce a deviation signal for each of said relational signals of step (b), and
- (g) storing each of said average relational signals and the corresponding deviation signals.
- Dependent claims (16, 17)
  - 16. The method recited in claim 15 wherein the step of extracting a plurality of signal components comprises:
    - applying said speech pattern signals to a plurality of filters, each filter passing a selected spectrum component of said speech pattern signal,
    - producing said frequency signal (F_n) from each signal generated at the outputs of said filters, said frequency signal proportional to the frequency of the output signal from the respective filter, and
    - producing said amplitude signal (A_n) for each of said output signals, each said amplitude signal proportional to the envelope of the output signal from the respective filter.
  - 17. The method as recited in claim 16 wherein the step of generating one or more relational signals comprises:
    - producing said frequency product signal which is proportional to the product (F_n)(F_{n+1}) of said frequency signals,
    - producing said frequency ratio signal which is proportional to the ratio (F_n / F_{n+1}) of said frequency signals,
    - producing said amplitude product signal which is proportional to the product (A_n)(A_{n+1}) of said amplitude signals, and
    - producing said amplitude ratio signal which is proportional to the ratio (A_n / A_{n+1}) of said amplitude signals.
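Steps (e) through (g) of claim 15 amount to averaging the relational signals over several repeats and recording how far each repeat departs from that average. The sketch below is one plausible reading, assuming each repeat has already been reduced to a list of relational values of equal length and that "deviation" means the signed difference from the average; the claim does not fix a specific deviation measure, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def build_generalized_template(
    repeats: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Return (average, deviations) for repeated utterances.

    average[i]       mean of data point i over all repeats   (step e)
    deviations[r][i] repeat r's value minus average[i]       (step f)
    """
    if not repeats:
        raise ValueError("at least one repeat is required")
    width = len(repeats[0])
    if any(len(r) != width for r in repeats):
        raise ValueError("all repeats must have the same number of data points")
    average = [sum(r[i] for r in repeats) / len(repeats) for i in range(width)]
    deviations = [[r[i] - average[i] for i in range(width)] for r in repeats]
    return average, deviations


# Two repeats of the same word, each with two relational data points.
average, deviations = build_generalized_template([[1.0, 4.0], [3.0, 8.0]])
```

Both return values would then be stored together, corresponding to step (g).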

18. A method for producing a template which represents a speech signal, comprising the steps of:
- extracting at least two frequency component signals from said speech signal,
- applying a first of said frequency component signals to the input terminal of a gated counter,
- applying a second of said frequency component signals to the gate terminal of said counter to cause the first frequency component signal to be counted when the gate of the counter is activated by said second frequency component signal, the first frequency component signal count thus produced being proportional to the ratio of the frequency of first frequency component signal to the frequency of the second frequency component signal,
- counting the second frequency component signal to produce a second frequency signal proportional to the frequency of the second frequency component signal,
- producing a first frequency signal proportional to the frequency of said first frequency component signal by multiplying the count of said first frequency component signal by said second frequency signal,
- producing an amplitude signal from each of said frequency component signals, said amplitude signals proportional to the envelopes of the respective frequency component signal, and
- producing one or more relational signals, each of which is proportional to a plurality of signals selected from said frequency signals and said amplitude signals.
- Dependent claims (19)
  - 19. The method recited in claim 18 wherein the step of producing one or more relational signals comprises:
    - producing a frequency product signal which is proportional to the product of said frequency signals,
    - producing a frequency ratio signal which is proportional to the ratio of said frequency signals,
    - producing an amplitude product signal which is proportional to the product of said amplitude signals, and
    - producing an amplitude ratio signal which is proportional to the ratio of said amplitude signals.
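The gated-counter arithmetic in claim 18 can be checked with a small simulation. This is a sketch under stated assumptions, not a model of the actual hardware: each frequency component is reduced to one pulse per cycle, pulse times are integer microseconds, the gate stays open for exactly one period of the second signal, and the second signal is counted over a one-second clock window. All function names are hypothetical.

```python
from typing import List, Sequence


def pulse_train(period_us: int, duration_us: int) -> List[int]:
    """Pulse timestamps (microseconds), one pulse per cycle."""
    return list(range(0, duration_us, period_us))


def gated_count(pulses: Sequence[int], gate_open_us: int, gate_close_us: int) -> int:
    """Count input pulses that arrive while the gate is open."""
    return sum(1 for t in pulses if gate_open_us <= t < gate_close_us)


def clocked_frequency(pulses: Sequence[int], window_us: int) -> float:
    """Count pulses inside a fixed clock window and scale to Hz."""
    count = sum(1 for t in pulses if t < window_us)
    return count * 1_000_000 / window_us


ONE_SECOND_US = 1_000_000
first = pulse_train(500, ONE_SECOND_US)    # assumed 2000 Hz component
second = pulse_train(2000, ONE_SECOND_US)  # assumed 500 Hz component

# Gate opened by one pulse of the second signal and closed by the next.
ratio_count = gated_count(first, second[0], second[1])
second_hz = clocked_frequency(second, ONE_SECOND_US)
first_hz = ratio_count * second_hz  # ratio times second frequency
```

The count taken during one gate period gives the frequency ratio directly, so only the second frequency needs a timed measurement; the first is recovered by multiplication, as the claim describes.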

20. A method of identifying a given speech signal by comparison with a dictionary of a plurality of templates of stored speech signals wherein each of the speech signals is converted to a template, comprising the steps of:
- extracting a plurality of signal components from each of the given speech signals,
- producing one or more relational signals, each relational signal comprising a frequency product signal, a frequency ratio signal, an amplitude product signal and an amplitude ratio signal for each of the given speech signals, each relational signal proportional to a plurality of said signal components derived from the corresponding speech signal, said relational signals comprising the templates for the given speech signals,
- storing the relational signal templates produced for the given speech signals included in said dictionary,
- comparing the relational signal template of a given speech signal with each template representing said dictionary of the plurality of stored speech signals to find the closest comparison, and
- identifying the given speech signal with the closest comparison stored speech signal.
- Dependent claims (21, 22)
  - 21. The method recited in claim 20 including the steps of:
    - storing words represented by each of the stored speech signals included in said dictionary, and
    - displaying the word represented by the identified given speech signal from said dictionary, the displayed word corresponding to the given speech signal.
  - 22. The method recited in claim 20 wherein the steps of extracting a plurality of signal components and producing one or more relational signals are repeated for each of a plurality of utterances for each given speech signal in said dictionary, the method further including averaging the relational signals for each utterance of each given speech signal to produce said templates, recording the deviations of the relational signals for each utterance from the averaged relational signals, and utilizing said deviations to determine the closest comparison of the template for the given speech signal to the templates in said dictionary.
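The comparison step of claims 20 and 22 can be sketched as a nearest-template search. The claims do not specify a distance measure, so the one used here is an assumption chosen for illustration: the absolute difference at each data point, divided by that point's stored deviation so that naturally variable points count for less. The dictionary layout and the function name `closest_entry` are likewise hypothetical.

```python
from typing import Dict, List, Sequence

Entry = Dict[str, List[float]]  # {"mean": [...], "deviation": [...]}


def closest_entry(candidate: Sequence[float], dictionary: Dict[str, Entry]) -> str:
    """Return the dictionary word whose template is nearest the candidate."""

    def distance(entry: Entry) -> float:
        total = 0.0
        for value, mean, dev in zip(candidate, entry["mean"], entry["deviation"]):
            # Assumption: a zero deviation falls back to unweighted distance.
            scale = dev if dev > 0 else 1.0
            total += abs(value - mean) / scale
        return total

    return min(dictionary, key=lambda word: distance(dictionary[word]))


# Made-up templates with two relational data points each.
dictionary = {
    "yes": {"mean": [4.0, 2.0], "deviation": [0.5, 0.5]},
    "no": {"mean": [1.0, 8.0], "deviation": [0.5, 0.5]},
}
recognized = closest_entry([3.8, 2.1], dictionary)
```

The returned key plays the role of the stored word that claim 21 displays once the closest template has been identified.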

23. Apparatus for producing a template to represent an audio signal comprising:
- filter means connected to receive said audio signal for passing the high frequency component thereof,
- filter means connected to receive said audio signal for passing the low frequency component thereof,
- means connected to receive said frequency components for producing a frequency signal for each frequency component, each said frequency signal proportional to the predominant frequency of the corresponding frequency component,
- means connected to receive said high frequency component for producing an amplitude signal corresponding to the envelope of said high frequency component,
- means connected to receive said low frequency component for producing an amplitude signal corresponding to the envelope of said low frequency signal,
- multiplication means connected to receive said frequency signals and said amplitude signals for producing a frequency product signal which is proportional to the product of said frequency signals and for producing an amplitude product signal which is proportional to the product of said amplitude signals,
- division means connected to receive said frequency signals and amplitude signals for producing a frequency ratio signal which is proportional to the ratio of one of said frequency signals to another of said frequency signals and for producing an amplitude ratio signal which is proportional to the ratio of one of said amplitude signals to another of said amplitude signals, and
- means for receiving and storing said product and ratio signals as a template to represent said audio signal.
- Dependent claims (24, 25)
  - 24. The apparatus recited in claim 23 wherein said means connected to receive said frequency components for producing a frequency signal for each frequency component comprises:
    - means for detecting the zero crossings of each of said frequency components,
    - means for generating a pulse for each of said detected zero crossings to produce a pulse train for each of said frequency components, and
    - means for integrating each of said pulse trains to produce said frequency signals.
  - 25. Apparatus as recited in claim 23 wherein said means connected to receive said frequency components for producing a frequency signal for each frequency component comprises:
    - a first squaring circuit connected to receive said high frequency component and produce a pulsed signal corresponding in frequency to said high frequency component,
    - a second squaring circuit connected to receive said low frequency component and produce a pulsed signal corresponding in frequency to said low frequency component,
    - a gated counter connected to receive and count said pulsed signal from said first squaring circuit, the gate of said counter connected to be operated by said pulsed signal from said first squaring circuit, said gated counter producing a digitized signal proportional to the ratio of said frequency signals,
    - a clocked counter connected to receive and count said pulsed signal from said second squaring circuit, said clocked counter producing a digitized signal proportional to the frequency of said low frequency signal, and
    - means for multiplying said digitized ratio signal by said digitized signal proportional to frequency to produce a digitized signal proportional to said high frequency component.

26. A method for producing a template to represent a speech signal, comprising the steps of:
- filtering said speech signal by means of an array of filters each passing a different spectral region of said speech signal,
- producing a frequency signal F_n for each signal produced at the output of said filters, each said frequency signal being proportional to the frequency of the output signal from the respective filter,
- producing an amplitude signal A_n for each of said signals produced at the output of said filters, each said amplitude signal being proportional to the envelope of the output signal from the respective filter,
- producing one or more relational signals each of which is proportional to a plurality of said frequency signals,
- producing one or more relational signals each of which is proportional to a plurality of said amplitude signals, and
- storing said relational signals to serve as a template representing the speech signal.

Current Assignee: Scott Instruments Company, Denton, TX
Original Assignee: Scott Instruments Company, Denton, TX
Inventors: Scott, Brian L.; Hardesty, Lee H.
Primary Examiner(s): Kemeny, Emanuel
Application Number: US 06/114,724
Time in Patent Office: 1,203 Days
Field of Search: 179/1 SD, 179/1 SA, 179/1 SC, 179/1 SB, 340/146.3 WD, 340/148, 364/482, 364/724
US Class Current: 704/236
CPC Class Codes: G10L 15/00 Speech recognition G10L17/0...
