Continuous speech recognition method

US 4,227,176 A
Filed: 04/27/1978
Issued: 10/07/1980
Est. Priority Date: 04/27/1978
Status: Expired due to Term



Abstract

A speech recognition method for detecting and recognizing one or more keywords in a continuous audio signal is disclosed. Each keyword is represented by a keyword template representing a plurality of target patterns, and each target pattern comprises statistics of each of a plurality of spectra selected from plural short-term spectra generated according to a predetermined system for processing of the incoming audio. The spectra are processed to enhance the separation between the spectral pattern classes during later analysis. The processed audio spectra are grouped into multi-frame spectral patterns and are compared by means of likelihood statistics with the target patterns of the keyword templates. A concatenation technique employing a loosely set detection threshold makes it very unlikely that a correct pattern will be rejected.
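The matching scheme summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the independent Gaussian model, the function names (`multi_frame_pattern`, `log_likelihood`, `detect_keyword`), the three-frame pattern with a spacing of three frames, and the search window (`min_gap`, `max_gap`) are all hypothetical choices made for the example. Each target pattern is represented here by per-element means and variances, and each per-pattern decision uses a permissive threshold, relying on the requirement that all target patterns match in order to prune false alarms.

```python
import math
from typing import List, Optional, Sequence, Tuple

Frame = Sequence[float]                    # one short-term spectrum, one value per band
Target = Tuple[List[float], List[float]]   # (means, variances) of one target pattern


def multi_frame_pattern(frames: Sequence[Frame], start: int,
                        n_frames: int = 3, spacing: int = 3) -> Optional[List[float]]:
    """Concatenate n_frames spectra taken every `spacing` frames from `start`."""
    last = start + (n_frames - 1) * spacing
    if start < 0 or last >= len(frames):
        return None  # not enough frames available yet
    pattern: List[float] = []
    for k in range(n_frames):
        pattern.extend(frames[start + k * spacing])
    return pattern


def log_likelihood(pattern: Sequence[float], target: Target) -> float:
    """Log-likelihood under an independent Gaussian per element (assumed model)."""
    means, variances = target
    total = 0.0
    for x, m, v in zip(pattern, means, variances):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def detect_keyword(frames: Sequence[Frame], template: Sequence[Target],
                   threshold: float, min_gap: int = 1,
                   max_gap: int = 10) -> List[Tuple[int, float]]:
    """Return (start_frame, summed_score) for every candidate keyword detection."""
    hits: List[Tuple[int, float]] = []
    for start in range(len(frames)):
        pos, total, matched = start, 0.0, True
        for i, target in enumerate(template):
            # The first target must match at `start`; each later target may
            # match anywhere inside a window after the previous match.
            candidates = [pos] if i == 0 else range(pos + min_gap, pos + max_gap + 1)
            best: Optional[Tuple[int, float]] = None
            for t in candidates:
                pattern = multi_frame_pattern(frames, t)
                if pattern is None:
                    continue
                score = log_likelihood(pattern, target)
                # Loose threshold: accept anything plausible, keep the best.
                if score >= threshold and (best is None or score > best[1]):
                    best = (t, score)
            if best is None:
                matched = False
                break
            pos, total = best[0], total + best[1]
        if matched:
            hits.append((start, total))
    return hits
```

With a template of two target patterns, `detect_keyword(frames, template, threshold=0.0)` reports a candidate only at start frames where the first target pattern matches and the second is found within the window that follows it.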

10 Claims

1. In a speech analysis system for recognizing at least one predetermined keyword in a continuous audio signal, each said keyword being characterized by a template having at least one target pattern, said target patterns having an ordered sequence and each target pattern representing a plurality of short-term power spectra spaced apart in real time, an analysis method comprising the steps of
  repeatedly evaluating a set of parameters determining a short-term power spectrum of said audio signal within each of a plurality of equal duration sampling intervals, thereby to generate a continuous time ordered sequence of short-term audio power spectrum frames,
  repeatedly selecting from said sequence of frames, one first frame and at least one later occurring frame to form a multi-frame pattern,
  comparing each thus formed multi-frame pattern with each first target pattern of each keyword template,
  deciding whether each said multi-frame pattern corresponds to a said first target pattern of a keyword template,
  for each multi-frame pattern which, according to said deciding step, corresponds to a said first target pattern of a potential candidate keyword, selecting later occurring spectrum frames to form later occurring multi-frame patterns,
  deciding whether said later occurring multi-frame patterns correspond respectively to successive target patterns of said potential candidate keyword template, and
  identifying a candidate keyword template when said selected multi-frame patterns correspond respectively to the target patterns of a said keyword template.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1 further including the step of storing said generated short-term power spectra on a first in-first out basis whereby a block of said power spectra is available for analysis.
  - 3. The method of claim 1 wherein the spectrum frames comprising a multi-frame pattern correspond to sampling frame intervals of the audio signal spaced apart by a fixed number of sampling frame intervals.
  - 4. The method of claim 3 wherein said spectrum frames comprising a multi-frame pattern are spaced apart by at least the time duration of two frame intervals.
  - 5. The method of claim 4 wherein said time duration is 30 milliseconds.
  - 6. The method of claim 1 wherein each said multi-frame pattern comprises three spectrum frames.
  - 7. The method of claim 1 wherein sequential pairs of the spectrum frames comprising each said multi-frame pattern correspond to sampling intervals of the audio signal spaced apart by a varying number of sampling frame intervals.
  - 8. The method of claim 1 wherein said second selecting step comprises the step of selecting said second and each succeeding selected multi-frame pattern at times within predetermined time durations fixed with respect to the time of detection of said previously selected multi-frame pattern.
  - 9. The method of claim 2 further including the steps of repeatedly generating a peak spectrum corresponding to peak frequency band values of said short-term power spectra, and for each short-term spectrum, dividing the amplitude value of each frequency band by the corresponding intensity value in the corresponding peak spectrum, thereby to generate frequency band equalized spectra corresponding to a compensated audio signal having the same maximum short-term energy content in each of the frequency bands comprising the spectra.
  - 10. The method of claim 1 further including the steps of for each short-term power spectrum S(f), generating the value A corresponding to the average of said set of parameters determining each said spectrum, where A equals ##EQU15## and the f_j represents the width of the successive frequency bands comprising the spectrum;
    - and nonlinearly scaling each spectrum by generating, for value S(f) in each frequency band, a transformed spectrum having a corresponding value S_s (f), wherein ##EQU16##
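The buffering of claim 2 and the frequency band equalization of claim 9 can be illustrated with a short sketch. It rests on assumptions that the claims do not settle: the peak spectrum is taken here as the plain per-band maximum over the buffered block (the specification may use a different peak tracker), and the names `SpectrumFifo`, `peak_spectrum`, `equalize`, and the `floor` guard against division by zero are hypothetical.

```python
from collections import deque
from typing import Deque, List, Sequence


class SpectrumFifo:
    """Fixed-capacity first-in, first-out store of short-term power spectra."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently drops the oldest frame when full.
        self._frames: Deque[List[float]] = deque(maxlen=capacity)

    def push(self, spectrum: Sequence[float]) -> None:
        self._frames.append(list(spectrum))

    def block(self) -> List[List[float]]:
        """Return the buffered spectra, oldest first."""
        return list(self._frames)


def peak_spectrum(block: Sequence[Sequence[float]]) -> List[float]:
    """Per-band maximum over the block (assumed definition of the peak spectrum)."""
    return [max(band) for band in zip(*block)]


def equalize(block: Sequence[Sequence[float]], floor: float = 1e-12) -> List[List[float]]:
    """Divide each band value by the peak of that band.

    After this step every band has the same maximum value (1.0) over the
    block, provided its peak exceeds `floor`.
    """
    peaks = peak_spectrum(block)
    return [[value / max(peak, floor) for value, peak in zip(frame, peaks)]
            for frame in block]
```

Pushing four two-band spectra into a `SpectrumFifo(capacity=3)` keeps only the three most recent, and `equalize(fifo.block())` then rescales both bands so that each reaches a maximum of 1.0 somewhere in the block.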

Current Assignee: Verbex Voice Systems, Inc. (Voxware, Inc.)
Original Assignee: Dialog Systems Incorporated
Inventors: Moshier, Stephen L.
Primary Examiner(s): Boudreau, Leo H.
Application Number: US05/901,001
Time in Patent Office: 894 Days
Field of Search: 179/1 SA, 179/1 SB, 340/146.3 R
US Class Current: 704/231
CPC Class Codes: G10L 15/00 Speech recognition G10L17/0...
