Speech recognition microcomputer

US 4,388,495 A
Filed: 05/01/1981
Issued: 06/14/1983
Est. Priority Date: 05/01/1981
Status: Expired due to Term

Abstract

A simplified, speaker independent, selected vocabulary, word recognizing microcomputer functions without the use of a typical front end filtering network. The microcomputer identifies vowel-like, fricative-like, and silence signal states within a word or phrase by counting speech pattern zero crossings during sequential time periods. Variable zero crossing count thresholds are used to identify states based upon previously identified states, and hysteresis is provided, through the use of state time measurement, to prevent state oscillations which would result in erroneous state sequences. The microcomputer, by monitoring zero crossings, defines words as a sequence of vowel-like, fricative-like, and silence states. By limiting the recognizable vocabulary to words which have dissimilar sequences, the incoming speech pattern may be recognized by comparison with state templates defining the limited vocabulary stored in the microcomputer's memory.
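The zero-crossing count described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented implementation: the function name `zero_crossing_counts`, the sample values, and the interval length of four samples are all hypothetical, and crossings that straddle an interval boundary are simply ignored.

```python
def zero_crossing_counts(samples, interval_len, threshold=0.0):
    """Count threshold crossings in each consecutive block of interval_len samples.

    Hypothetical helper: names and parameters are assumptions for illustration.
    """
    # Detector stage: reduce each sample to one bit (above the threshold or not).
    bits = [s > threshold for s in samples]
    counts = []
    # Walk over complete intervals only; a trailing partial interval is dropped.
    for start in range(0, len(bits) - interval_len + 1, interval_len):
        block = bits[start:start + interval_len]
        # A crossing is any change in the detector output between neighbours.
        crossings = sum(1 for a, b in zip(block, block[1:]) if a != b)
        counts.append(crossings)
    return counts


# Made-up samples: a fast-alternating block, a steady block, and a silent block.
samples = [1, -1, 1, -1, 1, 1, 1, 1, 0, 0, 0, 0]
print(zero_crossing_counts(samples, interval_len=4))  # prints [3, 0, 0]
```

A high count per interval suggests high average frequency (fricative-like), a moderate count suggests vowel-like speech, and a near-zero count suggests silence.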

55 Citations

20 Claims

1. Speech recognition apparatus, comprising:
- speech responsive input means for generating an AC signal having a frequency determined by said speech;
  
  a detector for producing digital signals by comparing said AC signal with a threshold signal level;
  
  means for defining time intervals;
  
  means responsive to said digital signals and said time intervals for counting said digital signals within said time intervals;
  
  means responsive to said counting means for comparing the output of said digital signal counting means with plural count thresholds and for thereby categorizing said speech during said time intervals as fricative-like intervals, vowel-like intervals, or silence intervals;
  
  means for counting said fricative-like intervals, vowel-like intervals, and silence intervals to generate interval counts;
  
  means responsive to said interval counts for generating a fricative-like, vowel-like, and silence state sequence for said speech;
  
  means for varying at least one of said plural count thresholds in response to the previous state in said state sequence for said speech;
  
  means for comparing said state sequence with plural state sequence templates to identify a match; and
  
  means for generating an output signal identifying one of said state sequence templates which is identified as an exact match.
- Dependent Claims (2, 3, 4, 5, 17, 18)
- - 2. Speech recognition apparatus, as defined in claim 1, wherein said speech responsive input means comprises:
    - a high gain amplifier driven to saturation by said speech.
  - 3. Speech recognition apparatus, as defined in claim 2, wherein said speech responsive means additionally comprises:
    - a microphone for providing a speech input to said amplifier.
  - 4. Speech recognition apparatus, as defined in claim 1, wherein said threshold signal level of said detector is a constant voltage level.
  - 5. Speech recognition apparatus, as defined in claim 1, wherein said means for generating an output signal identifying one of said state sequence templates generates identical output signals for plural ones of said state sequence templates.
  - 17. A speech recognition apparatus as defined in claim 1, wherein said means responsive to said counting means for categorizing said speech during said time intervals is also responsive to the location of an interval state within a state sequence.
  - 18. A speech recognition apparatus, as defined in claim 1, wherein said means responsive to said counting means for categorizing said speech during said time intervals, has plural, predetermined durational thresholds for categorizing said intervals.
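Claim 1 categorizes each time interval by comparing its count with plural thresholds, at least one of which varies with the previous state. A minimal sketch of that idea follows. The threshold values, the `THRESHOLDS` table, and the function `classify_interval` are assumptions chosen for illustration; the claims do not give numeric values.

```python
SILENCE, VOWEL, FRICATIVE = "S", "V", "F"

# Hypothetical thresholds keyed by the previous state in the state sequence.
# Each entry is (silence_max, vowel_max): counts below silence_max are silence,
# counts up to vowel_max are vowel-like, and anything above is fricative-like.
THRESHOLDS = {
    SILENCE:   (2, 16),
    VOWEL:     (2, 20),  # after a vowel, demand more crossings before switching
    FRICATIVE: (2, 12),  # after a fricative, remain fricative more readily
}


def classify_interval(count, previous_state):
    """Label one interval from its crossing count and the previous state."""
    silence_max, vowel_max = THRESHOLDS[previous_state]
    if count < silence_max:
        return SILENCE
    if count <= vowel_max:
        return VOWEL
    return FRICATIVE


# The same count is labeled differently depending on what came before.
print(classify_interval(14, VOWEL))      # prints V
print(classify_interval(14, FRICATIVE))  # prints F
```

Shifting the vowel/fricative boundary according to the previous state biases the classifier toward staying in the state it is already in, which reduces flicker for borderline counts.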

6. A speech recognition system, comprising:
- a read-only memory permanently storing speech template data for a limited vocabulary, said speech template data defining words selected to be dissimilar;
  
  means for analyzing speech input data, said means comprising;
  
  means for measuring the average speech frequency of said data;
  
  means for comparing said average speech frequency with thresholds to generate speech state sequences of fricative-like, vowel-like, and silence states; and
  
  means for changing at least one of said thresholds in response to a previous state of said speech state sequence; and
  
  means for comparing said speech state sequences with said speech template data to output a signal identifying said speech.
- Dependent Claims (7, 8)
- - 7. A speech recognition system, as defined in claim 6, wherein said read-only memory permanently stores speech template data which defines said limited vocabulary in terms of fricative-like, vowel-like, and silence states in a sequence.
  - 8. A speech recognition system, as defined in claim 6, wherein said read-only memory permanently stores plural speech template data defining one of said words selected to be dissimilar.
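The template comparison in claims 6 through 8 amounts to an exact lookup of a state sequence in a fixed table, where one word may have several stored templates. The sketch below assumes a Python dictionary standing in for the read-only memory; the words and the sequences assigned to them are illustrative guesses and are not taken from the patent.

```python
# Hypothetical template table: state sequence -> recognized word.
# "S" = silence, "V" = vowel-like, "F" = fricative-like.
TEMPLATES = {
    ("F", "V", "S", "F"): "six",  # assumed sequence, for illustration only
    ("F", "V", "F"): "six",       # a second template for the same word
    ("F", "V", "F", "V"): "seven",
}


def recognize(state_sequence, templates=TEMPLATES):
    """Return the word whose template exactly matches, or None if none does."""
    return templates.get(tuple(state_sequence))


print(recognize(["F", "V", "S", "F"]))  # prints six
print(recognize(["F", "V", "F"]))       # prints six
print(recognize(["V", "S"]))            # prints None
```

Because matching is exact, the scheme only works when the vocabulary is chosen so that different words produce dissimilar sequences, which is the restriction the claims place on the stored template data.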

9. A method for recognizing speech signals, comprising;
- providing an analog electrical signal identifying the frequency content of said speech signals;
  
  first comparing said analog electrical signal with a threshold level to provide digital signals when said analog electrical signals cross said threshold level;
  
  first counting said digital signals during plural predetermined time increments to generate a digital count signal;
  
  second comparing said digital count signal with plural count thresholds to identify the average frequency content of said speech signals during said plural time increments;
  
  second counting successive ones of said plural time increments which have similar average frequency content;
  
  third comparing the number of successive ones of said plural time increments which have similar average frequency content with a predetermined count to define a state for said speech signals and to thereby provide a state sequence for said speech signals;
  
  varying said predetermined count in response to the location of said speech signal state within said state sequence; and
  
  fourth comparing said state sequence with plural stored state sequence templates to recognize said speech signals.
- Dependent Claims (10, 11, 12)
- - 10. A method for recognizing speech signals as defined in claim 9 wherein said providing step comprises amplifying said analog electrical signal in a high-gain, hard limited amplifier.
  - 11. A method for recognizing speech signals as defined in claim 9 wherein said third comparing step compares said state sequence with plural state sequence templates permanently stored in a read-only memory.
  - 12. A method for recognizing speech signals as defined in claim 9 additionally comprising:
    - eliminating terminal states from said state sequence in response to said second counting step.
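The second counting, third comparing, and varying steps of claim 9 can be read as a run-length filter: a run of similar intervals becomes a state only once it reaches a required length, and that length depends on where the state would fall in the sequence. The following is a sketch under that reading. The function names and the run lengths (2 for the first state, 3 afterwards) are assumptions, not values from the patent.

```python
def min_run(position):
    """Hypothetical rule: the first state needs 2 intervals, later states need 3."""
    return 2 if position == 0 else 3


def build_state_sequence(interval_labels, min_run_for_position=min_run):
    """Collapse per-interval labels into a state sequence with a length gate."""
    states = []
    run_label, run_len = None, 0
    for label in interval_labels:
        # Second counting: track how many successive intervals share a label.
        if label == run_label:
            run_len += 1
        else:
            run_label, run_len = label, 1
        # Third comparing: the required run length varies with the position
        # the new state would occupy in the sequence.
        required = min_run_for_position(len(states))
        if run_len >= required and (not states or states[-1] != run_label):
            states.append(run_label)
    return states


# The isolated "V" and "F" in the middle are too short to register as states.
labels = list("FFFVFVVVVSSS")
print(build_state_sequence(labels))  # prints ['F', 'V', 'S']
```

Short runs never reach the required length, so brief misclassifications do not add spurious states to the sequence that is later compared with the templates.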

13. In a programmed computer system for recognizing human speech patterns in response to analog signals identifying the frequency of said analog signals, a data structure for comparing said speech patterns with stored templates, comprising:
- a. first means in said data structure and responsive to said analog signals for storing first coded signals indicative of the average frequency of said analog signals during successive time intervals;
  
  b. second means in said data structure responsive to said first coded signals for comparing said first coded signals with count threshold values stored in said data structure to identify a frequency characteristic of said analog signals corresponding to the existence of fricative-like, vowel-like, and silence intervals during each of said successive time intervals;
  
  c. third means in said data structure for varying said count threshold values in response to a previous state of said analog signal frequency characteristic;
  
  d. fourth means in said data structure responsive to said time interval frequency characteristic for storing a coded signal indicative of the number of successive time intervals having an identical frequency characteristic;
  
  e. fifth means in said data structure and responsive to said fourth means for storing coded signals indicative of groups of successive time intervals with an identical frequency characteristic which exceed, in number, a value stored as a coded signal in said data structure;
  
  f. sixth means in said data structure storing coded signals indicative of recognition templates; and
  
  g. seventh means in said data structure for comparing said coded signals stored by said fifth means with said coded signals stored by said sixth means.
- Dependent Claims (19)
- - 19. A programmed computer system as defined in claim 13, wherein said second means is also responsive to time durational threshold values stored in said data structure.

14. A method for recognizing human speech, comprising:
- identifying segments of said speech by frequency discrimination;
  
  providing hysteresis for limiting frequency discrimination identification changes for successive segments of said speech; and
  
  comparing said speech segments with speech templates for recognition.
- Dependent Claims (15, 16)
- - 15. A method for recognizing human speech as defined in claim 14 wherein said hysteresis providing step comprises:
    - requiring that said frequency discrimination identification segments have a predetermined length.
  - 16. A method for recognizing human speech as defined in claim 15 additionally comprising:
    - changing said predetermined length in response to previously identified segments of said speech.

20. Speech recognition apparatus, comprising:
- speech responsive input means for generating an AC signal having a frequency determined by said speech;
  
  a detector for producing digital signals by comparing said AC signal with a threshold signal level;
  
  means for defining time intervals;
  
  means responsive to said digital signals and said time intervals for counting said digital signals within said time intervals;
  
  means responsive to said counting means for categorizing said speech during said time intervals as fricative-like intervals, vowel-like intervals, or silence intervals;
  
  means for counting said fricative-like intervals, vowel-like intervals, and silence intervals to generate interval counts;
  
  means responsive to said interval counts for generating a fricative-like, vowel-like, and silence state sequence for said speech, said means comprising;
  
  means for comparing said interval counts with a count threshold; and
  
  means for varying said count threshold in response to the location of a state within said state sequence;
  
  means for comparing said state sequence with plural state sequence templates to identify a match; and
  
  means for generating an output signal identifying one of said state sequence templates which is identified as an exact match.

Current Assignee: GTE Wireless Service Corporation (Verizon Communications Inc.)
Original Assignee: Interstate Electronics Corp. (L3Harris Technologies, Inc.)
Inventors: Hitchcock, Myron H.
Primary Examiner(s): KEMENY, EMANUEL
Application Number: US06/259,695
Time in Patent Office: 774 Days
Field of Search: 179/1 SD, 179/1.5 B, 179/1 SC, 179/1 SA, 179/15.55 R, 179/15.55 T, 179/1 VC
US Class Current: 704/254
CPC Class Codes: G10L 15/00 Speech recognition G10L17/0...