Speech recognition microcomputer
Abstract
A simplified, speaker independent, selected vocabulary, word recognizing microcomputer functions without the use of a typical front end filtering network. The microcomputer identifies vowel-like, fricative-like, and silence signal states within a word or phrase by counting speech pattern zero crossings during sequential time periods. Variable zero crossing count thresholds are used to identify states based upon previously identified states, and hysteresis is provided, through the use of state time measurement, to prevent state oscillations which would result in erroneous state sequences. The microcomputer, by monitoring zero crossings, defines words as a sequence of vowel-like, fricative-like, and silence states. By limiting the recognizable vocabulary to words which have dissimilar sequences, the incoming speech pattern may be recognized by comparison with state templates defining the limited vocabulary stored in the microcomputer's memory.
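The zero-crossing count that the abstract relies on can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `count_crossings`, the sampled-signal representation, and the interval length are assumptions made for illustration. The comparison against `level` stands in for the hardware detector, and each block of `interval_len` samples stands in for one time period.

```python
from typing import List, Sequence


def count_crossings(samples: Sequence[float], interval_len: int,
                    level: float = 0.0) -> List[int]:
    """Count level crossings in each consecutive block of `interval_len` samples.

    Hypothetical helper: the sample-based representation and all parameter
    names are assumptions, not details from the patent.
    """
    counts = []
    # Only complete blocks are processed; a trailing partial block is dropped.
    for start in range(0, len(samples) - interval_len + 1, interval_len):
        block = samples[start:start + interval_len]
        # One-bit detector output: is the signal above the reference level?
        bits = [s > level for s in block]
        # A crossing is any change in detector output between adjacent samples.
        counts.append(sum(a != b for a, b in zip(bits, bits[1:])))
    return counts
```

A rapidly alternating block yields a high count (fricative-like), while a block that stays on one side of the level yields zero (silence-like). Crossings that straddle a block boundary are ignored in this sketch.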
20 Claims
1. Speech recognition apparatus, comprising:
speech responsive input means for generating an AC signal having a frequency determined by said speech;
a detector for producing digital signals by comparing said AC signal with a threshold signal level;
means for defining time intervals;
means responsive to said digital signals and said time intervals for counting said digital signals within said time intervals;
means responsive to said counting means for comparing the output of said digital signal counting means with plural count thresholds and for thereby categorizing said speech during said time intervals as fricative-like intervals, vowel-like intervals, or silence intervals;
means for counting said fricative-like intervals, vowel-like intervals, and silence intervals to generate interval counts;
means responsive to said interval counts for generating a fricative-like, vowel-like, and silence state sequence for said speech;
means for varying at least one of said plural count thresholds in response to the previous state in said state sequence for said speech;
means for comparing said state sequence with plural state sequence templates to identify a match; and
means for generating an output signal identifying one of said state sequence templates which is identified as an exact match.
(Dependent claims: 2, 3, 4, 5, 17, 18)
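The element of claim 1 that varies a count threshold according to the previous state can be sketched as a lookup keyed by that state. This is only an illustration under stated assumptions: the numeric limits in `THRESHOLDS`, the single-letter state codes, and the function name `classify` are hypothetical and do not come from the patent text.

```python
SILENCE, VOWEL, FRICATIVE = "S", "V", "F"

# Hypothetical (silence_max, vowel_max) crossing-count limits, selected by the
# previous state. The numbers are invented for illustration only.
THRESHOLDS = {
    SILENCE: (2, 16),
    VOWEL: (2, 20),      # after a vowel, demand more crossings for a fricative
    FRICATIVE: (2, 12),  # after a fricative, remain fricative more readily
}


def classify(count: int, previous_state: str) -> str:
    """Categorize one interval's crossing count given the previous state."""
    silence_max, vowel_max = THRESHOLDS[previous_state]
    if count < silence_max:
        return SILENCE
    if count < vowel_max:
        return VOWEL
    return FRICATIVE
```

With these assumed limits, a count of 14 is read as vowel-like when the previous state was a vowel but as fricative-like when the previous state was a fricative, which shows how the same measurement can be categorized differently depending on context.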
6. A speech recognition system, comprising:
a read-only memory permanently storing speech template data for a limited vocabulary, said speech template data defining words selected to be dissimilar;
means for analyzing speech input data, said means comprising:
    means for measuring the average speech frequency of said data;
    means for comparing said average speech frequency with thresholds to generate speech state sequences of fricative-like, vowel-like, and silence states; and
    means for changing at least one of said thresholds in response to a previous state of said speech state sequence; and
means for comparing said speech state sequences with said speech template data to output a signal identifying said speech.
(Dependent claims: 7, 8)
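The template comparison in claim 6 can be pictured as an exact-match lookup over a small fixed table. The sketch below is hypothetical: the entries in `TEMPLATES` use placeholder word names and made-up state sequences rather than anything from the patent, and "dissimilar" is reduced here to the weakest possible check, namely that no two words share a sequence.

```python
from typing import Dict, Optional, Sequence, Tuple

# Placeholder vocabulary. Names and sequences are invented for illustration;
# S = silence, V = vowel-like, F = fricative-like.
TEMPLATES: Dict[str, Tuple[str, ...]] = {
    "word_a": ("F", "V", "S", "F"),
    "word_b": ("V", "F", "V"),
    "word_c": ("F", "V"),
}

# Minimal stand-in for "words selected to be dissimilar": all sequences differ.
assert len(set(TEMPLATES.values())) == len(TEMPLATES)


def recognize(states: Sequence[str]) -> Optional[str]:
    """Return the word whose template equals the state sequence, else None."""
    key = tuple(states)
    for word, template in TEMPLATES.items():
        if template == key:
            return word
    return None
```

Because only exact matches are accepted in this sketch, a sequence that is merely a prefix of a template, such as `["F", "V", "S"]`, is rejected rather than guessed at.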
9. A method for recognizing speech signals, comprising:
providing an analog electrical signal identifying the frequency content of said speech signals;
first comparing said analog electrical signal with a threshold level to provide digital signals when said analog electrical signals cross said threshold level;
first counting said digital signals during plural predetermined time increments to generate a digital count signal;
second comparing said digital count signal with plural count thresholds to identify the average frequency content of said speech signals during said plural time increments;
second counting successive ones of said plural time increments which have similar average frequency content;
third comparing the number of successive ones of said plural time increments which have similar average frequency content with a predetermined count to define a state for said speech signals and to thereby provide a state sequence for said speech signals;
varying said predetermined count in response to the location of said speech signal state within said state sequence; and
fourth comparing said state sequence with plural stored state sequence templates to recognize said speech signals.
(Dependent claims: 10, 11, 12)
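The "second counting", "third comparing", and "varying said predetermined count" steps of claim 9 can be sketched as a run-length filter whose required run depends on the position in the state sequence. The rule in `min_run`, the use of "greater than or equal" for the comparison, and all names are assumptions for illustration, not values or choices stated in the claim.

```python
from typing import Callable, List, Sequence


def min_run(position: int) -> int:
    """Hypothetical rule: require a longer run for the first state."""
    return 3 if position == 0 else 2


def build_state_sequence(categories: Sequence[str],
                         min_run_for: Callable[[int], int] = min_run) -> List[str]:
    """Turn per-increment categories into a debounced state sequence."""
    states: List[str] = []
    current, run = None, 0
    for cat in categories:
        # Count successive increments that share the same category.
        if cat == current:
            run += 1
        else:
            current, run = cat, 1
        # Accept a new state once the run reaches the count required at this
        # position, provided it differs from the state accepted last.
        required = min_run_for(len(states))
        if run >= required and (not states or states[-1] != current):
            states.append(current)
    return states
```

For the input `"SSSFVVVFFSS"` this sketch produces `S, V, F, S`: the isolated `F` between the silence and the vowel never reaches the required run length, so it does not create a spurious state.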
13. In a programmed computer system for recognizing human speech patterns in response to analog signals identifying the frequency of said analog signals, a data structure for comparing said speech patterns with stored templates, comprising:
a. first means in said data structure and responsive to said analog signals for storing first coded signals indicative of the average frequency of said analog signals during successive time intervals;
b. second means in said data structure responsive to said first coded signals for comparing said first coded signals with count threshold values stored in said data structure to identify a frequency characteristic of said analog signals corresponding to the existence of fricative-like, vowel-like, and silence intervals during each of said successive time intervals;
c. third means in said data structure for varying said count threshold values in response to a previous state of said analog signal frequency characteristic;
d. fourth means in said data structure responsive to said time interval frequency characteristic for storing a coded signal indicative of the number of successive time intervals having an identical frequency characteristic;
e. fifth means in said data structure and responsive to said fourth means for storing coded signals indicative of groups of successive time intervals with an identical frequency characteristic which exceed, in number, a value stored as a coded signal in said data structure;
f. sixth means in said data structure storing coded signals indicative of recognition templates; and
g. seventh means in said data structure for comparing said coded signals stored by said fifth means with said coded signals stored by said sixth means.
(Dependent claims: 19)
14. A method for recognizing human speech, comprising:
identifying segments of said speech by frequency discrimination;
providing hysteresis for limiting frequency discrimination identification changes for successive segments of said speech; and
comparing said speech segments with speech templates for recognition.
(Dependent claims: 15, 16)
20. Speech recognition apparatus, comprising:
speech responsive input means for generating an AC signal having a frequency determined by said speech;
a detector for producing digital signals by comparing said AC signal with a threshold signal level;
means for defining time intervals;
means responsive to said digital signals and said time intervals for counting said digital signals within said time intervals;
means responsive to said counting means for categorizing said speech during said time intervals as fricative-like intervals, vowel-like intervals, or silence intervals;
means for counting said fricative-like intervals, vowel-like intervals, and silence intervals to generate interval counts;
means responsive to said interval counts for generating a fricative-like, vowel-like, and silence state sequence for said speech, said means comprising:
    means for comparing said interval counts with a count threshold; and
    means for varying said count threshold in response to the location of a state within said state sequence;
means for comparing said state sequence with plural state sequence templates to identify a match; and
means for generating an output signal identifying one of said state sequence templates which is identified as an exact match.
Specification