System and method for speech recognition
Abstract
A system and method for speech recognition provides a means of printing phonemes in response to received speech signals utilizing inexpensive components. The speech signals are inputted into an amplifier which provides negative feedback to normalize the amplitude of the speech signals. The normalized speech signals are delta modulated at a first sampling rate to produce a corresponding first sequence of digital pulses. The negative feedback signal of the amplifier is delta modulated at a second sampling rate to produce a second sequence of digital pulses corresponding to amplitude information of the speech signals. The speech signals are filtered and utilized to produce a digital pulse corresponding to high frequency components of the speech signals having magnitudes in excess of a threshold voltage. A microprocessor contains an algorithm for detecting major slope transitions of the analog speech signals in response to the first sequence of digital signals by detecting information corresponding to presence and absence of predetermined numbers of successive slope reversals in the delta modulator producing the first sequence of digital pulses. The algorithm computes cues from the high frequency digital pulse and the second sequence of pulses. The algorithm computes a plurality of speech waveform characteristic ratios of time intervals between various slope transitions and compares the speech waveform characteristic ratios with a plurality of stored phoneme ratios representing a set of phonemes to detect matching therebetween. The order of comparing is determined on the basis of the cues and a configuration of a phoneme decision tree contained in the algorithm. When a matching occurs, a signal corresponding to the matched phoneme is produced and utilized to cause the phoneme to be printed. In one embodiment of the invention, the speech signals are produced by the earphone of a standard telephone headset.
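The ratio comparison described in the abstract can be illustrated with a minimal Python sketch. The function names, the choice of dividing each interval by the first one, the relative tolerance, and the stored table below are all hypothetical and are not taken from the patent. The patent orders its comparisons using cues and a phoneme decision tree; this sketch substitutes a simple linear scan to keep the example short.

```python
def characteristic_ratios(intervals):
    """Divide each later interval by the first one (an assumed convention)."""
    first = intervals[0]
    return [t / first for t in intervals[1:]]


def match_phoneme(ratios, stored, tolerance=0.1):
    """Return the first stored phoneme whose ratios all lie within a
    relative tolerance of the measured ratios, or None if none match."""
    for phoneme, reference in stored.items():
        if len(reference) != len(ratios):
            continue
        if all(abs(r - s) <= tolerance * s for r, s in zip(ratios, reference)):
            return phoneme
    return None


# Hypothetical stored phoneme ratios, for illustration only.
STORED_RATIOS = {"a": [0.5, 0.25], "e": [0.8, 0.4]}

measured = characteristic_ratios([4.0, 2.0, 1.0])
print(match_phoneme(measured, STORED_RATIOS))  # a
```

Because only ratios of intervals are compared, multiplying every interval by the same factor leaves the result unchanged in this sketch.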
34 Claims
1. A method for producing a signal representing a phoneme sound contained in a stream of voice signals, said method comprising the steps of:
(a) producing a first sequence of analog speech signals representing said voice signals;
(b) delta modulating said first sequence of analog speech signals to produce a sequence of digital pulses representing phonemic information contained in said analog speech signals;
(c) operating upon said sequence of digital pulses to detect major slope transitions of said analog speech signals;
(d) measuring time intervals between predetermined ones of said detected major slope transitions;
(e) computing a plurality of speech waveform characteristic ratios between predetermined ones of said time intervals;
(f) comparing said speech waveform characteristic ratios with a plurality of stored phoneme ratios to determine if said speech waveform characteristic ratios match any of said stored phoneme characteristic ratios; and
(g) producing a signal representing a phoneme sound corresponding to a matching one of said phoneme characteristic ratios.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A method for producing a signal representing a phoneme in response to a sequence of analog speech signals, said method comprising the steps of:
(a) serially encoding said sequence of analog speech signals to produce a corresponding sequence of serial digital pulses corresponding to ramp slope reversals of a delta modulator circuit;
(b) detecting major slope transitions of said analog speech signals by detecting presence and absence of predetermined numbers of successive slope reversals corresponding to said sequence of digital pulses;
(c) computing a speech waveform characteristic ratio of time intervals between certain ones of said slope transitions;
(d) comparing said speech waveform characteristic ratio with a stored phoneme ratio to determine if said speech waveform characteristic ratio matches said stored phoneme ratio; and
(e) producing a signal representing a phoneme corresponding to said phoneme ratio if said matching occurs.
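One possible reading of step (b) of claim 17 can be sketched in Python as follows. This is an interpretation, not the patented algorithm: it assumes each bit gives the ramp direction of the delta modulator (1 for rising), treats a run of `min_run` identical bits (an absence of slope reversals) as a sustained slope, and reports a major slope transition when the sustained direction changes. The names `detect_major_slope_transitions`, `intervals_between`, and `min_run` are hypothetical.

```python
def detect_major_slope_transitions(bits, min_run=3):
    """Return indices where the sustained slope direction changes.

    bits: sequence of 0/1 delta modulator outputs (assumed 1 = ramp rising).
    min_run: assumed threshold; a run of this many identical bits, meaning
             no slope reversals, counts as a sustained slope.
    """
    transitions = []
    current = None  # direction of the last sustained slope: +1, -1, or None
    run_bit, run_len, run_start = None, 0, 0
    for i, bit in enumerate(bits):
        if bit == run_bit:
            run_len += 1
        else:
            run_bit, run_len, run_start = bit, 1, i
        if run_len == min_run:
            direction = 1 if bit else -1
            if current is not None and direction != current:
                # Record where the new sustained slope began.
                transitions.append(run_start)
            current = direction
    return transitions


def intervals_between(transitions):
    """Time intervals, in sample periods, between successive transitions."""
    return [b - a for a, b in zip(transitions, transitions[1:])]


bits = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1]
print(detect_major_slope_transitions(bits))  # [6, 12]
```

Short alternations such as `0, 1, 0` never reach `min_run`, so they are ignored here as minor reversals rather than treated as a change of slope.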
18. A system for producing signals representing a phoneme contained in a stream of voice signals, said system comprising in combination:
(a) means for producing a first sequence of analog speech signals representing said voice signals;
(b) means for delta modulating said first sequence of analog speech signals to produce a sequence of digital pulses representing phonemic information contained in said analog speech signals;
(c) means for detecting major slope transitions of said analog speech signals in response to said sequence of digital pulses;
(d) means for measuring time intervals between predetermined ones of said detected major slope transitions;
(e) means for computing a plurality of speech waveform characteristic ratios between predetermined ones of said time intervals;
(f) means for comparing said speech waveform characteristic ratios with a plurality of stored phoneme ratios to determine if said speech waveform characteristic ratios match any of said stored phoneme characteristic ratios; and
(g) means responsive to said comparing means for producing a signal representing a phoneme corresponding to a matching one of said phoneme characteristic ratios.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 32, 33, 34
29. A system for producing a signal representing a phoneme in response to a sequence of analog speech signals, said system comprising in combination:
(a) means for serially encoding said sequence of analog speech signals to produce a corresponding sequence of serial digital pulses corresponding to ramp slope reversals of a delta modulator circuit;
(b) means for detecting major slope transitions of said analog speech signals by detecting presence and absence of predetermined numbers of successive slope reversals corresponding to said sequence of digital pulses;
(c) means for computing a speech waveform characteristic ratio of time intervals between certain ones of said slope transitions;
(d) means for comparing said speech waveform characteristic ratio with a stored phoneme ratio to determine if said speech waveform characteristic ratio matches said stored phoneme ratio; and
(e) means responsive to said comparing means for producing a signal representing a phoneme corresponding to said phoneme ratio if said matching occurs.
30. A method for producing a signal representing a phoneme in response to a sequence of speech signals, such method comprising the steps of:
(a) encoding said sequence of speech signals to produce a sequence of digital pulses representing phonemic information contained in said analog speech signals, said encoding including
i. comparing a positively and negatively ramping signal to said sequence of substantially repetitive speech signals;
ii. periodically comparing the instantaneous level of said analog speech signals at a predetermined rate with the instantaneous level of said ramping signal;
iii. reversing the slope of said ramping signal if said ramping signal is positive-going and exceeds said instantaneous level, and reversing the slope of said ramping signal if said ramping signal is negative-going and is less than said level;
iv. producing said digital pulses in accordance with said slope reversing;
(b) detecting major slope transitions of said speech waveform in response to said sequence of digital pulses by detecting presence and absence of successive slope reversals;
(c) measuring time intervals between predetermined ones of said detected major slope transitions;
(d) computing a plurality of speech waveform characteristic ratios between predetermined ones of said measured time intervals;
(e) comparing said speech waveform characteristic ratios with a plurality of stored groups of phoneme characteristic ratios in a predetermined order to determine if said speech waveform characteristic ratios match any of said stored phoneme ratios; and
(f) producing a signal representing the phoneme sound corresponding to a matching one of said phoneme ratios.
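The encoding in step (a) of claim 30 can be approximated in software with the following Python sketch. It is a discrete-time illustration under stated assumptions, not the claimed circuit: the function name `delta_modulate`, the fixed `step` size, the zero starting level of the ramp, and the convention that the output bit equals the ramp direction after each comparison are all hypothetical choices.

```python
def delta_modulate(samples, step=0.1):
    """Encode samples as one bit per sample period.

    A ramp moves up or down by `step` each period. Following sub-steps
    ii and iii, the ramp is compared with the input once per period and
    its slope is reversed when a rising ramp exceeds the input or a
    falling ramp drops below it. The output bit is the ramp direction
    after the comparison (assumed 1 = rising, 0 = falling).
    """
    ramp = 0.0      # assumed starting level of the ramp
    rising = True   # assumed starting direction
    bits = []
    for level in samples:
        ramp += step if rising else -step
        if rising and ramp > level:
            rising = False
        elif not rising and ramp < level:
            rising = True
        bits.append(1 if rising else 0)
    return bits


# A steeply rising input keeps the ramp rising, so no reversals occur.
print(delta_modulate([1.0, 2.0, 3.0, 4.0]))  # [1, 1, 1, 1]
```

In this sketch a steadily rising or falling input yields a run of identical bits, while an input that changes slowly relative to `step` makes the ramp reverse repeatedly, which is the kind of presence or absence of slope reversals that step (b) examines.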
Specification