Speech-controlled phonetic typewriter or display device using two-tier approach
Abstract
A speech-controlled phonetic device utilizes a two-tier approach for converting an audio input into visual form. The device basically comprises: various components for identifying different phonemes, such as a sound separator, various sensors and transducers, a vowel scanner, a vowel transducer, and a diphthong transducer; an input synchronizer; a transcriber processor; and a printer or display device. The two-tier approach involves a first tier, wherein the identified speech sounds are broken down into syllabits (groupings of classes of sound), the spoken sequence of those syllabits is separated into possible words, and the grouping of the syllabits is indicated. The second tier involves the use of stored words with those respective groupings, but narrowed down to essential phonemes only. Thus, the second tier acts to eliminate, from such possible words, all except a specific word (the actually spoken word), which contains each of the detected phonemes in the proper sequence. Further features of the invention include a vowel identification circuit using both formant peak detection and envelope detection-comparison techniques, and the use of an input synchronizer to provide phoneme identifiers to the transcriber processor.
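The first tier described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class labels (`P`, `F`, `N`, `V`), the toy phoneme symbols, the `WORDS_BY_PATTERN` store, and the functions `class_pattern` and `possible_words` do not come from the patent, whose actual class inventory and syllabit boundaries are defined in the specification.

```python
from typing import Dict, List, Tuple

# Hypothetical broad sound classes (not the patent's actual inventory).
SOUND_CLASS: Dict[str, str] = {
    "k": "P", "t": "P", "p": "P", "b": "P",  # plosives
    "s": "F", "f": "F",                      # fricatives
    "m": "N", "n": "N",                      # nasals
    "a": "V", "i": "V", "o": "V",            # vowels
}

# Hypothetical tier-one store: a pattern of sound classes maps to every
# stored word that shares that pattern.
WORDS_BY_PATTERN: Dict[Tuple[str, ...], List[str]] = {
    ("P", "V", "P"): ["cat", "pot", "bit"],
    ("F", "V", "N"): ["sun", "fan"],
}


def class_pattern(phonemes: List[str]) -> Tuple[str, ...]:
    """Reduce each detected phoneme to its broad sound class."""
    return tuple(SOUND_CLASS[p] for p in phonemes)


def possible_words(phonemes: List[str]) -> List[str]:
    """Tier one: return all stored words whose class pattern matches the input."""
    return WORDS_BY_PATTERN.get(class_pattern(phonemes), [])
```

In this toy setup, `possible_words(["k", "a", "t"])` yields the three plosive-vowel-plosive candidates; the second tier would then have to pick one of them using the specific phonemes detected.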
28 Claims
1. A two-tier method of converting an audio input, comprising words made up of various sounds in a spoken sequence, into a visible form, comprising a sequence of corresponding phonemes, said method comprising the steps of:
(a) breaking down the spoken sequence of sounds into syllabits, each syllabit comprising a group of classes of sounds;
(b) grouping the syllabits into syllabit groups, each syllabit group defining corresponding possible words;
(c) providing, for each of said possible words corresponding to each syllabit group, a respective skeletal sequence of phonemes comprising a corresponding grouping of phonemes;
(d) determining, for each distinctive syllabit group, the phonemes occurring therein so as to develop an input sequence of phonemes for each syllabit group;
(e) comparing the input sequence of phonemes for each syllabit group with the respective skeletal sequence of phonemes of each of the corresponding possible words so as to determine, with reference to the phonemes in each grouping of phonemes, which possible word has a skeletal sequence of phonemes which contains, in a given sequence, phonemes all of which are found, in said given sequence, in the input sequence of phonemes, thereby identifying each of said words of said audio input; and
(f) providing said identified words of said audio input in said visible form.
Dependent claims: 2, 3, 4
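One possible reading of the comparison in step (e) is an ordered-subsequence test: a candidate word is kept only if every phoneme of its skeletal sequence appears, in the same order, in the input sequence. The sketch below assumes that reading. The `SKELETONS` table, the toy phoneme symbols, and the functions `contains_in_order` and `identify_word` are hypothetical names introduced for illustration, not the patent's implementation.

```python
from typing import Dict, List, Optional

# Hypothetical skeletal sequences: only the phonemes treated as essential
# for telling the candidate words apart.
SKELETONS: Dict[str, List[str]] = {
    "cat": ["k", "t"],
    "pot": ["p", "t"],
    "bit": ["b", "t"],
}


def contains_in_order(skeleton: List[str], observed: List[str]) -> bool:
    """True if every skeleton phoneme occurs in `observed` in the same order."""
    remaining = iter(observed)
    # `p in remaining` advances the iterator past the match, which enforces order.
    return all(p in remaining for p in skeleton)


def identify_word(candidates: List[str], observed: List[str]) -> Optional[str]:
    """Return the single candidate whose skeleton fits the input, else None."""
    matches = [w for w in candidates if contains_in_order(SKELETONS[w], observed)]
    return matches[0] if len(matches) == 1 else None
```

With the candidates `["cat", "pot", "bit"]` and the input sequence `["k", "a", "t"]`, only the skeleton of `cat` is found in order, so that word is identified. Returning `None` when zero or several candidates fit is a design choice of this sketch; the claim itself does not say how such cases are handled.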
5. A two-tier system for converting an audio input, comprising words made up of various sounds in a spoken sequence, into a visible form, comprising a sequence of corresponding phonemes, said system comprising:
first means for breaking down the spoken sequence of sounds into syllabits, each syllabit comprising a group of classes of sounds;
second means for grouping the syllabits into syllabit groups, each syllabit group defining corresponding possible words;
third means for providing, for each of said possible words corresponding to each syllabit group, a respective skeletal sequence of phonemes comprising a corresponding grouping of phonemes;
fourth means for determining, for each distinctive syllabit group, the phonemes occurring therein so as to develop an input sequence of phonemes for each syllabit group;
fifth means for comparing the input sequence of phonemes for each syllabit group with the respective skeletal sequences of phonemes of each of the corresponding possible words so as to determine, with reference to the phonemes in each grouping of phonemes, which possible word has a skeletal sequence of phonemes which contains, in a given sequence, phonemes all of which are found, in said given sequence, in the input sequence of phonemes, thereby identifying each of said words of said audio input; and
sixth means for providing said identified words of said audio input in said visible form.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14
15. In a system for converting an audio input, comprising words made up of various sounds in a spoken sequence, into a visible form, comprising a sequence of corresponding phonemes, said system comprising:
at least one transducer for receiving and processing said audio input to derive at least one phoneme identification output; and
vowel identification means for receiving and processing said audio input to provide vowel identification outputs;
the improvement wherein said vowel identification means comprises a vowel scanner for scanning said audio input to obtain preliminary vowel identification outputs, and a vowel transducer for receiving and processing said audio input so as to provide an enabling signal selecting one of said preliminary vowel identification outputs, whereby to provide said vowel identification outputs of said vowel identification means.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 28
26. In a system for converting an audio input, comprising words made up of various sounds in a spoken sequence, into a visible form, comprising a sequence of corresponding phonemes, said system comprising:
phoneme identifying means responsive to said audio input for identifying said sequence of corresponding phonemes, and
processor means for receiving and processing said sequence of corresponding phonemes to provide said identified words of said audio input in said visible form;
the improvement wherein said processor means breaks down the spoken sequence of sounds into syllabits, each syllabit comprising a group of classes of sounds, and wherein said processor means groups the syllabits into syllabit groups, each syllabit group defining corresponding possible words, and provides, for each of said possible words corresponding to each syllabit group, a respective skeletal sequence of phonemes comprising a corresponding grouping of phonemes.
Dependent claims: 27
Specification