Speech recognition method and system using compressed speech data
Abstract
A vocoder-based voice recognizer recognizes a spoken word using linear prediction coding (LPC) based vocoder data without completely reconstructing the voice data. The recognizer generates at least one energy estimate per frame of the vocoder data and searches for word boundaries in the vocoder data using the associated energy estimates. If a word is found, the LPC word parameters are extracted from the vocoder data associated with the word, and recognition features are calculated from the extracted LPC word parameters. Finally, the recognition features are matched against previously stored recognition features of other words, thereby recognizing the spoken word.
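The word-boundary search described in the abstract can be illustrated with a minimal sketch. It assumes the per-frame energy estimates are already available as a list of numbers; how they are derived from the vocoder data is not shown. The function name and the `threshold`, `min_frames`, and `hangover` parameters are hypothetical choices for illustration, not values or names taken from the patent.

```python
from typing import List, Sequence, Tuple


def find_word_boundaries(
    energies: Sequence[float],
    threshold: float,
    min_frames: int = 3,   # hypothetical: shortest run of frames accepted as a word
    hangover: int = 2,     # hypothetical: quiet frames tolerated inside a word
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of candidate words."""
    words: List[Tuple[int, int]] = []
    start = None   # index of the first frame of the word being tracked
    silence = 0    # consecutive below-threshold frames seen inside that word

    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence > hangover:
                end = i - silence + 1  # one past the last energetic frame
                if end - start >= min_frames:
                    words.append((start, end))
                start, silence = None, 0

    # Close a word that is still open when the data ends.
    if start is not None:
        end = len(energies) - silence
        if end - start >= min_frames:
            words.append((start, end))
    return words


if __name__ == "__main__":
    frame_energies = [0, 0, 5, 6, 7, 0, 0, 0, 0, 4, 0, 0, 0]
    print(find_word_boundaries(frame_energies, threshold=1.0))  # [(2, 5)]
```

In this example the single energetic frame at index 9 is discarded because it is shorter than `min_frames`. A real front end would likely use adaptive thresholds rather than one fixed value.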
26 Claims
1. A method for recognizing a spoken word, the method comprising the steps of:
receiving compressed speech data that has been compressed using linear prediction coding (LPC) techniques;
extracting at least one set of LPC parameters from said compressed speech data without completely decompressing said compressed speech data;
calculating at least one recognition feature from said at least one set of LPC parameters; and
utilizing said at least one recognition feature and at least one previously stored recognition feature to recognize the spoken word.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for preparing to recognize a spoken word, the method comprising the steps of:
receiving compressed speech data that has been compressed using linear prediction coding (LPC) techniques;
extracting at least one set of LPC parameters from said compressed speech data without completely decompressing said compressed speech data; and
calculating at least one recognition feature from said at least one set of LPC parameters.
Dependent claims: 10, 11, 12
13. A digital cellular telephone comprising:
a mobile telephone operating system;
a vocoder which compresses a voice signal using at least linear prediction coding (LPC) thereby to produce compressed speech data; and
a speech recognizer comprising:
a front end processor, operating on said compressed speech data without completely decompressing said compressed speech data, which determines when a word was spoken and generates recognition features of said spoken word; and
a recognition unit which at least recognizes said spoken word as one of a set of reference words.
Dependent claims: 14, 15, 16
17. A speech recognizer operable with compressed speech data which has been compressed using linear prediction coding (LPC) techniques by a vocoder, the speech recognizer comprising:
a front end processor which processes said compressed speech data without completely decompressing said compressed speech data to determine when a word was spoken and to generate recognition features of said spoken word; and
a recognition unit which at least recognizes said spoken word as one of a set of reference words.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
25. A digital cellular telephone comprising:
a mobile telephone operating system;
a plurality of vocoders which compress a voice signal using at least linear prediction coding (LPC) thereby to produce compressed speech data, each vocoder operable with one of a corresponding plurality of vocoder types; and
a speech recognizer comprising:
a corresponding plurality of front end processors, one for each of said vocoder types, each said processor operable on said compressed speech data without completely decompressing said compressed speech data, which determine when a word was spoken and generate recognition features of said spoken word; and
a recognition unit which at least recognizes said spoken word as one of a set of reference words.
Dependent claim: 26
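The feature-calculation and matching steps recited in claims 1 and 9 can be illustrated with a short sketch. Several assumptions are made here that the claims do not state: the recognition feature is taken to be a vector of LPC-derived cepstral coefficients, the LPC parameters are assumed to be direct-form predictor coefficients for the all-pole model 1 / (1 - sum_k a_k z^-k), the per-frame coefficients are assumed to have already been extracted from the compressed data, and the frame averaging and nearest-neighbour matching are simple stand-ins for whatever comparison a real recognition unit would use. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def lpc_to_cepstrum(a: Sequence[float], n_ceps: int) -> List[float]:
    """Convert predictor coefficients a[0..p-1] (a_1..a_p) to cepstra c_1..c_n.

    Standard recursion for H(z) = 1 / (1 - sum_k a_k z^-k):
        c_n = a_n + sum_{k=1}^{n-1} (k/n) c_k a_{n-k}   (a_n = 0 for n > p)
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused padding so indices match the formula
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def word_features(lpc_frames: Sequence[Sequence[float]], n_ceps: int = 8) -> List[float]:
    """Assumed feature: cepstral vector averaged over all frames of the word."""
    ceps = [lpc_to_cepstrum(frame, n_ceps) for frame in lpc_frames]
    return [sum(column) / len(ceps) for column in zip(*ceps)]


def recognize(features: Sequence[float], references: Dict[str, Sequence[float]]) -> str:
    """Return the label of the stored feature vector closest in Euclidean distance."""
    def distance(label: str) -> float:
        ref = references[label]
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(features, ref)))
    return min(references, key=distance)


if __name__ == "__main__":
    # Toy first-order "words"; the stored references were built the same way.
    stored = {
        "yes": word_features([[0.5], [0.5]], n_ceps=2),
        "no": word_features([[-0.5], [-0.5]], n_ceps=2),
    }
    spoken = word_features([[0.45], [0.55]], n_ceps=2)
    print(recognize(spoken, stored))
```

Averaging over frames discards timing information, so it is only a placeholder. Time-aligned comparisons such as dynamic time warping are a common alternative, though nothing in the claims above mandates a particular matching method.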
Specification