Method for direct recognition of encoded speech data
Abstract
Digital cellular telephony requires voice compression designed to minimize the bandwidth required for the digital cellular channel. The features used in speech recognition have components similar to those used in the vocoding process. The present invention provides a system that bypasses the decompression (decoding) phase of vocoding and converts the digital cellular parameters directly into features that can be processed by a recognition engine. More specifically, the present invention provides a system and method for mapping a vocoded representation of parameters defining speech components, which in turn define a particular waveform, into a base feature type representation of parameters defining speech components (e.g., LPC parameters), which in turn define the same digital waveform.
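As an illustration of the kind of mapping the abstract describes, the following minimal Python sketch converts one frame of reflection coefficients into LPC predictor coefficients with the standard step-up recursion, without synthesizing any waveform. It assumes, purely for illustration, a vocoder whose decoded parameters are reflection coefficients and the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The abstract specifies neither, sign conventions vary between texts, and the function name is hypothetical.

```python
def reflection_to_lpc(k):
    """Step-up recursion: reflection coefficients k_1..k_p -> LPC a_1..a_p.

    Assumed convention (not taken from the source): A(z) = 1 + sum_j a_j z^-j,
    with a_i = k_i at each model order i.
    """
    a = []
    for k_i in k:
        # Order update: a_j <- a_j + k_i * a_(i-j) for j = 1..i-1, then append a_i = k_i.
        a = [a_j + k_i * a_rev for a_j, a_rev in zip(a, reversed(a))] + [k_i]
    return a


if __name__ == "__main__":
    # One hypothetical frame of three reflection coefficients.
    print(reflection_to_lpc([0.5, 0.2, 0.1]))
```

The resulting LPC vector can then be handed to a feature extraction stage, which is the point of avoiding the decode-to-waveform step.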
21 Claims
1. A computer based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a data representation of a set of one or more speech component vectors, the computer-based method comprising the steps of:
a. transforming the data representation of the set of one or more speech component vectors into a corresponding data representation according to a base feature type, the transforming performed using a mapping algorithm, the mapping algorithm developed by comparing the data representation of the set of one or more speech component vectors defining a waveform with the corresponding data representation according to the base feature type defining the same waveform;
b. obtaining one or more features from the corresponding data representation according to the base feature type; and
c. generating a recognition result in accordance with the one or more features obtained.
8. A computer-based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a set of one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
a. generating a sequential data representation of each of the set of one or more speech component vectors from the packed vector and codebook for each component vector;
b. transforming the sequential data representation of each of the set of one or more speech component vectors into a corresponding data representation according to a base feature type;
c. obtaining one or more features from the corresponding data representation according to the base feature type of each of the set of one or more speech component vectors; and
d. generating a recognition result in accordance with the one or more features obtained.
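Step (a) of claim 8 can be pictured with a short Python sketch that splits a packed frame into codebook indices and looks each one up. The bit layout (first field in the most significant bits), the field widths, the codebook contents, and the function names are all hypothetical; the claim does not define a frame format.

```python
def unpack_indices(packed, widths):
    """Split an integer bit field into indices, first field in the most significant bits."""
    indices = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        indices.append((packed >> shift) & ((1 << width) - 1))
    return indices


def to_sequential(packed, widths, codebooks):
    """Look up each index in the codebook of its speech component vector."""
    return [codebooks[i][idx] for i, idx in enumerate(unpack_indices(packed, widths))]


if __name__ == "__main__":
    # Hypothetical frame: a 2-bit index followed by a 3-bit index.
    codebooks = [
        [[0.1], [0.2], [0.3], [0.4]],            # 4 entries for component 0
        [[float(n), -float(n)] for n in range(8)],  # 8 entries for component 1
    ]
    print(to_sequential(0b10011, [2, 3], codebooks))
```

The output is one decoded vector per speech component, in order, which is what the later transforming step would consume.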
10. A computer-based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a set of one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
a. generating a sequential data representation of each of the set of one or more speech component vectors from the packed vector and codebook for each component vector;
b. transforming the sequential data representation of each of the set of one or more speech component vectors into a corresponding data representation according to a base feature type, the transforming performed using a mapping algorithm, the mapping algorithm developed by comparing the sequential data representation of each of the set of one or more speech component vectors defining a waveform with the corresponding data representation according to the base feature type defining the same waveform;
c. obtaining one or more features from the corresponding data representation according to the base feature type of each of the set of one or more speech component vectors; and
d. generating a recognition result in accordance with the one or more features obtained.
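One way to read "developed by comparing" in step (b) is as fitting a mapping on paired examples, where each pair holds the vocoder-side value and the base-feature-type value computed for the same waveform. The Python sketch below fits a scalar affine map by ordinary least squares. This is an assumed technique chosen for illustration, not the method the claim prescribes, and the training pairs and function names are made up.

```python
def fit_affine_map(vocoded, base):
    """Least-squares fit of base ~ gain * vocoded + offset from paired samples."""
    n = len(vocoded)
    mean_x = sum(vocoded) / n
    mean_y = sum(base) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(vocoded, base))
    var = sum((x - mean_x) ** 2 for x in vocoded)
    gain = cov / var
    offset = mean_y - gain * mean_x
    return gain, offset


def apply_map(gain, offset, value):
    """Transform one vocoder-side value into the base feature type."""
    return gain * value + offset


if __name__ == "__main__":
    # Hypothetical paired parameters measured on the same waveforms.
    gain, offset = fit_affine_map([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(gain, offset, apply_map(gain, offset, 4.0))
```

A practical mapping would more likely be vector-valued and possibly nonlinear; the scalar case only shows the fit-then-apply structure.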
12. A computer-based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a set of one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
a. generating a data representation of each of the set of one or more speech component vectors according to a base feature type from the packed vector and codebook for each component vector;
b. obtaining one or more features from the data representation of each of the set of one or more speech component vectors according to the base feature type; and
c. generating a recognition result in accordance with the one or more features obtained.
14. A method for recognition of speech data, comprising the steps of:
receiving a vocoded speech data signal;
constructing at least one linear predictive code vector directly from the vocoded digital speech data signal;
determining at least one speech feature as a function of the at least one linear predictive code vector;
providing the at least one speech feature to a recognition engine; and
recognizing speech data by the recognition engine as a function of the at least one speech feature.
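The "determining at least one speech feature as a function of the at least one linear predictive code vector" step could, for example, compute LPC cepstral coefficients, a feature type commonly used by recognizers. The Python sketch below uses the standard LPC-to-cepstrum recursion for the all-pole model 1/A(z), assuming the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The claim does not name a feature type, so both the choice of cepstra and the function name are assumptions.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of 1/A(z), A(z) = 1 + sum_k a_k z^-k.

    Recursion: c_n = -a_n - (1/n) * sum_{k} k * c_k * a_(n-k),
    where a_n is taken as 0 for n > p and k runs over max(1, n-p)..n-1.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        c_n = -acc / n
        if n <= p:
            c_n -= a[n - 1]
        c.append(c_n)
    return c


if __name__ == "__main__":
    # Single-pole check: 1 / (1 - 0.5 z^-1) has cepstrum 0.5**n / n.
    print(lpc_to_cepstrum([-0.5], 3))
```

For the single-pole case the closed form 0.5**n / n gives a quick sanity check on the recursion.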
15. A method for recognition of speech data, comprising the steps of:
receiving a vocoded speech data signal, the vocoded speech data signal generated by compressing and coding a speech waveform;
transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
determining at least one speech feature as a function of the at least one linear predictive code vector;
providing the at least one speech feature to a recognition engine; and
recognizing speech data by the recognition engine as a function of the at least one speech feature.
16. A method for providing a subscriber with voice determined telephone dialing in a cellular telephone network, the cellular telephone network including a database of telephone records specific to the subscriber, the method comprising the steps of:
receiving a vocoded speech data signal, the vocoded speech data signal generated by compressing and coding a speech waveform;
transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
determining at least one speech feature as a function of the at least one linear predictive code vector;
providing the at least one speech feature to a recognition engine;
recognizing speech data by the recognition engine as a function of the at least one speech feature;
searching the telephone records specific to the subscriber for a record with data matching the speech data; and
connecting a telephone call in accordance with a telephone number found in a record with data matching the speech data.
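The last two steps of claim 16, searching the subscriber's records and selecting a number to dial, can be sketched in Python as a lookup keyed on the recognized text. The record layout (dictionaries with "name" and "number" keys), the case- and whitespace-insensitive matching rule, and the function names are hypothetical; the claim only requires a record with data matching the speech data.

```python
def normalize(text):
    """Lowercase and collapse whitespace so spoken and stored names compare equal."""
    return " ".join(text.lower().split())


def find_number(records, recognized):
    """Return the number of the first record whose name matches, or None."""
    wanted = normalize(recognized)
    for record in records:
        if normalize(record["name"]) == wanted:
            return record["number"]
    return None


if __name__ == "__main__":
    # Hypothetical subscriber-specific records.
    records = [
        {"name": "Alice Smith", "number": "555-0100"},
        {"name": "Bob Jones", "number": "555-0101"},
    ]
    print(find_number(records, "bob  jones"))
```

Call setup itself depends on the network and is left out; a None result would correspond to no matching record being found.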
18. A method for executing at an Internet site voice commands of a user at a remote Internet workstation, the method comprising the steps of:
receiving a speech data signal at the Internet workstation;
digitizing the speech data signal;
vocoding the digitized speech data signal;
transmitting the vocoded speech data signal to the Internet site;
receiving the vocoded speech data signal at the Internet site;
transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
determining at least one speech feature as a function of the at least one linear predictive code vector;
providing the at least one speech feature to a recognition engine;
determining by the recognition engine speech data as a function of the at least one speech feature; and
executing a command at the Internet site in accordance with the speech data.
Specification