Method for direct recognition of encoded speech data

US 6,223,157 B1
Filed: 05/07/1998
Issued: 04/24/2001
Est. Priority Date: 05/07/1998
Status: Expired due to Term

Abstract

Digital Cellular telephony requires voice compression designed to minimize the bandwidth required for the digital cellular channel. The features used in speech recognition have similar components to those used in the vocoding process. The present invention provides a system that bypasses the de-compression or decoding phase of the vocoding and converts the digital cellular parameters directly into features that can be processed by a recognition engine. More specifically, the present invention provides a system and method for mapping a vocoded representation of parameters defining speech components, which in turn define a particular waveform, into a base feature type representation of parameters defining speech components (e.g. LPC parameters), which in turn define the same digital waveform.

21 Claims

1. A computer based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a data representation of a set of one or more speech component vectors, the computer-based method comprising the steps of:
- a. transforming the data representation of the set of one or more speech component vectors into a corresponding data representation according to a base feature type, the transforming performed using a mapping algorithm, the mapping algorithm developed by comparing the data representation of the set of one or more speech component vectors defining a waveform with the corresponding data representation according to the base feature type defining the same waveform;
  
  b. obtaining one or more features from the corresponding data presentation according to the base feature type; and
  
  c. generating a recognition result in accordance with the one or more features obtained.

2. The computer-based method according to claim 1 wherein the base feature type is a Linear Predictive Coding (“LPC”) vector representation.

3. The computer-based method according to claim 1 wherein the compression algorithm is a Vector-Sum Excited Linear Prediction (“VSELP”) coding algorithm.

4. The computer-based method according to claim 1 wherein the compression algorithm is a QSELP Quadrature-Sum Excited Linear Prediction (“QSELP”) coding algorithm.

5. The computer-based method according to claim 1 wherein the compression method is a Global System for Mobile Communications (“GSM”) coding algorithm.

6. The computer-based method according to claim 1 wherein the compression algorithm is a Global System for Mobile Communications—Enhanced Full Rate (“GSM-EFR”) coding algorithm.

7. The computer-based method of claim 1 wherein the compression algorithm is a G.728 coding algorithm.
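
Step (a) of claim 1 relies on a mapping that is developed by comparing paired representations of the same waveform. As a minimal sketch only, one way to picture this is to fit a per-dimension affine map by least squares from paired example vectors. The function names, the affine form, and the toy data below are assumptions made for illustration; the claim does not say what form the mapping takes or how it is derived.

```python
def fit_affine_mapping(coded_vectors, base_vectors):
    """Fit base[d] ~= scale[d] * coded[d] + offset[d] for each dimension d.

    Both arguments are equal-length lists of equal-length float lists that
    describe the same waveforms in two representations (hypothetical data).
    """
    count = len(coded_vectors)
    dims = len(coded_vectors[0])
    mapping = []
    for d in range(dims):
        xs = [vec[d] for vec in coded_vectors]
        ys = [vec[d] for vec in base_vectors]
        mean_x = sum(xs) / count
        mean_y = sum(ys) / count
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Ordinary least-squares slope; fall back to 0 for a constant input.
        scale = cov_xy / var_x if var_x else 0.0
        mapping.append((scale, mean_y - scale * mean_x))
    return mapping


def apply_mapping(mapping, coded_vector):
    """Transform one coded vector into the base feature representation."""
    return [scale * x + offset for (scale, offset), x in zip(mapping, coded_vector)]


# Toy paired data: dimension 0 follows y = 2x + 1, dimension 1 follows y = -x.
coded = [[0.0, 1.0], [1.0, 2.0], [2.0, 4.0]]
base = [[1.0, -1.0], [3.0, -2.0], [5.0, -4.0]]

mapping = fit_affine_mapping(coded, base)
mapped = apply_mapping(mapping, [3.0, 3.0])
print(mapped)
```

The point of the sketch is the workflow: the mapping is learned once from paired data, and at recognition time only `apply_mapping` runs on the coded vectors.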

8. A computer-based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a set of one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
- a. generating a sequential data representation of each of the set of one or more speech component vectors from the packed vector and codebook for each component vector;
  
  b. transforming the sequential data representation of each of the set of one or more speech component vectors into a corresponding data representation according to a base feature type;
  
  c. obtaining one or more features from the corresponding data representation according to the base feature type of each of the set of one or more speech component vectors; and
  
  d. generating a recognition result in accordance with the one or more features obtained.

9. The computer-based method according to claim 8 wherein the base feature type is a linear predictive coding vector representation.
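
Step (a) of claim 8 turns a packed vector of indices plus per-component codebooks into sequential component vectors. The sketch below shows one plausible reading: split a packed frame into bit fields, then look each index up in its codebook. The bit widths, the big-endian field order, and the codebook contents are hypothetical; real coders define their own frame layouts.

```python
def unpack_indices(packed, bit_widths):
    """Read consecutive bit fields, most significant first, from packed bytes."""
    total_bits = len(packed) * 8
    if sum(bit_widths) > total_bits:
        raise ValueError("bit widths exceed the size of the packed frame")
    value = int.from_bytes(packed, "big")
    indices = []
    position = total_bits
    for width in bit_widths:
        position -= width
        indices.append((value >> position) & ((1 << width) - 1))
    return indices


def expand_frame(packed, bit_widths, codebooks):
    """Replace each unpacked index with its entry from the matching codebook."""
    indices = unpack_indices(packed, bit_widths)
    return [codebook[index] for codebook, index in zip(codebooks, indices)]


# Hypothetical layout: one 2-bit index followed by two 3-bit indices.
bit_widths = [2, 3, 3]
codebooks = [
    [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]],
    [[float(i)] for i in range(8)],
    [[-float(i)] for i in range(8)],
]

frame = bytes([0b10011001])  # fields: 10 | 011 | 001
indices = unpack_indices(frame, bit_widths)
vectors = expand_frame(frame, bit_widths, codebooks)
print(indices, vectors)
```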

10. A computer-based method for direct recognition of coded speech data, the coded-speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
- a. generating a sequential data representation of each of the set of one or more speech component vectors from the packed vector and codebook for each component vector;
  
  b. transforming the sequential data representation of each of the set of one or more speech component vectors into a corresponding data representation according to base feature type, the transforming performed using a mapping algorithm, the mapping algorithm developed by comparing the sequential data representation of each of the set of one or more speech component vectors defining a waveform with the corresponding data representation according to the base feature type defining the same waveform;
  
  c. obtaining one or more features from the corresponding data representation according to the base feature type of each of the set of one or more speech component vectors; and
  
  d. generating a recognition result in accordance with the one or more features obtained.

11. The computer-based method according to claim 10 wherein the base feature type is a Linear Predictive Coding (“LPC”) vector representation.

12. A computer-based method for direct recognition of coded speech data, the coded speech data generated by a compression algorithm, the compression algorithm generating from a digitized representation of a speech signal a set of one or more speech component vectors, the set of speech component vectors organized in a packed vector of indices and codebook for each speech component vector, the computer-based method comprising the steps of:
- a. generating a data representation of each of the set of one or more speech component vectors according to a base feature type from the packed vector and codebook for each component vector;
  
  b. obtaining one or more features from the data representation of each of the set of one or more speech component vectors according to the base feature type; and
  
  c. generating a recognition result in accordance with the one or more features obtained.

13. The computer-based method according to claim 12 wherein the base feature type is a Linear Predictive Coding (“LPC”) vector representation.

14. A method for recognition of speech data, comprising the steps of:
- receiving a vocoded speech data signal;
  
  constructing at least one linear predictive code vector directly from the vocoded digital speech data signal;
  
  determining at least one speech feature as a function of the at least one linear predictive code vector;
  
  providing the at least one speech feature to a recognition engine; and
  
  recognizing speech data by the recognition engine as a function of the at least one speech feature.
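
Claim 14 determines a speech feature as a function of a linear predictive code vector without naming the feature. As an illustrative assumption, the sketch below uses LPC-derived cepstral coefficients, a commonly used recognition feature, computed with the standard recursion for an all-pole model 1 / (1 - sum_k a_k z^-k). The function name is hypothetical, and the gain term c_0 is omitted.

```python
def lpc_to_cepstrum(lpc, num_ceps):
    """Convert predictor coefficients a_1..a_p into cepstral coefficients c_1..c_n.

    Assumes the predictor convention A(z) = 1 - sum_k a_k z^-k.
    """
    order = len(lpc)
    ceps = []
    for n in range(1, num_ceps + 1):
        # The direct a_n term only exists while n is within the model order.
        acc = lpc[n - 1] if n <= order else 0.0
        for k in range(max(1, n - order), n):
            acc += (k / n) * ceps[k - 1] * lpc[n - k - 1]
        ceps.append(acc)
    return ceps


# Single-pole check: for a_1 = 0.5 the cepstrum is c_n = 0.5**n / n.
cepstrum = lpc_to_cepstrum([0.5], 3)
print(cepstrum)
```

For a second-order predictor the same recursion gives c_2 = a_2 + a_1^2 / 2, which matches the series expansion of the log spectrum.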

15. A method for recognition of speech data, comprising the steps of:
- receiving a vocoded speech data signal, the vocoded speech data signal generated by compressing and coding a speech waveform;
  
  transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
  
  determining at least one speech feature as a function of the at least one linear predictive code vector;
  
  providing the at least one speech feature to a recognition engine; and
  
  recognizing speech data by the recognition engine as a function of the at least one speech feature.
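
Claim 15 transforms the vocoded signal to a linear predictive code vector without reconstructing the waveform. As a sketch under a stated assumption, suppose the decoded parameters for a frame are reflection coefficients; the step-up recursion then yields predictor coefficients with no synthesis step. The claim does not say which parameters the vocoder carries, the function name is hypothetical, and sign conventions for reflection coefficients differ between texts and coders.

```python
def reflection_to_lpc(reflection):
    """Step-up recursion from reflection coefficients k_1..k_p to a_1..a_p.

    Uses the convention a_i(i) = k_i and
    a_j(i) = a_j(i-1) - k_i * a_(i-j)(i-1), with A(z) = 1 - sum_k a_k z^-k.
    """
    lpc = []
    for k in reflection:
        order = len(lpc)
        # Update the existing coefficients, then append the new highest-order one.
        lpc = [lpc[j] - k * lpc[order - 1 - j] for j in range(order)] + [k]
    return lpc


predictor = reflection_to_lpc([0.5, 0.2])
print(predictor)
```

Everything here operates on a handful of parameters per frame, which is the practical appeal of skipping waveform reconstruction.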

16. A method for providing a subscriber with voice determined telephone dialing in a cellular telephone network, the cellular telephone network including a database of telephone records specific to the subscriber, the method comprising the steps of:
- receiving a vocoded speech data signal, the vocoded speech data signal generated by compressing and coding a speech waveform;
  
  transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
  
  determining at least one speech feature as a function of the at least one linear predictive code vector;
  
  providing the at least one speech feature to a recognition engine;
  
  recognizing speech data by the recognition engine as a function of the at least one speech feature;
  
  searching the telephone records specific to the subscriber for a record with data matching the speech data; and
  
  connecting a telephone call in accordance with a telephone number found in a record with data matching the speech data.

17. The method of claim 16 wherein the cellular telephone network is a Global System for Mobile Communications network.
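
The last two steps of claim 16 search the subscriber's records for a match and connect a call to the number found. A minimal sketch of the lookup, assuming the recognition engine returns a text label and each record is a simple name/number pair; the record layout, the case-insensitive matching rule, and the sample entries are all hypothetical.

```python
def find_number(records, recognized_name):
    """Return the number of the first record whose name matches, else None."""
    wanted = recognized_name.strip().lower()
    for record in records:
        if record["name"].strip().lower() == wanted:
            return record["number"]
    return None


# Hypothetical subscriber-specific records.
subscriber_records = [
    {"name": "Alice Example", "number": "555-0100"},
    {"name": "Bob Example", "number": "555-0101"},
]

number_to_dial = find_number(subscriber_records, " alice example ")
print(number_to_dial)
```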

18. A method for executing at an Internet site voice commands of a user at a remote Internet workstation, the method comprising the steps of:
- receiving a speech data signal at the Internet workstation;
  
  digitizing the speech data signal;
  
  vocoding the digitized speech data signal;
  
  transmitting the vocoded speech data signal to the Internet site;
  
  receiving the vocoded speech data signal at the Internet site;
  
  transforming the vocoded speech data signal to at least one linear predictive code vector without reconstructing the speech waveform;
  
  determining at least one speech feature as a function of the at least one linear predictive code vector;
  
  providing the at least one speech feature to a recognition engine;
  
  determining by the recognition engine speech data as a function of the at least one speech feature; and
  
  executing a command at the Internet site in accordance with the speech data.

19. The method of claim 18 wherein the Internet site is an Internet host.

20. The method of claim 18 wherein the Internet site is an interior gateway.

21. The method of claim 18 wherein the Internet site is an exterior gateway.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Dsc Telecom L.P.
Inventors: Spiess, Jeffery J.; Fisher, Thomas D.; Mowry, Dearborn R.
Primary Examiner(s): Dorvil, Richemond
Application Number: US09/074,726
Time in Patent Office: 1,083 Days
Field of Search: 704/200, 704/231, 704/270, 704/275, 704/219, 704/223, 704/221, 704/250, 704/255, 704/246, 704/251
US Class Current: 704/250
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 15/30 Distributed recognition, e....
