Speech coding apparatus and method using classification rules

US 5,522,011 A
Filed: 09/27/1993
Issued: 05/28/1996
Est. Priority Date: 09/27/1993
Status: Expired due to Term

Abstract

A speech coding apparatus and method use classification rules to code an utterance while consuming fewer computing resources. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. The classification rules comprise at least first and second sets of classification rules. The first set of classification rules maps each feature vector signal from a set of all possible feature vector signals to exactly one of at least two disjoint subsets of feature vector signals. The second set of classification rules maps each feature vector signal in a subset of feature vector signals to exactly one of at least two different classes of prototype vector signals. Each class contains a plurality of prototype vector signals. According to the classification rules, a first feature vector signal is mapped to a first class of prototype vector signals. The closeness of the feature value of the first feature vector signal is compared to the parameter values of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class. At least the identification value of at least the prototype vector signal having the best prototype match score is output as a coded utterance representation signal of the first feature vector signal.
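
The flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical illustration rather than something taken from the patent: the `Prototype` type, the single-threshold `classify` rule, the sample prototype values, and the choice of squared Euclidean distance as the closeness measure are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prototype:
    ident: str     # identification value output by the coder
    params: tuple  # parameter values compared against a feature vector


# Hypothetical classes of prototypes (real systems would hold far more).
CLASSES = {
    "low": [Prototype("P1", (0.1, 0.2)), Prototype("P2", (0.3, 0.1))],
    "high": [Prototype("P3", (0.8, 0.9)), Prototype("P4", (0.9, 0.6))],
}


def classify(feature):
    """Assumed rule: map every feature vector to exactly one class."""
    return "low" if feature[0] < 0.5 else "high"


def sq_dist(a, b):
    """Squared Euclidean distance, used here as the match score."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def code_vector(feature):
    """Compare the feature vector only with prototypes in its class."""
    candidates = CLASSES[classify(feature)]
    best = min(candidates, key=lambda p: sq_dist(feature, p.params))
    return best.ident


def code_utterance(features):
    """Code a series of feature vectors, one per time interval."""
    return [code_vector(f) for f in features]
```

In this sketch each feature vector is scored against two prototypes instead of all four, which is the source of the reduced computation the abstract refers to.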

25 Claims

1. A speech coding apparatus comprising:
- means for measuring the value of at least one feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  means for storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having an identification value, at least two prototype vector signals having different identification values;
  
  classification rules means for storing classification rules mapping each feature vector signal from a set of all possible feature vector signals to exactly one of at least two different classes of prototype vector signals, each class containing a plurality of prototype vector signals and each class of prototype vector signals is at least partially different from other classes of prototype vector signals, wherein each class of prototype vector signals contains less than 1/N times the total number of prototype vector signals in all classes, where 5 ≤ N ≤ 150;
  
  classifier means for mapping, by the classification rules, a first feature vector signal to a first class of prototype vector signals;
  
  means for comparing the closeness of the feature value of the first feature vector signal to the parameter values of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class; and
  
  means for outputting at least the identification value of at least the prototype vector signal having the best prototype match score as a coded utterance representation signal of the first feature vector signal.
- - 2. A speech coding apparatus as claimed in claim 1, characterized in that the average number of prototype vector signals in a class of prototype vector signals is approximately equal to 1/10 times the total number of prototype vector signals in all classes.
  - 3. A speech coding apparatus as claimed in claim 1, characterized in that:
    - the classification rules comprise at least first and second sets of classification rules;
      
      the first set of classification rules map each feature vector signal from a set of all possible feature vector signals to exactly one of at least two disjoint subsets of feature vector signals; and
      
      the second set of classification rules map each feature vector signal in a subset of feature vector signals to exactly one of at least two different classes of prototype vector signals, wherein the classification rules are determined by an entropy of the prototype vector signals.
  - 4. A speech coding apparatus as claimed in claim 3, characterized in that the classifier means maps, by the first set of classification rules, the first feature vector signal to a first subset of feature vector signals.
  - 5. A speech coding apparatus as claimed in claim 4, characterized in that the classifier means maps, by the second set of classification rules, the first feature vector signal from the first subset of feature vector signals to the first class of prototype vector signals.
  - 6. A speech coding apparatus as claimed in claim 4, characterized in that:
    - the second set of classification rules comprises at least third and fourth sets of classification rules;
      
      the third set of classification rules map each feature vector signal from a subset of feature vector signals to exactly one of at least two disjoint sub-subsets of feature vector signals; and
      
      the fourth set of classification rules map each feature vector signal in a sub-subset of feature vector signals to exactly one of at least two different classes of prototype vector signals.
  - 7. A speech coding apparatus as claimed in claim 6, characterized in that the classifier means maps, by the third set of classification rules, the first feature vector signal from the first subset of feature vector signals to a first sub-subset of feature vector signals.
  - 8. A speech coding apparatus as claimed in claim 7, characterized in that the classifier means maps, by the fourth set of classification rules, the first feature vector signal from the first sub-subset of feature vector signals to the first class of prototype vector signals.
  - 9. A speech coding apparatus as claimed in claim 8, characterized in that the classification rules comprise:
    - at least one scalar function mapping the feature values of a feature vector signal to a scalar value; and
      
      at least one rule mapping feature vector signals whose scalar function is less than a threshold to the first subset of feature vector signals, and mapping feature vector signals whose scalar function is greater than the threshold to a second subset of feature vector signals different from the first subset.
  - 10. A speech coding apparatus as claimed in claim 9, characterized in that:
    - the measuring means measures the values of at least two features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values; and
      
      the scalar function of a feature vector signal comprises the value of only a single feature of the feature vector signal.
  - 11. A speech coding apparatus as claimed in claim 10, characterized in that the measuring means comprises a microphone.
  - 12. A speech coding apparatus as claimed in claim 11, characterized in that the measuring means comprises a spectrum analyzer for measuring the amplitudes of the utterance in two or more frequency bands during each of a series of successive time intervals.
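
The nested rule sets of claims 3 through 10 (subsets, then sub-subsets, then classes, each split made by comparing a single feature value to a threshold) resemble a binary decision tree. The following is a minimal sketch under that reading. The `Node` and `Leaf` types, the thresholds, and the class names are hypothetical, and sending a value exactly equal to the threshold to the "above" branch is an assumption, since the claims only address values less than or greater than the threshold.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    class_name: str  # name of a class of prototype vectors


@dataclass
class Node:
    feature_index: int  # scalar function = value of one feature (claim 10)
    threshold: float
    below: Union["Node", Leaf]  # taken when the value is less than threshold
    above: Union["Node", Leaf]  # taken otherwise (ties are an assumption)


def classify(tree, feature):
    """Walk the rule tree until a class of prototypes is reached."""
    node = tree
    while isinstance(node, Node):
        value = feature[node.feature_index]
        node = node.below if value < node.threshold else node.above
    return node.class_name


# Hypothetical rules: the root splits all feature vectors into two subsets;
# one subset is split again into sub-subsets before a class is reached.
TREE = Node(
    0, 0.5,
    below=Node(1, 0.3, Leaf("A"), Leaf("B")),
    above=Node(1, 0.7, Node(0, 0.8, Leaf("C"), Leaf("D")), Leaf("E")),
)
```

Because every test reads one feature and makes one comparison, classifying a vector costs only as many comparisons as the tree is deep, which is small next to scoring every prototype.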

13. A speech coding apparatus comprising:
- means for measuring the value of at least one feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing feature values;
  
  means for storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having an identification value, at least two prototype vector signals having different identification values;
  
  classification rules means for storing classification rules mapping each feature vector signal from a set of all possible feature vector signals to exactly one of at least two different classes of prototype vector signals, each class containing a plurality of prototype vector signals;
  
  classifier means for mapping, by the classification rules, a first feature vector signal to a first class of prototype vector signals;
  
  means for comparing the closeness of the feature value of the first feature vector signal to the parameter values of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class, wherein the closeness of the feature vector signal to the prototype vector signal is one of a Euclidean distance and a Gaussian distance; and
  
  means for outputting at least the identification value of at least the prototype vector signal having the best prototype match score as a coded utterance representation signal of the first feature vector signal.
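
Claim 13 names two closeness measures. The sketch below shows one plausible form of each in Python. The claim does not define "Gaussian distance"; reading it as the negative log-likelihood of the feature vector under a diagonal-covariance Gaussian centered on the prototype is an assumption, as are the function names and the `(mean, variance)` prototype layout.

```python
import math


def euclidean_distance(feature, mean):
    """Straight-line distance between a feature vector and a prototype mean."""
    return math.sqrt(sum((x - m) ** 2 for x, m in zip(feature, mean)))


def gaussian_distance(feature, mean, variance):
    """Assumed form: negative log-likelihood under a diagonal Gaussian."""
    return 0.5 * sum(
        math.log(2 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(feature, mean, variance)
    )


def best_prototype(feature, prototypes):
    """Return the id of the prototype with the lowest Gaussian distance.

    `prototypes` maps an identification value to a (mean, variance) pair
    and should contain only the prototypes of the selected class.
    """
    return min(
        prototypes,
        key=lambda ident: gaussian_distance(feature, *prototypes[ident]),
    )
```

With either measure a smaller value means a closer match, so the "best prototype match score" corresponds to the minimum.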

14. A speech coding method comprising the steps of:
- measuring the value of at least one feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having an identification value, at least two prototype vector signals having different identification values;
  
  storing classification rules mapping each feature vector signal from a set of all possible feature vector signals to exactly one of at least two different classes of prototype vector signals, each class containing a plurality of prototype vector signals and each class of prototype vector signals is at least partially different from other classes of prototype vector signals, wherein each class of prototype vector signals contains less than 1/N times the total number of prototype vector signals in all classes, where 5 ≤ N ≤ 150;
  
  mapping, by the classification rules, a first feature vector signal to a first class of prototype vector signals;
  
  comparing the closeness of the feature value of the first feature vector signal to the parameter values of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class; and
  
  outputting at least the identification value of at least the prototype vector signal having the best prototype match score as a coded utterance representation signal of the first feature vector signal.
- - 15. A speech coding method as claimed in claim 14, characterized in that the average number of prototype vector signals in a class of prototype vector signals is approximately equal to 1/10 times the total number of prototype vector signals in all classes.
  - 16. A speech coding method as claimed in claim 14, characterized in that:
    - the classification rules comprise at least first and second sets of classification rules;
      
      the first set of classification rules map each feature vector signal from a set of all possible feature vector signals to exactly one of at least two disjoint subsets of feature vector signals; and
      
      the second set of classification rules map each feature vector signal in a subset of feature vector signals to exactly one of at least two different classes of prototype vector signals, wherein the classification rules are determined by an entropy of the prototype vector signals.
  - 17. A speech coding method as claimed in claim 16, characterized in that the step of mapping comprises mapping, by the first set of classification rules, the first feature vector signal to a first subset of feature vector signals.
  - 18. A speech coding method as claimed in claim 17, characterized in that the step of mapping comprises mapping, by the second set of classification rules, the first feature vector signal from the first subset of feature vector signals to the first class of prototype vector signals.
  - 19. A speech coding method as claimed in claim 17, characterized in that:
    - the second set of classification rules comprises at least third and fourth sets of classification rules;
      
      the third set of classification rules map each feature vector signal from a subset of feature vector signals to exactly one of at least two disjoint sub-subsets of feature vector signals; and
      
      the fourth set of classification rules map each feature vector signal in a sub-subset of feature vector signals to exactly one of at least two different classes of prototype vector signals.
  - 20. A speech coding method as claimed in claim 19, characterized in that the step of mapping comprises mapping, by the third set of classification rules, the first feature vector signal from the first subset of feature vector signals to a first sub-subset of feature vector signals.
  - 21. A speech coding method as claimed in claim 20, characterized in that the classifier means maps, by the fourth set of classification rules, the first feature vector signal from the first sub-subset of feature vector signals to the first class of prototype vector signals.
  - 22. A speech coding method as claimed in claim 21, characterized in that the classification rules comprise:
    - at least one scalar function mapping the feature values of a feature vector signal to a scalar value; and
      
      at least one rule mapping feature vector signals whose scalar function is less than a threshold to the first subset of feature vector signals, and mapping feature vector signals whose scalar function is greater than the threshold to a second subset of feature vector signals different from the first subset.
  - 23. A speech coding method as claimed in claim 22, characterized in that:
    - the step of measuring comprises measuring the values of at least two features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values; and
      
      the scalar function of a feature vector signal comprises the value of only a single feature of the feature vector signal.
  - 24. A speech coding method as claimed in claim 23, characterized in that the step of measuring comprises measuring the amplitudes of the utterance in two or more frequency bands during each of a series of successive time intervals.
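
The class-size limits in claims 14 and 15 (and their apparatus counterparts, claims 1 and 2) reduce to simple arithmetic. The sketch below checks them in Python. The function names and the sample sizes are hypothetical, and the total is passed in separately on the assumption that classes may share prototypes, so class sizes need not sum to the total.

```python
def satisfies_size_limit(class_sizes, total, n):
    """True if every class holds fewer than total/n prototypes.

    The independent claims require 5 <= n <= 150.
    """
    if not 5 <= n <= 150:
        raise ValueError("n must be between 5 and 150")
    # size < total / n, written without division to avoid rounding issues
    return all(size * n < total for size in class_sizes)


def average_fraction(class_sizes, total):
    """Average class size as a fraction of all prototypes (claims 2 and 15)."""
    return sum(class_sizes) / len(class_sizes) / total


# Hypothetical example: 1000 prototypes, 12 overlapping classes of 100 each.
SIZES = [100] * 12
TOTAL = 1000
```

In this example every class satisfies the limit for N = 5 (100 is less than 200) but not for N = 10 (100 is not less than 100), and the average class holds 1/10 of the prototypes, so each feature vector would be scored against roughly a tenth of the prototype set.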

25. A speech coding method comprising the steps of:
- measuring the value of at least one feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter vector and having an identification value, at least two prototype vector signals having different identification values;
  
  storing classification rules mapping each feature vector from a set of all possible feature vectors to exactly one of at least two different classes of prototype vector signals, each class containing a plurality of prototype vector signals;
  
  mapping, by the classification rules, a first feature vector signal to a first class of prototype vector signals;
  
  comparing the closeness of the first feature vector signal to the parameter vectors of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class, wherein the comparing step includes comparing the closeness of the feature vector signal to the prototype vector signal using one of a Euclidean distance and a Gaussian distance; and
  
  outputting at least the identification value of at least the prototype vector signal having the best prototype match score as a coded utterance representation signal of the first feature vector signal.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Sedivy, Jan; Nahamoo, David; Picheny, Michael A.; Epstein, Mark E.; Gopalakrishnan, Ponani S.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Onka, Thomas
Application Number: US08/127,392
Time in Patent Office: 974 Days
Field of Search: 395/2.54, 395/2.31, 381/36
US Class Current: 704/222
CPC Class Codes: G10L 19/038 Vector quantisation, e.g. T...
