Method and apparatus for estimating phone class probabilities a-posteriori using a decision tree

  • US 5,680,509 A
  • Filed: 09/27/1994
  • Issued: 10/21/1997
  • Est. Priority Date: 09/27/1994
  • Status: Expired due to Term
First Claim

1. A method of recognizing speech, comprising:

  • (a) inputting a set of training data comprising a plurality of records, each record of the training data comprising a sequence of 2K+1 feature vectors and a member of the class, each feature vector being represented by a label;

    (b) forming a binary decision tree, the tree comprising a root node and a plurality of child nodes each associated with a binary question, the tree terminating in a plurality of terminal nodes, wherein the step of forming the tree comprises:

    (i) for each index t in the sequence of feature vectors, wherein the index t refers to the tth label in the sequence of 2K+1 labels in a training record, dividing the labels at each of indexes t-K, ..., t, ..., t+K, into pairs of sets, respectively, wherein the labels at each of the indexes are divided so as to minimize entropy of the classes associated with the pairs of sets;

    (ii) selecting from the pairs of sets a lowest entropy pair;

    (iii) generating a binary question and assigning it to the node, wherein the question asks whether a label to be classified, occurring at index T corresponding to the index of the lowest entropy pair, is a member of the first set or the second set;

    (c) partitioning the data at the current node into two child nodes in accordance with this question;

    (d) repeating steps (b)(i)-(b)(iii) for each child node;

    (e) for each child node, computing a probability distribution of the occurrence of the class members, given the members of the set of labels at that node;

    inputting a sequence of speech to be recognized;

    traversing the binary decision tree for every time frame of an input sequence of speech to determine a distribution of most likely phones for each time frame, the most likely phones for each time frame collectively forming a phone sequence;

    outputting a recognition result based upon the distribution of most likely phones.
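
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration, because the claim does not specify them: the greedy search used to divide the labels at each index into two sets, the `min_gain` stopping rule, the repetition of edge labels to pad the first and last frames, and all function names and the toy data.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of a table of class counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)

def split_entropy(records, pos, left_set):
    """Weighted class entropy after asking: is the label at `pos` in `left_set`?"""
    left, right = Counter(), Counter()
    for labels, phone in records:
        (left if labels[pos] in left_set else right)[phone] += 1
    n = len(records)
    return sum(sum(c.values()) / n * entropy(c) for c in (left, right) if c)

def best_partition(records, pos):
    """Step (b)(i): divide the labels at `pos` into a pair of sets.
    Greedy toggling is an assumed search strategy, not taken from the claim."""
    alphabet = sorted({labels[pos] for labels, _ in records})
    left, best = set(), split_entropy(records, pos, set())
    improved = True
    while improved:
        improved = False
        for lab in alphabet:
            trial = left ^ {lab}  # move one label to the other set
            h = split_entropy(records, pos, trial)
            if h < best - 1e-12:
                left, best, improved = trial, h, True
    return best, frozenset(left)

def build(records, min_gain=1e-9):
    """Steps (b)-(e): grow the tree; every node stores a class distribution."""
    counts = Counter(phone for _, phone in records)
    total = sum(counts.values())
    node = {"dist": {p: c / total for p, c in counts.items()}}
    width = len(records[0][0])  # 2K+1 labels per record
    candidates = [(*best_partition(records, p), p) for p in range(width)]
    h, left, pos = min(candidates, key=lambda c: c[0])  # step (b)(ii)
    if entropy(counts) - h <= min_gain:  # assumed stopping rule
        return node  # terminal node
    yes = [r for r in records if r[0][pos] in left]
    no = [r for r in records if r[0][pos] not in left]
    node.update(pos=pos, left=left,  # step (b)(iii): the binary question
                yes=build(yes, min_gain), no=build(no, min_gain))
    return node

def classify(tree, window):
    """Traverse the tree for one window of 2K+1 labels; return the distribution."""
    node = tree
    while "pos" in node:
        node = node["yes"] if window[node["pos"]] in node["left"] else node["no"]
    return node["dist"]

def recognize(tree, labels, K):
    """Most likely phone per frame; edge labels are repeated as padding (assumption)."""
    padded = [labels[0]] * K + list(labels) + [labels[-1]] * K
    phones = []
    for t in range(len(labels)):
        dist = classify(tree, padded[t:t + 2 * K + 1])
        phones.append(max(dist, key=dist.get))
    return phones

# Toy training data with K = 1: only the centre label predicts the phone.
records = [
    ((0, 0, 0), "a"), ((1, 0, 1), "a"), ((0, 1, 1), "a"), ((1, 1, 0), "a"),
    ((0, 2, 0), "b"), ((1, 2, 1), "b"), ((0, 3, 1), "b"), ((1, 3, 0), "b"),
]
tree = build(records)
print(tree["pos"], sorted(tree["left"]))
print(recognize(tree, [0, 1, 2, 3, 0], K=1))
```

On this toy data the root question tests the centre index, since the outer indexes carry no information about the class. A practical system would replace the final arg-max with the full per-frame distributions, which the claim describes as the basis for the recognition result.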
