Machine learning method
Abstract
A method of machine learning utilizing binary classification trees combined with Bayesian classifiers. A training component creates nodes, uses the expectation maximization algorithm to create statistical kernels and mixtures for leaf nodes, uses kappa statistics to identify nodes as leaf nodes or branch nodes, creates hyperplanes for branch nodes, and splits the training data for branch nodes into subsets to be provided to subtrees. An operating component traverses a binary classifier tree with a feature vector to be classified, determining for each branch node whether the feature vector lies to the left or the right of the branch node hyperplane, and finally classifies the feature vector upon arriving at a leaf node by computing log likelihoods for the feature vector for each mixture in the leaf node and determining the classification of the feature vector according to the highest log likelihood. The preferred embodiment described is classification of internal combustion engine cylinder firings as nominal firings or misfires.
47 Citations
8 Claims
1. A method utilizing machine-readable data storage of a set of machine-executable instructions for using a data processing system to perform a method of machine learning in which the learning is an assignment of a feature vector to a classification carried out by recursively creating and using first, a binary classifier tree having nodes, which nodes further comprise branch nodes and leaf nodes, and second, a Bayesian classifier, and the method comprising the steps of:
a. using a binary tree classifier to create a node, wherein the node comprises training data having a dataset size, the training data comprising multiple sets of feature vectors and corresponding classifications;
b. hypothesizing the node just constructed is a leaf node;
c. constructing a Bayesian leaf node classifier for the node;
d. testing the hypothesis by applying the leaf node classifier against the training data and if the hypothesis is correct then the node is a leaf node, if the hypothesis fails, then the node is a branch node;
e. creating, for each node determined to be a branch node, a branch node hyperplane comprising a hyperplane point and a hyperplane normal;
f. splitting for each node determined to be a branch node, the training data for the branch node into two subsets, a left subset and a right subset, according to which side of the branch node hyperplane each element of training data resides; and
g. passing the subsets to the next nodes recursively to create a tree.
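Steps a through g describe a recursive build: hypothesize a leaf, test the hypothesis, and split on a hyperplane when the test fails. The following is a minimal sketch of that recursion; the function names, dictionary layout, and callable parameters are our own illustration, not language from the patent.

```python
def build_node(training_data, is_good_leaf, make_leaf, make_hyperplane, side_of):
    """Recursively build a binary classifier tree (claim 1, steps a-g).

    training_data: list of (feature_vector, classification) pairs.
    The callables stand in for the Bayesian leaf classifier construction,
    the leaf-hypothesis test, and the branch hyperplane construction.
    """
    leaf = make_leaf(training_data)                    # step c: leaf classifier
    if not training_data or is_good_leaf(leaf, training_data):
        return {"kind": "leaf", "classifier": leaf}    # step d: hypothesis holds
    point, normal = make_hyperplane(training_data)     # step e: branch hyperplane
    left = [(x, y) for x, y in training_data if side_of(x, point, normal) < 0]
    right = [(x, y) for x, y in training_data if side_of(x, point, normal) >= 0]
    return {"kind": "branch", "point": point, "normal": normal,   # steps f-g
            "left": build_node(left, is_good_leaf, make_leaf, make_hyperplane, side_of),
            "right": build_node(right, is_good_leaf, make_leaf, make_hyperplane, side_of)}
```

The recursion bottoms out when a node's hypothesized leaf classifier passes the test (or its data is exhausted), matching the leaf/branch determination of step d.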
2. The method of claim 1 wherein the step of testing the hypothesis further comprises the steps of:

a. constructing a leaf node classifier for all of the training data in the node by use of the Expectation Maximization ("EM") algorithm;
b. forming a confusion matrix comprising m² numbers, where m is the number of classes represented by mixtures in a leaf node classifier and the ijth element of the matrix comprises the number of training feature vectors belonging to class i and classified in class j;
c. computing the kappa statistic from the numbers in the confusion matrix;
d. comparing the kappa statistic to the kappa threshold;
e. determining the node to be a branch node if the kappa statistic is less than the kappa threshold;
f. comparing the dataset size of the training data to the minimum dataset size; and
g. determining the node to be a branch node if the dataset size of the training data is less than the minimum dataset size.
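Steps b through d compute what is, in standard form, Cohen's kappa over the confusion matrix. A self-contained sketch (the function name is ours):

```python
def kappa_statistic(confusion):
    """Cohen's kappa from an m-by-m confusion matrix, where confusion[i][j]
    counts training vectors belonging to class i that were classified as class j."""
    m = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(confusion[i]) for i in range(m)]
    col_sums = [sum(confusion[i][j] for i in range(m)) for j in range(m)]
    observed = sum(confusion[i][i] for i in range(m)) / total       # observed agreement
    expected = sum(row_sums[i] * col_sums[i] for i in range(m)) / total ** 2
    return (observed - expected) / (1 - expected)
```

Per step e, the node would then be declared a branch node when this value falls below the kappa threshold.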
3. The method of claim 2 wherein the step of creating a branch node hyperplane further comprises the steps of:
a. constructing a branch mixture comprising two kernels, a first branch mixture kernel and a second branch mixture kernel, wherein the branch mixture is constructed by using the EM algorithm with all the training data in the branch node;
b. attempting to calculate the hyperplane point by identifying the point along the line connecting the centers of the two branch mixture kernels such that the kernel log likelihoods at the hyperplane point are equal, wherein this step fails if the hyperplane point does not exist;
c. constructing, if the hyperplane point exists, the equivocation hyperconic section comprising a locus of points for which Mahalanobis distances between the points and the branch mixture kernels are the same;
d. calculating, if the hyperplane point exists, the hyperplane normal as the normal to the equivocation hyperconic section at the hyperplane point; and
e. determining the node to be a leaf node if the hyperplane point does not exist.
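Steps b and e can be sketched as a one-dimensional root search along the segment joining the two kernel centers: the point where the two kernel log likelihoods agree is the hyperplane point, and its absence makes the node a leaf. A sketch under the assumption that the two kernel log likelihoods are supplied as callables (names and bisection approach are ours):

```python
def hyperplane_point(ll1, ll2, m1, m2, iters=60):
    """Find the point on the segment from kernel center m1 to kernel center m2
    where the two kernel log likelihoods are equal (claim 3, step b), by bisection.
    Returns None when no such point exists on the segment (claim 3, step e)."""
    def at(t):  # point on the line connecting the two centers
        return [a + t * (b - a) for a, b in zip(m1, m2)]
    def gap(t):  # difference of the two kernel log likelihoods
        p = at(t)
        return ll1(p) - ll2(p)
    lo, hi = 0.0, 1.0
    if gap(lo) * gap(hi) > 0:
        return None                 # the gap never changes sign: no hyperplane point
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return at((lo + hi) / 2.0)
```

The hyperplane normal of step d would then be the normal to the equal-Mahalanobis-distance surface evaluated at this point; it is omitted from the sketch.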
4. The method of claim 3 wherein the step of splitting the training data further comprises the steps of:
a. computing for each feature vector in the training data a difference vector between the hyperplane point and the feature vector;
b. calculating an inner product of the hyperplane normal and the difference vector calculated in step a;
c. identifying an algebraic sign of the inner product; and
d. determining the side to be left if the algebraic sign is negative and right if the algebraic sign is positive.
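The four steps of claim 4 reduce to the algebraic sign of a single inner product. A sketch (we read "difference vector between the hyperplane point and the feature vector" as point minus feature; the convention only fixes which side is called left):

```python
def side_of_hyperplane(feature, point, normal):
    """Claim 4: decide which side of the branch node hyperplane a feature
    vector lies on from the algebraic sign of an inner product."""
    diff = [p - f for p, f in zip(point, feature)]       # step a: difference vector
    inner = sum(n * d for n, d in zip(normal, diff))     # step b: inner product
    return "left" if inner < 0 else "right"              # steps c-d: sign decides
```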
5. The method of claim 1 further comprising the steps of:
a. traversing the classifier tree with a feature vector to be classified until a leaf node is reached, wherein the tree is traversed to the left or to the right at each branch node according to whether the feature vector resides to the left or the right of the branch node hyperplane;
b. deciding, upon arriving at a leaf node, how to classify the feature vector, wherein the decision how to classify the feature vector is made by computing log likelihoods for the feature vector for each mixture in the leaf node classifier and classifying the feature vector in the class identified by the mixture having the highest log likelihood; and
c. whereby a feature vector is classified.
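The operating component of claim 5 ends with an argmax over per-class mixture log likelihoods at the leaf. A sketch, with each mixture represented as a log-likelihood callable and the class labels borrowed from the engine-misfire embodiment in the abstract (the toy quadratic log likelihoods are ours):

```python
def classify_at_leaf(feature, mixtures):
    """Claim 5, step b: classify the feature vector in the class whose
    mixture yields the highest log likelihood at the leaf node."""
    return max(mixtures, key=lambda label: mixtures[label](feature))

# Toy leaf classifier with one log-likelihood callable per class:
mixtures = {"nominal": lambda x: -(x[0] - 1.0) ** 2,
            "misfire": lambda x: -(x[0] - 5.0) ** 2}
```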
6. A method utilizing machine-readable data storage of a set of machine-executable instructions for using a data processing system to perform a method of machine learning in which the learning is an assignment of a feature vector to a classification carried out by creating and using first a binary classifier tree having nodes, which nodes further comprise branch nodes and leaf nodes, and second a Bayesian classifier to obtain a result, the method comprising the steps of:
a. determining a kappa threshold and a minimum dataset size;
b. using a binary tree classifier to create a node, wherein the node comprises training data having a dataset size, the training data comprising multiple sets of feature vectors and corresponding classifications;
c. assuming that the node just constructed is a leaf node;
d. constructing a Bayesian leaf node classifier for all of the training data in the node by use of the Expectation Maximization ("EM") algorithm;
e. forming a confusion matrix comprising m² numbers, where m is the number of classes represented by mixtures in a leaf node classifier and the ijth element of the matrix comprises the number of training feature vectors belonging to class i and classified in class j;
f. computing a kappa statistic from the numbers in the confusion matrix;
g. comparing the kappa statistic to the kappa threshold;
h. comparing the dataset size of the training data to the minimum dataset size; and
i. determining the node to be a branch node if the dataset size of the training data exceeds the minimum dataset size;
j. creating for each node determined to be a branch node, a branch node hyperplane comprising a hyperplane point and a hyperplane normal, this step further comprising the steps of;
(1) constructing a branch mixture comprising two kernels, a first branch mixture kernel and a second branch mixture kernel, wherein the branch mixture is constructed by using the EM algorithm with all the training data in the branch node;
(2) calculating the hyperplane point by identifying the point along the line connecting the centers of the two branch mixture kernels such that the kernel log likelihoods at the hyperplane point are equal, wherein this step fails if the hyperplane point does not exist;
(3) constructing, if the hyperplane point exists, the equivocation hyperconic section comprising a locus of points for which Mahalanobis distances between the points and the branch mixture kernels are the same; and
(4) calculating the hyperplane normal as the normal to the equivocation hyperconic section at the hyperplane point;
k. determining the node to be a leaf node if the hyperplane point does not exist;
l. splitting the training data for the branch node into two subsets, a left subset and a right subset, according to which side of the branch node hyperplane each element of training data resides, this step comprising the further steps of;
(1) computing for each feature vector in the training data a difference vector between the hyperplane point and the feature vector;
(2) calculating an inner product of the hyperplane normal and the difference vector calculated in step (1);
(3) identifying an algebraic sign of the inner product;
(4) determining the side to be left if the algebraic sign is negative and right if the algebraic sign is positive;
m. passing the subsets to the next nodes recursively to be created by passing the left subset to a next left node and the right subset to a next right node;
n. creating, for each node determined to be a branch node, a left node pointer and a right node pointer for pointing to a next layer of nodes recursively to be created in the binary classifier tree; and
o. repeating steps b through h recursively until the application of step c or step e results in a determination that the current node is a leaf node.
7. The method of claim 6 further comprising the steps of:

a. traversing the classifier tree with a feature vector to be classified until a leaf node is reached, this step comprising the further steps of;
(1) calculating for a branch node the inner product of the hyperplane normal with the difference between the hyperplane point and the feature vector;
(2) determining an algebraic sign of the inner product;
(3) traversing the classifier tree to a next node, wherein the traversal is to the left if the algebraic sign is negative and to the right if the algebraic sign is positive;
(4) repeating steps 7.a.(1) through 7.a.(3) until a leaf node is reached;
b. deciding, upon arriving at a leaf node, how to classify the feature vector, this step comprising the further steps of;
(1) computing a kernel log likelihood for the feature vector with respect to each kernel in each mixture in the leaf node classifier, wherein the kernel log likelihood is computed by use of the Mahalanobis distance between the feature vector and the kernel, the size of the kernel, and normalizing factors, wherein the kernel log likelihood of a feature vector x with respect to a kernel with mean m, covariance C and dimension n is:

LL(x) = -(1/2)(x - m)ᵀ C⁻¹ (x - m) - (1/2) ln|C| - (n/2) ln(2π)
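For a Gaussian kernel, this log likelihood is the standard multivariate normal log density. A sketch restricted to diagonal covariance so it stays dependency-free (the diagonal restriction is ours; the claim places no such limit):

```python
import math

def kernel_log_likelihood(x, mean, var):
    """Log likelihood of feature vector x under a Gaussian kernel with mean
    `mean` and diagonal covariance whose diagonal entries are `var`:
        -(1/2)*Mahalanobis^2 - (1/2)*ln|C| - (n/2)*ln(2*pi)
    i.e. the Mahalanobis distance, the kernel size |C|, and normalizing factors."""
    n = len(x)
    maha2 = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    log_det = sum(math.log(vi) for vi in var)
    return -0.5 * maha2 - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)
```

With full covariance matrices, the squared Mahalanobis term becomes (x - m)ᵀC⁻¹(x - m) and |C| the determinant of C, as in the formula above.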
8. The method of claim 6 further comprising the step of determining how a feature vector is to be classified in a class identified by a mixture in the leaf node classifier, this step comprising the further steps of:
a. traversing the classifier tree with a feature vector to be classified until a leaf node is reached, this step comprising the further steps of;
(1) calculating for a branch node the inner product of the hyperplane normal with the difference between the hyperplane point and the feature vector;
(2) determining an algebraic sign of the inner product;
(3) traversing the classifier tree to a next node, wherein the traversal is to the left if the algebraic sign is negative and to the right if the algebraic sign is positive;
(4) repeating steps 8.a. (1) through 8.a. (3) until a leaf node is reached;
b. deciding, upon arriving at a leaf node, how to classify the feature vector, this step comprising the further steps of;
(1) Computing a mixture log likelihood for each mixture in the leaf node classifier, this step comprising the further steps of;
1. computing a kernel log likelihood for the feature vector with respect to each kernel in the mixture, wherein the kernel log likelihood is computed by use of the Mahalanobis distance between the feature vector and a kernel, and a kernel size, wherein the kernel log likelihood of a feature vector x with respect to a kernel with mean m, covariance C, and dimension n is:

LL(x) = -(1/2)(x - m)ᵀ C⁻¹ (x - m) - (1/2) ln|C| - (n/2) ln(2π)
Specification