Decision tree training in machine learning
Abstract
Improved decision tree training in machine learning is described, for example, for automated classification of body organs in medical images or for detection of body joint positions in depth images. In various embodiments, improved estimates of uncertainty are used when training random decision forests for machine learning tasks in order to give improved accuracy of predictions and fewer errors. In examples, bias corrected estimates of entropy or Gini index are used or non-parametric estimates of differential entropy. In examples, resulting trained random decision forests are better able to perform classification or regression tasks for a variety of applications without undue increase in computational load.
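The abstract names bias-corrected estimates of entropy and Gini index but does not say which corrections are meant. The following is a minimal sketch under stated assumptions: it uses the textbook n/(n-1) correction for the Gini index and the Miller-Madow correction for entropy, which may differ from the estimators in the specification. All function names are illustrative and do not come from the patent.

```python
import math
from collections import Counter


def plugin_gini(labels):
    """Plug-in Gini index 1 - sum(p_k^2) from empirical class frequencies."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def bias_corrected_gini(labels):
    """Unbiased Gini estimate: the plug-in value scaled by n / (n - 1).

    Assumption: this is the standard correction for i.i.d. samples,
    not necessarily the one the patent uses.
    """
    n = len(labels)
    if n < 2:
        return 0.0  # a single sample carries no measurable impurity
    return plugin_gini(labels) * n / (n - 1)


def plugin_entropy(labels):
    """Plug-in Shannon entropy in nats."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def miller_madow_entropy(labels):
    """Plug-in entropy plus the Miller-Madow term (K - 1) / (2n),
    where K is the number of classes observed in the sample."""
    n = len(labels)
    k = len(set(labels))
    return plugin_entropy(labels) + (k - 1) / (2 * n)


def split_gain(parent, left, right, uncertainty=bias_corrected_gini):
    """Reduction in uncertainty at a split node: the parent's uncertainty
    minus the size-weighted uncertainty of the two children."""
    n = len(parent)
    children = (len(left) / n) * uncertainty(left) + (len(right) / n) * uncertainty(right)
    return uncertainty(parent) - children
```

A split would typically be chosen by evaluating `split_gain` for each candidate and keeping the largest value. The correction matters most at deep nodes, where few samples remain and the plug-in estimates are most biased.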
20 Claims
1. A machine learning device comprising:
- a communications interface arranged to receive training data;
- a tree training logic arranged to train a random decision forest using the received training data and on a basis of uncertainty measures taken at a plurality of split nodes of at least some of the received training data computed using an uncertainty measurement logic;
- the tree training logic further arranged to train the random decision forest at least on a basis of a bias-corrected Gini index; and
- the uncertainty measurement logic arranged to calculate a measure of uncertainty which is bias corrected in case of classification tasks or which uses a non-parametric estimate of the uncertainty in case of regression.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A machine learning method comprising:
- receiving training data at a communications interface;
- training, at a processor, a random decision forest using the received training data and on a basis of a measure of uncertainty taken at at least one split node of at least some of the received training data, the measure of uncertainty calculated based at least in part on bootstrap resampling;
- further training the random decision forest at least on a basis of a bias-corrected Gini index; and
- computing, at the processor, the measure of the uncertainty so as to either correct for bias in the measurement of the uncertainty or to use a non-parametric estimate of the uncertainty.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A machine learning method comprising:
- receiving training data at a communications interface the training data comprising examples of data to be classified into one of a plurality of possible classes;
- training, at a processor, a random decision forest to classify data into the possible classes, the training carried out using the received training data and on the basis of a measure of uncertainty taken at at least one split node of at least some of the received training data;
- computing, at the processor, the measure of the uncertainty so as to use a non-parametric estimate of the uncertainty, the non-parametric measure of the uncertainty based at least in part on a one-nearest neighbor estimator of differential entropy, calculation of the entropy comprising a sum over the training data of a logarithm of Euclidean distance, and bias-correcting the measure of uncertainty and where a number of possible classes is such that it is difficult to estimate empirical class frequencies reliably.
Dependent claims: 18, 19, 20
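Claim 17 recites a one-nearest neighbor estimator of differential entropy computed from a sum of logarithms of Euclidean distances. One well-known estimator of that shape is the Kozachenko-Leonenko estimator, sketched below. It is an assumption that this form corresponds to the claimed calculation; the function name and the brute-force neighbor search are illustrative only.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def one_nn_differential_entropy(points):
    """Kozachenko-Leonenko 1-NN estimate of differential entropy, in nats.

    H = (d / n) * sum_i log(rho_i) + log(V_d) + log(n - 1) + gamma

    where rho_i is the Euclidean distance from point i to its nearest
    neighbor and V_d is the volume of the d-dimensional unit ball.
    `points` is a sequence of equal-length coordinate tuples.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    d = len(points[0])

    log_dist_sum = 0.0
    for i, p in enumerate(points):
        # Brute-force nearest neighbor, O(n^2) overall; fine for a sketch.
        nearest = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if nearest == 0.0:
            raise ValueError("duplicate points make log(distance) undefined")
        log_dist_sum += math.log(nearest)

    # log V_d = (d / 2) * log(pi) - log(Gamma(d / 2 + 1))
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    return d * log_dist_sum / n + log_unit_ball + math.log(n - 1) + EULER_GAMMA
```

Because the estimate depends only on nearest-neighbor distances, it needs no histogram bins or parametric density model. A useful sanity check is that scaling every coordinate by a factor s shifts the estimate by d * log(s), as differential entropy requires.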
Specification