Data mining technique with experience-layered gene pool
First Claim
1. A computer-implemented data mining system, for use with a data mining training database containing training data, comprising:
- a memory storing a candidate gene database having a pool of candidate individuals, each candidate individual identifying a plurality of conditions and at least one corresponding proposed output in dependence upon the conditions, each candidate individual further having associated therewith an indication of a respective fitness estimate, and an indication of a respective testing experience level;
- a gene pool processor which:
  - tests individuals from the candidate gene pool on the training data, each individual being tested undergoing a respective battery of at least one trial and thereby increasing the individual's testing experience level, each trial applying the conditions of the respective individual to the training data to propose an output, and
  - updates the fitness estimate associated with each of the individuals being tested in dependence upon both the training data and the outputs proposed by the respective individual in the battery of trials; and
- a gene harvesting module providing for deployment selected ones of the individuals from the gene pool,
wherein the gene pool processor includes a competition module which selects individuals for discarding from the gene pool in dependence upon both their updated fitness estimate and their testing experience level, including a first instance of considering a subject individual for discarding in dependence upon both its fitness estimate and its testing experience level, and a second instance of considering the subject individual for discarding in dependence upon both its fitness estimate and its testing experience level, and wherein the second instance occurs after the subject individual has more testing experience than at the first instance.
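The testing and fitness-update elements of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Individual` class, the `run_battery` function, the representation of conditions as predicates over a dictionary, and the use of prediction accuracy as the fitness measure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Dict[str, float]
Condition = Callable[[Sample], bool]


@dataclass
class Individual:
    # Each rule pairs a list of conditions with the output proposed when all hold.
    rules: List[Tuple[List[Condition], str]]
    fitness: float = 0.0   # running fitness estimate (assumed: mean accuracy)
    experience: int = 0    # testing experience level (assumed: trial count)

    def propose(self, sample: Sample, default: str = "none") -> str:
        """Apply the conditions to one sample and propose an output."""
        for conditions, output in self.rules:
            if all(cond(sample) for cond in conditions):
                return output
        return default


def run_battery(ind: Individual, battery: Sequence[Tuple[Sample, str]]) -> None:
    """Run a battery of trials, updating fitness estimate and experience."""
    for sample, expected in battery:
        score = 1.0 if ind.propose(sample) == expected else 0.0
        # Incremental mean over every trial the individual has undergone.
        ind.fitness = (ind.fitness * ind.experience + score) / (ind.experience + 1)
        ind.experience += 1


# Hypothetical individual with one rule: propose "buy" when x exceeds 0.5.
ind = Individual(rules=[([lambda s: s["x"] > 0.5], "buy")])
run_battery(ind, [({"x": 0.9}, "buy"), ({"x": 0.1}, "buy")])
print(ind.experience, ind.fitness)
```

Because the fitness estimate is an average over a growing number of trials, it becomes more reliable as experience increases, which is why the claim has the competition module consider both values together.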
Abstract
Roughly described, a computer-implemented evolutionary data mining system includes a memory storing a candidate gene database in which each candidate individual has a respective fitness estimate; a gene pool processor which tests individuals from the candidate gene pool on training data and updates the fitness estimate associated with the individuals in dependence upon the tests; and a gene harvesting module providing for deployment selected ones of the individuals from the gene pool, wherein the gene pool processor includes a competition module which selects individuals for discarding from the gene pool in dependence upon both their updated fitness estimate and their testing experience level. Preferably the gene database has an elitist pool containing multiple experience layers, and the competition module causes individuals to compete only with other individuals in their same experience layer.
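The preferred arrangement, in which individuals compete only within their own experience layer, might be sketched as follows. The layer boundaries, the per-layer capacity, and the replace-the-weakest rule are hypothetical choices for illustration; the abstract states only that competition is confined to the same layer.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Individual:
    name: str
    fitness: float
    experience: int


# Hypothetical layers as inclusive (min_experience, max_experience) ranges.
LAYERS: List[Tuple[int, int]] = [(1, 99), (100, 999), (1000, 10**9)]
LAYER_CAPACITY = 2  # hypothetical capacity, same for every layer


def layer_of(experience: int) -> int:
    """Return the index of the layer whose range contains this experience."""
    for idx, (lo, hi) in enumerate(LAYERS):
        if lo <= experience <= hi:
            return idx
    raise ValueError(f"no layer covers experience {experience}")


def compete(pool: List[Individual], newcomer: Individual) -> List[Individual]:
    """Return the pool after the newcomer competes within its own layer only."""
    layer = layer_of(newcomer.experience)
    peers = [p for p in pool if layer_of(p.experience) == layer]
    if len(peers) < LAYER_CAPACITY:
        return pool + [newcomer]          # room left: no competition needed
    weakest = min(peers, key=lambda p: p.fitness)
    if newcomer.fitness > weakest.fitness:
        return [p for p in pool if p is not weakest] + [newcomer]
    return pool                           # newcomer loses and is discarded


pool = [
    Individual("A", 0.9, 50),
    Individual("B", 0.6, 60),
    Individual("C", 0.4, 500),
]
result = compete(pool, Individual("D", 0.7, 70))
print([p.name for p in result])
```

In this example D displaces B, the weakest individual in the lowest layer, while C survives with a lower fitness than either because it sits in a different layer and is never compared against D.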
58 Citations
44 Claims
1. A computer-implemented data mining system, for use with a data mining training database containing training data, comprising:
- a memory storing a candidate gene database having a pool of candidate individuals, each candidate individual identifying a plurality of conditions and at least one corresponding proposed output in dependence upon the conditions, each candidate individual further having associated therewith an indication of a respective fitness estimate, and an indication of a respective testing experience level;
- a gene pool processor which:
  - tests individuals from the candidate gene pool on the training data, each individual being tested undergoing a respective battery of at least one trial and thereby increasing the individual's testing experience level, each trial applying the conditions of the respective individual to the training data to propose an output, and
  - updates the fitness estimate associated with each of the individuals being tested in dependence upon both the training data and the outputs proposed by the respective individual in the battery of trials; and
- a gene harvesting module providing for deployment selected ones of the individuals from the gene pool,
wherein the gene pool processor includes a competition module which selects individuals for discarding from the gene pool in dependence upon both their updated fitness estimate and their testing experience level, including a first instance of considering a subject individual for discarding in dependence upon both its fitness estimate and its testing experience level, and a second instance of considering the subject individual for discarding in dependence upon both its fitness estimate and its testing experience level, and wherein the second instance occurs after the subject individual has more testing experience than at the first instance.
Dependent claims: 2-26
27. A computer-implemented data mining method, for use with a data mining training database containing training data, comprising the steps of:
- providing a computer system having a memory having a candidate gene database identifying a pool of candidate individuals, each candidate individual identifying a plurality of conditions and at least one corresponding proposed output in dependence upon the conditions, each candidate individual further having associated therewith an indication of a respective fitness estimate, each candidate individual further having associated therewith an indication of a respective testing experience level;
- providing in the memory data identifying layer parameters for each of a plurality of gene pool experience layers L_1-L_T in an elitist pool, T > 1, the layer parameters for each i'th one of the layers L_1-L_{T−1} including a gene capacity Quota(L_i) and a range of testing experience [ExpMin(L_i) . . . ExpMax(L_i)], the layer parameters for experience layer L_T including a gene capacity Quota(L_T) and a minimum testing experience level ExpMin(L_T), each ExpMin(L_i) > ExpMax(L_{i−1}) for i > 1;
- testing on the training data each individual in a testing subset of at least one of the candidate individuals, each individual in the testing subset undergoing a respective battery of at least one trial, each trial applying the conditions of the respective individual to the training data to propose a result;
- calculating a fitness of each of the candidate individuals in the testing subset in dependence upon the training data and the results proposed by the individual in the step of testing;
- for each j'th one of the layers in the elitist pool, the computer system discarding all individuals in the elitist pool which are not among the Quota(L_j) fittest individuals whose testing experience level is in the range [ExpMin(L_j) . . . ExpMax(L_j)]; and
- providing for deployment selected ones of the remaining individuals from the plurality of candidate individuals.
Dependent claims: 28-44
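The layer parameters and the per-layer discarding step of claim 27 can be illustrated with a short Python sketch. The `Layer` class, the `validate` and `cull` functions, and the choice to treat the returned list as the surviving elitist pool are assumptions for illustration; the claim does not prescribe any particular data structure. The sketch reads the discarding step layer by layer, so each layer keeps its own Quota fittest members.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    quota: int               # gene capacity Quota(L_i)
    exp_min: int             # ExpMin(L_i)
    exp_max: Optional[int]   # ExpMax(L_i); None for the top layer L_T

    def contains(self, experience: int) -> bool:
        upper_ok = self.exp_max is None or experience <= self.exp_max
        return experience >= self.exp_min and upper_ok


@dataclass
class Individual:
    name: str
    fitness: float
    experience: int


def validate(layers: List[Layer]) -> None:
    """Check the constraints the claim places on the layer parameters."""
    if len(layers) < 2:
        raise ValueError("need T > 1 layers")
    for i, layer in enumerate(layers):
        is_top = i == len(layers) - 1
        if is_top != (layer.exp_max is None):
            raise ValueError("only the top layer may omit ExpMax")
        if i > 0 and layer.exp_min <= layers[i - 1].exp_max:
            raise ValueError("ExpMin(L_i) must exceed ExpMax(L_{i-1})")


def cull(pool: List[Individual], layers: List[Layer]) -> List[Individual]:
    """Keep, for each layer, only its Quota fittest in-range individuals."""
    validate(layers)
    survivors: List[Individual] = []
    for layer in layers:
        members = [p for p in pool if layer.contains(p.experience)]
        members.sort(key=lambda p: p.fitness, reverse=True)
        survivors.extend(members[: layer.quota])
    return survivors


# Hypothetical two-layer pool: L_1 holds 2 individuals, L_2 holds 1.
layers = [Layer(quota=2, exp_min=1, exp_max=99),
          Layer(quota=1, exp_min=100, exp_max=None)]
pool = [Individual("A", 0.9, 10), Individual("B", 0.5, 20),
        Individual("C", 0.7, 30), Individual("D", 0.3, 150),
        Individual("E", 0.8, 200)]
kept = cull(pool, layers)
print([p.name for p in kept])
```

In this sketch an individual whose experience falls outside every layer range (for example, below the first ExpMin) is simply omitted from the returned list; how such individuals are held elsewhere is not covered here.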
Specification