Data mining technique with n-Pool evolution
First Claim
1. A computer-implemented method, for use with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, each of the segments including at least one of the samples and at least one of the segments including more than one of the samples, for use further with a memory storing a candidate database having a pool of candidate individuals, each i'th one of the candidate individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and a respective testing set TSi of the testing data segments, the method comprising:
for each i'th one of the candidate individuals:
assigning to the i'th individual the respective testing set TSi of the testing data segments,
a computer system testing the i'th individual on samples of the testing data from the i'th individual's testing set TSi of testing data segments,
updating the fitness estimate associated with the i'th individual in dependence upon results of the testing, and
updating the testing experience level associated with the i'th individual in dependence upon the number of samples on which the i'th individual is tested;
selecting individuals for discarding from the candidate pool in dependence upon a competition among candidate individuals;
forming new individuals in the candidate pool by procreation in dependence upon a respective set of at least one parent individual from the candidate pool;
a computer system validating, without further procreation as part of the validating step, a plurality of evolved individuals whose testing experience level has reached a predetermined maturity level without being selected for discarding, including further testing each such individual on samples of the testing data from a testing data segment other than those in the individual's testing set TSi; and
providing for deployment selected ones of the individuals in the plurality of evolved individuals that satisfy predetermined deployment criteria after validation,
wherein each of the testing sets TSi has fewer than all of the N testing data segments and at least one of the testing sets TSi has different testing data segments than another of the initial testing sets TSi.
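The closing "wherein" clause places two constraints on the testing sets: every TSi omits at least one of the N segments, and the sets are not all identical. The following is a minimal sketch of one way to produce such an assignment, assuming random sampling with a fixed set size. The function name `assign_testing_sets` and its parameters are hypothetical and do not come from the claim.

```python
import random


def assign_testing_sets(num_individuals, n_segments, set_size, seed=0):
    """Return one testing set (frozenset of segment indices) per individual.

    Hypothetical helper, not taken from the patent text. It enforces the
    two constraints of the wherein clause:
      * each set has fewer than all N segments (set_size < n_segments);
      * at least one set differs from another.
    """
    if n_segments < 2:
        raise ValueError("the claim requires N > 1 segments")
    if not 1 <= set_size < n_segments:
        raise ValueError("each testing set must omit at least one segment")
    if num_individuals < 2:
        raise ValueError("need two or more individuals for the sets to differ")

    rng = random.Random(seed)
    sets = [
        frozenset(rng.sample(range(n_segments), set_size))
        for _ in range(num_individuals)
    ]

    # If random sampling happened to give everyone the same set, swap one
    # segment in the last set for a segment that set does not contain.
    if len(set(sets)) == 1:
        last = set(sets[-1])
        outside = next(s for s in range(n_segments) if s not in last)
        last.remove(min(last))
        last.add(outside)
        sets[-1] = frozenset(last)
    return sets


# Example: 5 individuals, N = 4 segments, 2 segments per testing set.
example_sets = assign_testing_sets(5, 4, 2, seed=1)
```

A fixed set size is only one possible design; the claim would equally allow sets of varying size, provided none covers all N segments.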
Abstract
Roughly described, a training database contains N segments of data samples. Candidate individuals identify a testing experience level, a fitness estimate, a rule set, and a testing set TSi of the data samples on which it is tested. The testing sets have fewer than all of the data segments and they are not all the same. Testing involves testing on only the individual's assigned set of data segments, updating the fitness estimates and testing experience levels, and discarding candidates through competition. If an individual reaches a predetermined maturity level of testing experience, then validating involves further testing it on samples of the testing data from a testing data segment other than those in the individual's testing set TSi. Those individuals that satisfy validation criteria are considered for deployment.
31 Claims
-
1. A computer-implemented method, for use with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, each of the segments including at least one of the samples and at least one of the segments including more than one of the samples, for use further with a memory storing a candidate database having a pool of candidate individuals, each i'th one of the candidate individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and a respective testing set TSi of the testing data segments, the method comprising:
for each i'th one of the candidate individuals:
assigning to the i'th individual the respective testing set TSi of the testing data segments,
a computer system testing the i'th individual on samples of the testing data from the i'th individual's testing set TSi of testing data segments,
updating the fitness estimate associated with the i'th individual in dependence upon results of the testing, and
updating the testing experience level associated with the i'th individual in dependence upon the number of samples on which the i'th individual is tested;
selecting individuals for discarding from the candidate pool in dependence upon a competition among candidate individuals;
forming new individuals in the candidate pool by procreation in dependence upon a respective set of at least one parent individual from the candidate pool;
a computer system validating, without further procreation as part of the validating step, a plurality of evolved individuals whose testing experience level has reached a predetermined maturity level without being selected for discarding, including further testing each such individual on samples of the testing data from a testing data segment other than those in the individual's testing set TSi; and
providing for deployment selected ones of the individuals in the plurality of evolved individuals that satisfy predetermined deployment criteria after validation,
wherein each of the testing sets TSi has fewer than all of the N testing data segments and at least one of the testing sets TSi has different testing data segments than another of the initial testing sets TSi.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
10. A computer-implemented system, for use with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, at least one of the segments including more than one of the samples, comprising:
a memory storing a candidate database having a pool of candidate individuals, each i'th one of the candidate individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and a respective testing set TSi of the testing data segments; and
a training system which:
for each i'th one of the candidate individuals:
assigns to the i'th individual the respective testing set TSi of the testing data segments,
tests the i'th individual on samples of the testing data from the i'th individual's testing set TSi of testing data segments,
updates the fitness estimate associated with the i'th individual in dependence upon results of the testing, and
updates the testing experience level associated with the i'th individual in dependence upon the number of samples on which the i'th individual is tested;
selects individuals for discarding from the candidate pool in dependence upon a competition among candidate individuals;
forms new individuals in the candidate pool in dependence upon a respective set of at least one parent individual from the candidate pool;
validates a plurality of evolved individuals whose testing experience level has reached a predetermined maturity level without being selected for discarding, including further testing each such individual on samples of the testing data from a testing data segment other than those in the individual's testing set TSi without forming any new individuals in the candidate pool in dependence upon the individuals in the plurality of evolved individuals; and
provides for deployment selected ones of the individuals that satisfy predetermined deployment criteria after validation,
wherein each of the testing sets TSi has fewer than all of the N testing data segments and at least one of the testing sets TSi has different testing data segments than another of the initial testing sets TSi.
- View Dependent Claims (11, 12, 13, 14, 15, 16)
-
17. A computer-implemented data mining system, for use with a data mining testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, at least one of the segments including more than one of the samples, comprising:
a memory storing a candidate database having a pool of candidate individuals, each i'th one of the candidate individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and a testing set TSi of the testing data segments;
a training system including:
a testing module which tests each individual from the candidate pool on samples of the testing data from the individual's testing set TSi of testing data segments,
an updating module which updates the fitness estimate associated with each of the individuals being tested in dependence upon results of the testing, and which updates the testing experience level associated with each of the individuals in dependence upon the number of samples on which the individual is tested,
a competition module which selects individuals for discarding from the candidate pool in dependence upon a competition among candidate individuals,
a validation module which validates evolved individuals without further procreation, including further testing each such evolved individual on samples of the testing data from a testing data segment other than those in the evolved individual's testing set TSi, the evolved individuals being individuals whose testing experience level on samples of testing set TSi has reached a predetermined maturity level without being selected for discarding; and
a deployment module which provides for deployment selected ones of the individuals that satisfy predetermined deployment criteria after validation,
wherein each of the testing sets TSi has fewer than all of the N testing data segments and at least one of the testing sets TSi has different testing data segments than another of the initial testing sets TSi.
- View Dependent Claims (18)
-
19. A computer-implemented method, for use with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, each of the segments including at least one of the samples and at least one of the segments including more than one of the samples, for use further with a memory storing a database having a pool of evolved individuals, each i'th one of the evolved individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and having already reached a predetermined maturity level of testing experience on a respective testing set TSi of the testing data segments without having been selected for discarding, the method comprising:
for each i'th one of the evolved individuals, and without further procreation in dependence upon any of the evolved individuals, performing validation steps of:
further testing the i'th evolved individual on samples of the testing data from a testing data segment other than those in the individual's testing set TSi, and
updating the fitness estimate associated with the i'th evolved individual in dependence upon results of the testing; and
providing for deployment selected ones of the evolved individuals in the pool of evolved individuals that satisfy predetermined deployment criteria after validation,
wherein each of the testing sets TSi has fewer than all of the N testing data segments and at least one of the testing sets TSi has different testing data segments than another of the initial testing sets TSi.
- View Dependent Claims (20)
-
21. A computer-implemented method, for use in a server infrastructure with respect to a collection of at least one client device, for use further with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, at least one of the segments including more than one of the samples, wherein each i'th one of the client devices is associated with a corresponding testing set TSi of at least one but fewer than all of the testing data segments, at least one of the testing sets TSi having different testing data segments than another of the testing sets TSi, the method comprising:
storing accessibly to the server infrastructure a database identifying a plurality of evolved individuals, each j'th one of the evolved individuals identifying a testing experience level, a fitness estimate, a rule set for applying to samples of data in the testing data, and all of the testing data segments on which the j'th individual has already reached a predetermined maturity level without being selected for discarding, each of the evolved individuals in the plurality of evolved individuals having already reached a predetermined maturity level on at least one of the testing data segments without having been selected for discarding;
delegating each j'th one of the evolved individuals, to an i'th one of the client devices whose corresponding testing set TSi includes at least one testing data segment on which the j'th individual has not yet reached a predetermined maturity level, validation testing of the j'th individual on one or more of the testing data segments on which the j'th individual has not yet reached a predetermined maturity level; and
providing for deployment selected ones of the evolved individuals that satisfy predetermined deployment criteria after validation.
- View Dependent Claims (22, 23, 24, 25, 26)
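The delegating step of claim 21 amounts to matching each evolved individual with a client whose testing set contains at least one segment the individual has not yet matured on. The sketch below shows one simple way this matching could be done, assuming a first-fit policy over the clients. The function name `delegate` and the dictionary layout are hypothetical and are not specified by the claim.

```python
def delegate(evolved, client_sets):
    """Plan validation work for evolved individuals.

    evolved:     dict mapping individual id -> set of segment ids on which
                 the individual has already reached maturity.
    client_sets: dict mapping client id -> testing set TSi of that client.

    Returns a dict mapping individual id -> (client id, segments to
    validate on). First-fit is an assumption made for this sketch; an
    individual with no suitable client is left out of the plan.
    """
    plan = {}
    for ind_id, matured_on in evolved.items():
        for client_id, testing_set in client_sets.items():
            todo = set(testing_set) - set(matured_on)
            if todo:  # this client holds a segment that is new to the individual
                plan[ind_id] = (client_id, todo)
                break
    return plan


# Example with N = 3 segments and two clients.
evolved = {"a": {0, 1}, "b": {2}}
client_sets = {"c1": {0, 1}, "c2": {1, 2}}
plan = delegate(evolved, client_sets)
```

Here individual "a" is passed over by client "c1", whose segments it has already matured on, and is delegated to "c2" for validation on segment 2.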
-
27. A computer-implemented method, for use by a particular client device in a system including a server infrastructure and a collection of at least one client device including the particular client device, for use further with a testing database containing a plurality of samples of testing data, the samples being distributed among N>1 segments of the testing data, at least one of the segments including more than one of the samples, comprising:
associating the particular client device with a particular testing set TSp of at least one but fewer than all of the testing data segments;
storing accessibly to the particular client device a database identifying a plurality of candidate individuals, each identifying a testing experience level, a fitness estimate, and a rule set for applying to samples of data in the testing data;
performing, a number T≥1 times, training steps of:
for each i'th one of the candidate individuals:
testing the i'th individual on samples of the testing data from the particular testing set of testing data segments;
updating the fitness estimate associated with the i'th individual in dependence upon results of the testing; and
updating the testing experience level associated with the i'th individual in dependence upon the number of samples on which the i'th individual is tested,
selecting individuals for discarding from the candidate pool in dependence upon a competition among candidate individuals, and
forming new individuals in the candidate pool in dependence upon a respective set of at least one parent individual from the candidate pool not selected for discarding;
receiving evolved individuals for validation; and
for each j'th one of the evolved individuals received, and before any further procreation in dependence upon the j'th evolved individual:
testing the j'th evolved individual on samples of the testing data from the particular testing set of testing data segments, and
reporting results of the testing to the server infrastructure.
- View Dependent Claims (28, 29, 30, 31)
Specification