Ensemble learning system and method
First Claim
1. A learning system comprising:
an input section which obtains learning data to which labels are set and an end condition;
a learning section which learns said learning data through an ensemble learning by using a learning algorithm to generate hypotheses;
a storage section in which a plurality of candidate data having no label are stored;
a calculating section which carries out an averaging with weights to said plurality of hypotheses, refers to said storage section and calculates a score for each of said plurality of candidate data by using said hypotheses;
a selecting section which selects desired candidate data from among said plurality of candidate data based on the calculated scores, said selecting section having a previously set stochastic selection function;
a data updating section which sets a label determined by a user to said desired candidate data and adds said desired candidate data to said learning data and outputs to said learning section;
an output unit; and
a control unit which outputs said hypotheses generated by said learning section to said output unit when said end condition is met, wherein said learning section re-samples said learning data to generate partial data by said ensemble learning, and re-samples an attribute of said learning data to generate a partial attribute, and learns said learning data based on said partial data and said partial attribute, wherein said calculated score has a numeric value of a likelihood of a positive example of each candidate data.
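The learning and calculating sections recited above can be illustrated with a minimal sketch; it is not the patented implementation. It assumes numeric attributes and binary labels (1 for a positive example, 0 otherwise). The names `fit_centroid`, `learn_ensemble` and `score`, the nearest-centroid base learner, and the uniform weights are all hypothetical choices, since the claim leaves the learning algorithm and the weights open.

```python
import random

def fit_centroid(rows, labels, attrs):
    """Hypothetical base learner: nearest class centroid on the given attributes."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:
        # A re-sample may contain one class only; fall back to a constant vote.
        const = 1.0 if pos else 0.0
        return lambda x: const
    mu_p = [sum(r[a] for r in pos) / len(pos) for a in attrs]
    mu_n = [sum(r[a] for r in neg) / len(neg) for a in attrs]

    def hypothesis(x):
        d_pos = sum((x[a] - m) ** 2 for a, m in zip(attrs, mu_p))
        d_neg = sum((x[a] - m) ** 2 for a, m in zip(attrs, mu_n))
        return 1.0 if d_pos < d_neg else 0.0

    return hypothesis

def learn_ensemble(rows, labels, n_hyp=15, n_attrs=2, seed=0):
    """Generate hypotheses from re-sampled rows and re-sampled attributes."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    hyps = []
    for _ in range(n_hyp):
        idx = [rng.randrange(n) for _ in range(n)]   # partial data (with replacement)
        attrs = rng.sample(range(d), n_attrs)         # partial attribute
        hyps.append(fit_centroid([rows[i] for i in idx],
                                 [labels[i] for i in idx], attrs))
    weights = [1.0 / n_hyp] * n_hyp                   # assumed uniform weights
    return hyps, weights

def score(hyps, weights, x):
    """Weighted average of hypothesis votes, read as likelihood of a positive example."""
    return sum(w * h(x) for h, w in zip(hyps, weights)) / sum(weights)
```

With uniform weights the score reduces to the fraction of hypotheses voting "positive"; non-uniform weights (for example, per-hypothesis accuracy) would slot into `weights` without changing `score`.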
Abstract
A learning system that can predict a desired result, and can have stable and improved prediction precision is presented. The learning system includes a learning section which learns the learning data using a learning algorithm to generate hypotheses, a storage section containing at least a plurality of un-labeled candidate data, a calculating section which uses the hypotheses to calculate a score for each of the plurality of candidate data, a selecting section that selects desired candidate data based on the calculated scores and a predetermined stochastic selection function, a data updating section which affixes a user-determined label to the desired candidate data and outputs the desired candidate data to the learning data, and a control unit which outputs the hypotheses to an output unit when an end condition is met, so that a desired result is predicted.
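The abstract does not say what the predetermined stochastic selection function is. As one possible reading, the sketch below draws candidates without replacement with probability proportional to `exp(score / temperature)`, so high-scoring candidates are favoured but lower-scoring ones can still be picked. The function name `stochastic_select`, the exponential form, and the `temperature` parameter are assumptions made for illustration only.

```python
import math
import random

def stochastic_select(scores, k, temperature=0.1, rng=None):
    """Pick k distinct candidate indices, favouring (not forcing) high scores.

    Assumed selection function: probability proportional to exp(score / temperature).
    """
    rng = rng or random.Random()
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        w = [math.exp(scores[i] / temperature) for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)   # sample without replacement
    return chosen
```

A smaller `temperature` makes the choice close to a plain top-k selection; a larger one spreads the selection more evenly over the candidate pool.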
5 Claims
1. A learning system comprising:
an input section which obtains learning data to which labels are set and an end condition;
a learning section which learns said learning data through an ensemble learning by using a learning algorithm to generate hypotheses;
a storage section in which a plurality of candidate data having no label are stored;
a calculating section which carries out an averaging with weights to said plurality of hypotheses, refers to said storage section and calculates a score for each of said plurality of candidate data by using said hypotheses;
a selecting section which selects desired candidate data from among said plurality of candidate data based on the calculated scores, said selecting section having a previously set stochastic selection function;
a data updating section which sets a label determined by a user to said desired candidate data and adds said desired candidate data to said learning data and outputs to said learning section;
an output unit; and
a control unit which outputs said hypotheses generated by said learning section to said output unit when said end condition is met, wherein said learning section re-samples said learning data to generate partial data by said ensemble learning, and re-samples an attribute of said learning data to generate a partial attribute, and learns said learning data based on said partial data and said partial attribute, wherein said calculated score has a numeric value of a likelihood of a positive example of each candidate data.
3. A learning method of learning data with a label set, comprising:
(a) inputting, on an input unit, said learning data and an end condition;
(b) generating hypotheses through learning of said learning data through an ensemble learning by using a learning algorithm, wherein a plurality of candidate data with no label are stored in a storage section;
(c) calculating a score for each of said plurality of candidate data by using said hypotheses by referring to said storage section;
(d) selecting, on a data processing unit, using a previously set stochastic selection function, a desired candidate data from among said plurality of candidate data based on the calculated scores;
(e) setting a label determined by the user to said desired candidate data selected by the user;
(f) carrying out said (b) step by adding said desired candidate data to said learning data;
(g) carrying out said (c) step, said (d) step, said (e) step and said (f) step when said end condition is not met; and
(h) outputting, on an output unit having at least one of a display unit and a printer, said hypotheses generated in said (f) step to an output unit when said end condition is met,
wherein said (b) step comprises:
re-sampling said learning data to generate partial data by said ensemble learning;
re-sampling an attribute of said learning data to generate a partial attribute; and
learning, on the data processing unit, said learning data based on said partial data and said partial attribute;
wherein the input unit and the output unit are connected with the data processing unit, wherein said calculated score has a numeric value of a likelihood of a positive example of each candidate data.
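Steps (a) through (h) of the method can be read as a loop that alternates learning, scoring, stochastic selection and user labelling. The sketch below is a hedged illustration under stated assumptions: the end condition is taken to be a fixed number of rounds, one candidate is selected per round, and the learner, scorer, selector and labeller are passed in as callables. All names (`active_learning_loop`, `learn_threshold`, `score_sigmoid`, `make_selector`) and the one-dimensional threshold learner are hypothetical stand-ins, not the ensemble learner of the claim.

```python
import math
import random

def active_learning_loop(data, labels, pool, learn, score, select, ask_user, max_rounds):
    """Run steps (b)-(h); `max_rounds` is the assumed end condition from step (a)."""
    data, labels, pool = list(data), list(labels), list(pool)
    hyps = learn(data, labels)                        # (b) generate hypotheses
    rounds = 0
    while rounds < max_rounds and pool:               # (g) repeat until end condition
        scores = [score(hyps, x) for x in pool]       # (c) score unlabeled candidates
        i = select(scores)                            # (d) stochastic selection
        x = pool.pop(i)
        data.append(x)
        labels.append(ask_user(x))                    # (e) label set by the user
        hyps = learn(data, labels)                    # (f) re-learn with added data
        rounds += 1
    return hyps, data, labels                         # (h) output the hypotheses

# Toy stand-ins on 1-D data, for illustration only.
def learn_threshold(data, labels):
    pos = [x for x, y in zip(data, labels) if y == 1]
    neg = [x for x, y in zip(data, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def score_sigmoid(threshold, x):
    # Numeric likelihood of a positive example, in (0, 1).
    return 1.0 / (1.0 + math.exp(-(x - threshold)))

def make_selector(seed=0):
    rng = random.Random(seed)
    # Draw one index with probability proportional to its score.
    return lambda scores: rng.choices(range(len(scores)), weights=scores, k=1)[0]

if __name__ == "__main__":
    hyp, data, labels = active_learning_loop(
        [1.0, 9.0], [0, 1], [2.0, 3.0, 7.0, 8.0],
        learn_threshold, score_sigmoid, make_selector(),
        ask_user=lambda x: 1 if x > 5 else 0, max_rounds=3)
    print(hyp, data, labels)
```

In a real system `ask_user` would block on a person's judgement; here it is replaced by a fixed rule so that the loop can run unattended.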
5. A computer-readable medium having computer readable program for operating on a computer for realizing a learning method of a learning data with a label set, said program comprising instructions that cause the computer to perform the steps of:
(a) inputting, on an input unit, said learning data and an end condition;
(b) generating hypotheses through learning of said learning data through an ensemble learning by using a learning algorithm, wherein a plurality of candidate data with no label are stored in a storage section;
(c) calculating a score for each of said plurality of candidate data by using said hypotheses by referring to said storage section;
(d) selecting, on a data processing unit, using a previously set stochastic selection function, a desired candidate data from among said plurality of candidate data based on the calculated scores;
(e) setting a label determined by the user to said desired candidate data selected by the user;
(f) carrying out said (b) step by adding said desired candidate data to said learning data;
(g) carrying out said (c) step, said (d) step, said (e) step and said (f) step when said end condition is not met; and
(h) outputting, on an output unit having at least one of a display unit and a printer, said hypotheses generated in said (f) step to an output unit when said end condition is met,
wherein said (b) step comprises:
re-sampling said learning data to generate partial data by said ensemble learning;
re-sampling an attribute of said learning data to generate a partial attribute; and
learning, on the data processing unit, said learning data based on said partial data and said partial attribute,
and wherein the input unit and the output unit are connected with the data processing unit, wherein said calculated score has a numeric value of a likelihood of a positive example of each candidate data.
Specification