Using specialized workers to improve performance in machine learning
Abstract
Systems and techniques are disclosed for generating weighted machine-learned models using multi-shard combiners. A learner in a machine learning system may receive labeled positive and negative examples, and workers within the learner may be configured to receive only positive or only negative examples. Positive and negative statistics may be calculated for a given feature and may either be applied separately in a model or be combined to generate an overall statistic.
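The routing described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: in-memory lists stand in for the class-specific workers, a simple feature count stands in for the per-class statistic, and the helper names are hypothetical.

```python
from collections import Counter

def distribute_examples(examples):
    """Route each labeled example to a class-specific worker.

    `examples` is an iterable of (features, label) pairs, where label is
    True for the positive class and False for the negative class.
    Returns the two per-worker queues.
    """
    positive_worker, negative_worker = [], []
    for features, label in examples:
        (positive_worker if label else negative_worker).append(features)
    return positive_worker, negative_worker

def feature_statistics(worker_examples):
    """Count feature occurrences seen by one worker (a simple per-class statistic)."""
    stats = Counter()
    for features in worker_examples:
        stats.update(features)
    return stats

# Hypothetical data: each example is (set of features, label).
examples = [({"a", "b"}, True), ({"b"}, False), ({"a"}, True)]
pos, neg = distribute_examples(examples)
pos_stats = feature_statistics(pos)   # Counter({'a': 2, 'b': 1})
neg_stats = feature_statistics(neg)   # Counter({'b': 1})
```

Because each worker sees only one class, its statistic can be computed without any cross-class coordination; the two statistics are only brought together afterwards, when they are applied or combined in the model.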
36 Claims
1. A method implemented by a computerized machine learning system, said method comprising:
- receiving, at the computerized machine learning system, a plurality of examples, separable by feature into at least two classes, for distribution to a plurality of workers in a mapreduce process, each worker only receiving examples associated with a first class or a second class, wherein the first class is a positive class and the second class is a negative class, and wherein a worker is selected from the group consisting of a mapper and a reducer;
- determining whether each example is either associated with the first class or associated with the second class;
- distributing an example associated with the first class to a first worker of the plurality of workers in the machine learning system, the first worker receiving only examples associated with the first class; and
- distributing an example associated with the second class to a second worker of the plurality of workers in the machine learning system, the second worker receiving only examples associated with the second class.

(Dependent claims 2-12.)
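Claim 1's two class-specific workers each accumulate a statistic per feature, and the abstract notes that the positive and negative statistics may be combined into an overall statistic. A minimal sketch, assuming an additively smoothed log-odds score as the combined statistic; the patent does not specify a formula, so this choice is purely illustrative.

```python
import math

def combined_weight(pos_count, neg_count, alpha=1.0):
    """Combine a feature's positive-worker and negative-worker counts
    into one overall statistic: an additively smoothed log-odds score.
    A feature seen mostly in positive examples gets a positive weight,
    one seen mostly in negative examples gets a negative weight."""
    return math.log((pos_count + alpha) / (neg_count + alpha))

w = combined_weight(9, 1)   # log(10/2), about 1.609
balanced = combined_weight(3, 3)   # 0.0: no evidence either way
```

The smoothing constant `alpha` keeps the score finite when one worker has never seen the feature, which is exactly the situation that arises when a feature occurs in only one class.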
13. A machine learning system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
- receiving a plurality of examples, separable by feature into at least two classes, for distribution to a plurality of workers in a mapreduce process, each worker only receiving examples associated with a first class or a second class, wherein the first class is a positive class and the second class is a negative class, and wherein a worker is selected from the group consisting of a mapper and a reducer;
- determining whether each example is either associated with the first class or associated with the second class;
- distributing an example associated with the first class to a first worker of the plurality of workers in the machine learning system, the first worker receiving only examples associated with the first class; and
- distributing an example associated with the second class to a second worker of the plurality of workers in the machine learning system, the second worker receiving only examples associated with the second class.

(Dependent claims 14-24.)
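The claims require that "a worker is selected from the group consisting of a mapper and a reducer." One way to realize the one-class-per-worker constraint in a MapReduce shuffle is to key mapper output by a (feature, class) pair, so each class's counts for a feature are delivered to a distinct reduce call. A minimal in-memory sketch; the function names and the counting statistic are illustrative assumptions, not the patent's implementation.

```python
from collections import defaultdict

def mapper(example):
    """Emit one ((feature, class), 1) pair per feature, so the shuffle can
    route positive and negative traffic to different reduce calls."""
    features, label = example
    cls = "pos" if label else "neg"
    for f in features:
        yield (f, cls), 1

def shuffle(pairs):
    """Group mapped pairs by key, as the MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Sum the counts for one (feature, class) key."""
    return key, sum(values)

examples = [({"a", "b"}, True), ({"b"}, False)]
mapped = [kv for ex in examples for kv in mapper(ex)]
reduced = dict(reducer(k, v) for k, v in shuffle(mapped).items())
# reduced == {('a', 'pos'): 1, ('b', 'pos'): 1, ('b', 'neg'): 1}
```

Because the class is part of the key, no reduce call ever mixes positive and negative counts, which mirrors the claimed property that each worker receives examples of only one class.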
25. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- receiving, at a computerized machine learning system, a plurality of examples, separable by feature into at least two classes, for distribution to a plurality of workers in a mapreduce process, each worker only receiving examples associated with a first class or a second class, wherein the first class is a positive class and the second class is a negative class, and wherein a worker is selected from the group consisting of a mapper and a reducer;
- determining whether each example is either associated with the first class or associated with the second class;
- distributing an example associated with the first class to a first worker of the plurality of workers in the machine learning system, the first worker receiving only examples associated with the first class; and
- distributing an example associated with the second class to a second worker of the plurality of workers in the machine learning system, the second worker receiving only examples associated with the second class.

(Dependent claims 26-36.)
Specification