PREDICTING BEHAVIOR USING FEATURES DERIVED FROM STATISTICAL INFORMATION
First Claim
1. A method, performed by one or more computing devices, for generating a prediction model, comprising:
receiving a master dataset that provides plural training examples, each training example being associated with:
  one or more aspects of an event, and corresponding one or more aspect values; and
  a label associated with the event;
for a particular aspect, producing plural partitions based on a partitioning strategy, the plural partitions being associated with plural respective subsets of aspect values;
identifying plural subsets of data within the master dataset that pertain to the respective plural partitions, and generating plural instances of statistical information based on the respective subsets of data, the plural instances of statistical information corresponding to feature information that reflects a distribution of labels in the subsets of data;
generating a prediction model based on the feature information and a set of training examples; and
storing the prediction model in a data store.
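The steps recited in this claim can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the hash-based partitioning strategy, the partition count, the smoothed positive rate used as feature information, the one-feature logistic regression, the dictionary used as a data store, and all names (partition_of, build_statistics, feature, train_logistic) are hypothetical choices made for the example. For brevity the sketch also reuses the master dataset as the set of training examples, which the claim does not require.

```python
import json
import math
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed value; the claim does not fix a partition count


def partition_of(aspect_value, num_partitions=NUM_PARTITIONS):
    """Assumed partitioning strategy: a stable hash of the aspect value."""
    return zlib.crc32(aspect_value.encode("utf-8")) % num_partitions


def build_statistics(master, aspect):
    """Count labels per partition, pooling every aspect value that maps to it."""
    counts = defaultdict(lambda: [0, 0])  # partition -> [negatives, positives]
    for example in master:
        counts[partition_of(example["aspects"][aspect])][example["label"]] += 1
    return dict(counts)


def feature(stats, partition):
    """Feature information: smoothed share of positive labels in the partition."""
    neg, pos = stats.get(partition, (0, 0))
    return (pos + 1) / (neg + pos + 2)


def train_logistic(xs, ys, epochs=200, lr=0.5):
    """Tiny one-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            pred = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (pred - y) * x
            grad_b += pred - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return {"w": w, "b": b}


# Made-up master dataset: one aspect ("user_id") and a binary label per event.
master = [
    {"aspects": {"user_id": "alice"}, "label": 1},
    {"aspects": {"user_id": "alice"}, "label": 1},
    {"aspects": {"user_id": "bob"}, "label": 0},
    {"aspects": {"user_id": "carol"}, "label": 0},
    {"aspects": {"user_id": "dave"}, "label": 1},
    {"aspects": {"user_id": "erin"}, "label": 0},
]

stats = build_statistics(master, "user_id")
xs = [feature(stats, partition_of(e["aspects"]["user_id"])) for e in master]
ys = [e["label"] for e in master]
model = train_logistic(xs, ys)

# Stand-in for "storing the prediction model in a data store".
data_store = {"prediction_model": json.dumps(model)}
```

Because several user IDs can fall into the same partition, each feature value summarizes the label distribution of a group of aspect values rather than a single one.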
Abstract
A training system is described herein for generating a prediction model that relies on a feature space with reduced dimensionality. The training system performs this task by producing partitions, each of which corresponds to a subset of aspect values (where each aspect value, in turn, may correspond to one or more attribute values). The training system then produces instances of statistical information associated with the partitions. Each instance of statistical information therefore corresponds to feature information that applies to a plurality of aspect values, rather than a single aspect value. The training system then trains the prediction model based on the feature information. Also described herein is a prediction module that uses the prediction model to make predictions in various online contexts.
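The dimensionality reduction described in the abstract can be illustrated with a small count. The sketch below assumes a composite aspect value built from two attribute values (a user ID and an ad ID) and a hash-based partitioning strategy; both are hypothetical choices for the example, as are the names aspect_value and partition_of and the figure of 64 partitions.

```python
import zlib
from itertools import product


def aspect_value(user_id, ad_id):
    """One aspect value built from two attribute values (assumed encoding)."""
    return f"{user_id}|{ad_id}"


def partition_of(value, num_partitions):
    """Assumed partitioning strategy: a stable hash of the aspect value."""
    return zlib.crc32(value.encode("utf-8")) % num_partitions


users = [f"user{i}" for i in range(100)]
ads = [f"ad{j}" for j in range(50)]
values = [aspect_value(u, a) for u, a in product(users, ads)]

# One feature per distinct aspect value: 100 * 50 = 5000 dimensions.
per_value_dims = len(set(values))

# One instance of statistical information per partition: at most 64 dimensions.
num_partitions = 64
partition_dims = len({partition_of(v, num_partitions) for v in values})

print(per_value_dims, partition_dims)
```

The number of partitions, not the number of distinct aspect values, bounds the size of the feature space in this sketch.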
20 Claims
1. A method, performed by one or more computing devices, for generating a prediction model, comprising:
receiving a master dataset that provides plural training examples, each training example being associated with:
  one or more aspects of an event, and corresponding one or more aspect values; and
  a label associated with the event;
for a particular aspect, producing plural partitions based on a partitioning strategy, the plural partitions being associated with plural respective subsets of aspect values;
identifying plural subsets of data within the master dataset that pertain to the respective plural partitions, and generating plural instances of statistical information based on the respective subsets of data, the plural instances of statistical information corresponding to feature information that reflects a distribution of labels in the subsets of data;
generating a prediction model based on the feature information and a set of training examples; and
storing the prediction model in a data store.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer readable storage medium for storing computer readable instructions, the computer readable instructions providing a training system when executed by one or more processing devices, the computer readable instructions comprising:
logic configured to receive a master dataset that provides plural training examples;
logic configured to produce plural partitions based on a partitioning strategy;
logic configured to:
  identify plural subsets of data within the master dataset that pertain to the respective plural partitions; and
  generate plural instances of statistical information based on the respective subsets of data, the plural instances of statistical information corresponding to feature information; and
logic configured to train a prediction model based on the feature information and a set of training examples.
Dependent claims: 14, 15, 16, 17
18. One or more computing devices for implementing a prediction module, comprising:
a data store that provides at least one lookup data structure, each lookup data structure being associated with a particular aspect of events, and each lookup data structure identifying:
  plural partitions; and
  plural instances of statistical information associated with the plural partitions, each instance of statistical information applying to a subset of aspect values;
a feature lookup module configured to:
  receive input information associated with a new event;
  identify one or more aspect values associated with the input information; and
  identify, using said at least one lookup data structure, statistical information that is associated with the new event, based on said one or more aspect values, to produce identified statistical information; and
a prediction generation module configured to generate a prediction based on the identified statistical information.
Dependent claims: 19, 20
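The prediction module of claim 18 can be sketched as three small components. This is an illustrative reading only: the class names (LookupTable, FeatureLookupModule, PredictionGenerationModule), the hash-based partitioning, the (negatives, positives) count pairs used as statistical information, the smoothed positive rate, and the logistic scoring with hand-picked weights are all assumptions introduced for the example, not details taken from the claim.

```python
import math
import zlib
from dataclasses import dataclass, field


@dataclass
class LookupTable:
    """Lookup data structure for one aspect: partitions plus their statistics."""
    aspect: str
    num_partitions: int
    stats: dict = field(default_factory=dict)  # partition -> (negatives, positives)

    def partition_of(self, aspect_value):
        # Assumed partitioning strategy: a stable hash of the aspect value.
        return zlib.crc32(aspect_value.encode("utf-8")) % self.num_partitions

    def lookup(self, aspect_value):
        # Unseen partitions fall back to empty counts.
        return self.stats.get(self.partition_of(aspect_value), (0, 0))


class FeatureLookupModule:
    def __init__(self, tables):
        self.tables = tables

    def identify(self, event):
        """Map a new event's aspect values to partition-level features."""
        features = []
        for table in self.tables:
            neg, pos = table.lookup(event[table.aspect])
            features.append((pos + 1) / (neg + pos + 2))  # smoothed positive rate
        return features


class PredictionGenerationModule:
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias

    def predict(self, features):
        """Logistic score over the identified statistical information."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))


table = LookupTable(aspect="user_id", num_partitions=8)
table.stats[table.partition_of("alice")] = (2, 8)  # made-up counts for illustration

lookup_module = FeatureLookupModule([table])
predictor = PredictionGenerationModule(weights=[2.0], bias=-1.0)  # hypothetical parameters

features = lookup_module.identify({"user_id": "alice"})
probability = predictor.predict(features)
```

With these made-up counts the single feature is (8 + 1) / (2 + 8 + 2) = 0.75, and the logistic score is about 0.62. Any other user ID that hashes to the same partition would receive the same feature, which is the sense in which each instance of statistical information applies to a subset of aspect values.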
Specification