METHOD AND SYSTEM FOR PREDICTING CONSUMER BEHAVIOR
Abstract
A method of predicting consumer response to given content. The process begins with the step of collecting a dataset of consumer response to the content, each data item including values for a selected set of segmentation variables related to past consumer behavior. The dataset contains at least twice the number of entries required to provide statistical validity. The process continues by constructing a classification tree structure using the dataset, in which the dataset is subdivided into learning and validation datasets of substantially equal size. Also, the criterion for each successive split is the lowest entropy of segmentation variables not employed to the point of such split. Each successive split of the learning dataset is performed only if that split produces child nodes statistically different from one another, and an identical split of the validation data set produces child nodes statistically similar to child nodes produced on the learning dataset. The system estimates consumer responses by first receiving a data item related to a new consumer, including values for the segmentation variables and then computing the likely response of the new consumer to the content, employing the classification tree data structure.
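The dataset preparation described above can be illustrated with a minimal sketch. It assumes random assignment as the basis for the division (one choice that is independent of both the response and the segmentation variables, as claim 18 requires), and the name `min_valid_n` is a hypothetical parameter standing for the number of entries required for statistical validity; the source does not state how that number is obtained.

```python
import random

def split_learning_validation(records, min_valid_n, seed=0):
    """Divide records into learning and validation sets of near-equal size.

    min_valid_n is an assumed, caller-supplied sample size needed for
    statistical validity; the dataset must hold at least twice that many.
    """
    if len(records) < 2 * min_valid_n:
        raise ValueError("dataset must contain at least 2 * min_valid_n entries")
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # random assignment (an assumption)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]
```

Requiring twice the valid sample size means each half is still large enough to be used on its own, which is what allows the validation set to act as an independent check on every split.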
18 Claims
1. Method of predicting consumer response to given content, including the steps of
collecting a dataset of consumer response to the content, each data item including values for a selected set of segmentation variables related to past consumer behavior and the dataset containing at least twice the number of entries to provide statistical validity;
constructing a classification tree structure using the dataset, wherein the dataset is subdivided into learning and validation datasets of substantially equal size;
the criterion for each successive split is the lowest entropy of segmentation variables not employed to the point of such split; and
each successive split of the learning dataset is performed only if such split produces child nodes statistically different from one another; and
an identical split of the validation data set produces child nodes statistically similar to child nodes produced on the learning dataset;
receiving a data item related to a new consumer, including values for the segmentation variables;
computing the likely response of the new consumer to the content, employing the classification tree data structure.
Dependent claims: 2, 3, 4, 5, 6.
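The lowest-entropy split criterion recited in claim 1 might be sketched as follows. The claim does not give an entropy formula, so this example assumes Shannon entropy of the response, weighted by child-node size; the record layout (dictionaries with a `response` key) and the function names are likewise illustrative assumptions.

```python
import math
from collections import Counter

def entropy(responses):
    """Shannon entropy, in bits, of a list of response labels."""
    n = len(responses)
    if n == 0:
        return 0.0
    counts = Counter(responses)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def split_entropy(rows, variable, value):
    """Size-weighted entropy of the two children of a binary split."""
    left = [r["response"] for r in rows if r[variable] == value]
    right = [r["response"] for r in rows if r[variable] != value]
    n = len(rows)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)

def best_split(rows, unused_variables):
    """Return (entropy, variable, value) for the lowest-entropy candidate.

    Only variables not yet employed on the path to this node are tried.
    """
    candidates = []
    for var in unused_variables:
        for val in sorted({r[var] for r in rows}):
            candidates.append((split_entropy(rows, var, val), var, val))
    return min(candidates)
```

A split that separates responders from non-responders perfectly has weighted entropy zero, so it is preferred over any split that leaves the children mixed.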
7. Method of predicting consumer response to given content presented in connection with viewing a website on the internet, including the steps of
collecting a dataset of consumer response to the content, each data item including values for a selected set of segmentation variables related to past consumer internet behavior, the dataset containing at least twice the number of entries to provide statistical validity;
constructing a classification tree structure using the dataset, wherein the dataset is subdivided into learning and validation datasets of substantially equal size;
the criterion for each successive split is the lowest entropy of segmentation variables not employed to the point of such split; and
each successive split of the learning dataset is performed only if such split produces child nodes statistically different from one another; and
an identical split of the validation data set produces child nodes statistically similar to child nodes produced on the learning dataset;
receiving a data item related to a new internet consumer, including values for the segmentation variables;
computing the likely response of the new consumer to the content, employing the classification tree data structure.
Dependent claims: 8, 9, 10, 11, 12.
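The final step of claim 7, computing the likely response of a new internet consumer from the tree, could look like the sketch below. The nested-dictionary node layout, the `leaf_rate` field, and segmentation variables such as `referrer` and `returning` are hypothetical choices made for illustration; the claim does not prescribe a representation.

```python
def predict_response(node, consumer):
    """Walk a binary classification tree and return the leaf's response rate.

    Internal nodes hold a segmentation variable and a value to test;
    leaves hold the response rate observed for that segment.
    """
    while "leaf_rate" not in node:
        matches = consumer[node["variable"]] == node["value"]
        node = node["match"] if matches else node["other"]
    return node["leaf_rate"]

# Example tree: each level splits on a variable not used above it.
example_tree = {
    "variable": "referrer", "value": "search",
    "match": {"leaf_rate": 0.09},
    "other": {
        "variable": "returning", "value": True,
        "match": {"leaf_rate": 0.05},
        "other": {"leaf_rate": 0.01},
    },
}
```

Calling `predict_response(example_tree, {"referrer": "email", "returning": True})` follows the `other` branch at the root and the `match` branch below it, returning the rate stored at that leaf.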
13. A classification tree data structure useful for predicting consumer response to given content, wherein the tree structure is constructed by a process including the steps of
subdividing the dataset into learning and validation datasets of substantially equal size;
determining each successive split based on the lowest entropy of segmentation variables not employed to the point of such split; and
performing successive split of the learning dataset only if such split produces child nodes statistically different from one another; and
an identical split of the validation data set produces child nodes statistically similar to child nodes produced on the learning dataset.
Dependent claims: 14, 15, 16, 17.
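The two conditions for adopting a split in claim 13 can be sketched as a single acceptance check. The claims do not name a statistical test, so this example assumes a two-proportion z-test with a critical value of 1.96 (roughly a 5% two-sided level), and it treats "statistically similar" as failing to detect a difference, which is a simplification. All function and parameter names are illustrative.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic for the difference between two response proportions."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # both nodes are all-responders or all-non-responders
    return (hits_a / n_a - hits_b / n_b) / se

def accept_split(learn_left, learn_right, valid_left, valid_right, z_crit=1.96):
    """Each argument is a (responders, node_size) pair for one child node.

    Accept only if the learning children differ from one another AND each
    validation child resembles its learning counterpart.
    """
    differ = abs(two_proportion_z(*learn_left, *learn_right)) > z_crit
    similar = all(
        abs(two_proportion_z(*learn, *valid)) <= z_crit
        for learn, valid in ((learn_left, valid_left), (learn_right, valid_right))
    )
    return differ and similar
```

A split that looks strong on the learning set but fails to reproduce on the validation set is rejected, which is how the validation half guards against splits that merely fit noise.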
18. Method of predicting consumer response to given content, including the steps of
assembling a library of binary tree tools, including the steps of building a consumer response dataset, including the steps of exposing consumers to selected content;
collecting each consumer response, measured as a value of a response variable;
collecting consumer segmentation characteristics, measured as values of each of a set of consumer segmentation variables;
continuing the collection until the dataset consists of at least twice the number of data items required for a statistically valid sample;
dividing the dataset into a learning set and a validation set, based on a variable independent of either the response variable or any segmentation variable, the datasets being substantially equal in size and each being sufficiently large to provide statistical reliability;
constructing a binary tree by successively splitting nodes, each splitting step including the steps of employing the learning dataset to obtain a proposed split, including splitting the node hypothetically, based on each value of each segmentation variable;
calculating the entropy of each hypothetical split;
choosing the split having the minimum entropy as the proposed split;
performing a statistical test on the resulting nodes to determine whether they differ statistically;
collapsing the proposed split in the event no difference is found;
validating the proposed split, including replicating the proposed split on the validation dataset;
performing a statistical test on the resulting nodes to determine whether they are statistically similar to like nodes of the proposed split;
collapsing the proposed split in the event that no similarity is found;
continuing the tree construction process, with each successive split employing only those segmentation variables not employed in an adopted split;
receiving data concerning an individual consumer, including values for the set of segmentation variables;
determining the most appropriate content to present to the consumer, including the steps of obtaining a value for the consumer dataset for each binary tree tool in the library; and
selecting the content associated with the binary tree tool producing the highest response value.
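The content-selection steps that close claim 18 can be illustrated with a short sketch. It assumes the library is a mapping from a content identifier to a callable that returns the predicted response value for a consumer; that mapping, and the names `select_content`, `banner_a`, and `banner_b`, are hypothetical and stand in for whatever binary tree tools the library actually holds.

```python
def select_content(library, consumer):
    """Score the consumer with every tree in the library and pick the best.

    library: dict mapping content id -> function(consumer) -> response value.
    Returns (content_id, response_value) for the highest-scoring content.
    """
    scores = {content_id: predict(consumer) for content_id, predict in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Usage with stand-in predictors in place of real binary trees.
example_library = {
    "banner_a": lambda c: 0.12 if c["returning"] else 0.03,
    "banner_b": lambda c: 0.08,
}
```

With this library, a returning consumer would be shown `banner_a` and a first-time consumer `banner_b`, since each gets the content whose tree predicts the highest response for that consumer's segment.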
Specification