Methods for estimating the seasonality of groups of similar items of commerce data sets based on historical sales data values and associated error information
First Claim
1. A computer-based method of clustering data sets comprising executing on a computer the steps of:
receiving a first set of data containing a plurality of data points, each of which is expressed as a value and an associated error;
determining a distance between the first set of data and each of one or more other sets of data, where each of those other data sets contains a plurality of data points, each of which is expressed as a value and an associated error, the determining step including measuring each distance as a function, at least in part, of the error associated with one or more data points in each of the sets of data for which the distance is determined;
clustering the first set of data with at least one of the other sets of data, the clustering step including comparing a threshold with one or more distances determined in the determining step; and
generating a composite data set as a function of data points contained in the first set of data and the one or more other sets of data, if any, whose distances compared favorably with the threshold.
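The steps recited in the claim can be illustrated with a minimal sketch. The claim does not fix a distance formula, a threshold rule, or a way of forming the composite, so everything below is an assumption chosen for illustration: the distance is a root-mean-square difference scaled by the combined errors, "compares favorably" is read as "less than or equal to the threshold", and the composite is an inverse-variance weighted average. The function names are hypothetical, and all errors are assumed to be strictly positive.

```python
import math

def error_weighted_distance(a, b):
    """Distance between two data sets given as equal-length lists of
    (value, error) pairs. Assumed form: each squared difference is divided
    by the sum of the two squared errors, so noisy points count for less."""
    total = 0.0
    for (va, ea), (vb, eb) in zip(a, b):
        total += (va - vb) ** 2 / (ea ** 2 + eb ** 2)
    return math.sqrt(total / len(a))

def composite(series_list):
    """Combine data sets point by point with inverse-variance weights
    (an assumed choice; the claim only requires 'a function of' the points)."""
    n_points = len(series_list[0])
    combined = []
    for i in range(n_points):
        weights = [1.0 / s[i][1] ** 2 for s in series_list]
        weight_sum = sum(weights)
        value = sum(w * s[i][0] for w, s in zip(weights, series_list)) / weight_sum
        combined.append((value, math.sqrt(1.0 / weight_sum)))
    return combined

def cluster_with_first(first, others, threshold):
    """Keep the other data sets whose distance to `first` is at most
    `threshold`, then build the composite of `first` and those sets.
    Returns (composite, number_of_other_sets_clustered)."""
    members = [first]
    for other in others:
        if error_weighted_distance(first, other) <= threshold:
            members.append(other)
    return composite(members), len(members) - 1

# Hypothetical data: two points per set, each a (value, error) pair.
first = [(1.0, 0.1), (2.0, 0.1)]
near = [(1.1, 0.1), (2.1, 0.1)]
far = [(5.0, 0.1), (9.0, 0.1)]
result, n_clustered = cluster_with_first(first, [near, far], threshold=2.0)
```

In this example only `near` falls within the threshold, so the composite is built from `first` and `near` alone. If no other set qualifies, the composite reduces to `first` itself, which matches the "if any" wording of the final step.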
1 Assignment
0 Petitions
Abstract
A set of data is received containing values associated with respective data points, the values associated with each of the data points being characterized by a distribution. The values for each of the data points are expressed in a form that includes information about a distribution of the values for each of the data points. The distribution information is used in clustering the set of data with at least one other set of data containing values associated with data points.
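One way to express a data point "in a form that includes information about a distribution" is as a mean together with its standard error. The sketch below is only an illustration under that assumption; the abstract does not say how the distribution information is derived, and the function name and the sample figures are hypothetical.

```python
import math
import statistics

def summarize(observations):
    """Reduce repeated observations of one data point to (value, error),
    where value is the sample mean and error is the standard error of
    the mean. Requires at least two observations."""
    n = len(observations)
    mean = statistics.mean(observations)
    std_err = statistics.stdev(observations) / math.sqrt(n)
    return mean, std_err

# Hypothetical example: sales of one item in one week across three stores.
value, error = summarize([10, 12, 14])
```

A sequence of such pairs, one per time period, gives a data set in the value-and-error form that the clustering operates on.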
85 Citations
11 Claims
Specification