METHOD AND COMPUTER PROGRAM PRODUCT FOR USING DATA MINING TOOLS TO AUTOMATICALLY COMPARE AN INVESTIGATED UNIT AND A BENCHMARK UNIT
First Claim
1. A method of comparing an investigated entity to a reference entity, the method comprising:
augmenting a plurality of data points that correspond to a variable or characteristic of the investigated entity or the reference entity by creating a target variable whose value is indicative of whether the respective data point is associated with the investigated entity or the reference entity;
performing logistic regression upon the augmented data points with the target variable used as a dependent variable in performing the logistic regression;
receiving from the logistic regression a plurality of standardized values of regression coefficients for the submitted variables; and
identifying variables whose standardized values exceed a specified threshold and are thereby considered significant.
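The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that a "standardized value" means the coefficient obtained when each predictor is rescaled to zero mean and unit variance (a Wald z-statistic is another plausible reading), it fits the regression by plain gradient ascent, and the function names, the threshold of 1.0, the variable names, and the sample data are all hypothetical.

```python
import math

def standardize(column):
    """Rescale a column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n) or 1.0  # guard constant columns
    return [(v - mean) / sd for v in column]

def fit_logistic(X, y, steps=5000, lr=0.1):
    """Fit logistic regression by batch gradient ascent.

    Returns [intercept, b1, b2, ...]. A simple stand-in for a statistics package.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, target in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            err = target - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def significant_variables(investigated, reference, names, threshold=1.0):
    """Flag variables whose standardized coefficient exceeds the threshold in magnitude."""
    rows = investigated + reference
    # Augment: target is 1 for the investigated entity, 0 for the reference entity.
    y = [1] * len(investigated) + [0] * len(reference)
    cols = [standardize([r[j] for r in rows]) for j in range(len(names))]
    X = [list(vals) for vals in zip(*cols)]
    beta = fit_logistic(X, y)
    return {name: b for name, b in zip(names, beta[1:]) if abs(b) > threshold}

# Invented toy data: each row is (handle_time, order_size) for one transaction.
investigated = [(12, 5), (14, 3), (11, 4), (15, 6), (9, 5), (13, 4)]
reference = [(8, 4), (10, 5), (7, 6), (9, 3), (12, 5), (6, 4)]
flagged = significant_variables(investigated, reference, ["handle_time", "order_size"])
print(flagged)
```

In this toy data only handle_time differs systematically between the two groups, so it is the only variable expected to pass the threshold. In practice a statistics package would report the coefficients, and the threshold would depend on which standardization is used.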
Abstract
Sources of operational problems in business transactions often show themselves in relatively small pockets of data, which are called trouble hot spots. Identifying these hot spots from internal company transaction data is generally a fundamental step in the problem's resolution, but this analysis process is greatly complicated by huge numbers of transactions and large numbers of transaction variables to analyze. A suite of practical modifications to data mining techniques and logistic regressions is provided to tailor them for finding trouble hot spots. This approach allows efficient automated data mining tools to quickly screen large numbers of candidate variables for their ability to characterize hot spots. One application is the screening of variables that distinguish a suspected hot spot from a reference set.
Specification