Method and computer program product for using data mining tools to automatically compare an investigated unit and a benchmark unit

  • US 8,306,997 B2
  • Filed: 05/27/2011
  • Issued: 11/06/2012
  • Est. Priority Date: 12/05/2005
  • Status: Expired due to Fees
First Claim

1. A processor implemented method of comparing an investigated entity to a reference entity, the method comprising:

  • augmenting via a processor a plurality of data points that correspond to variables of an investigated entity or a reference entity by creating a target variable whose value is indicative of whether the respective data point is associated with the investigated entity or the reference entity;

    structuring the plurality of data points according to at least one of a plurality of preprocessing modalities, wherein the plurality of preprocessing modalities include trimming outlier data points and transforming variables to near symmetry, standardizing variables, and screening variables, wherein screening variables comprises performing decision tree analysis on the plurality of data points to identify variables having an effect on the target variable;

    performing via the processor logistic regression upon the augmented data points with the target variable used as a dependent variable in performing the logistic regression;

    receiving via the processor from the logistic regression a plurality of standardized values of regression coefficients for the variables;

    ranking each of the variables corresponding to the plurality of augmented data points in order of one of a plurality of test statistic types;

    identifying via the processor significant variables whose standardized values exceed a specified threshold and are thereby considered significant;

    generating via the processor at least one interaction variable between a first significant variable and a second significant variable of the identified significant variables; and

    identifying via the processor based on the at least one interaction variable second significant variable values for which interaction variable values exceed the specified threshold.
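
For illustration only, the steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`augment`, `standardize`, `fit_logistic`), the synthetic factorial data, the threshold of 0.5, and the small L2 penalty are all hypothetical. Coefficients fitted on z-scored variables stand in for the claimed "standardized values of regression coefficients", the ranking uses absolute coefficient size as the test statistic, and the final step follows one literal reading of the claim language. Trimming, symmetry transforms, and decision-tree screening are omitted.

```python
import math
from itertools import product

def augment(investigated, reference):
    """Pool both entities' rows; target is 1 for investigated, 0 for reference."""
    rows = [list(r) for r in investigated] + [list(r) for r in reference]
    target = [1] * len(investigated) + [0] * len(reference)
    return rows, target

def standardize(rows):
    """Z-score every column (one of the claimed preprocessing modalities)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

def fit_logistic(X, y, steps=500, lr=0.5, l2=0.01):
    """Gradient descent on log-loss with a small L2 penalty on the slopes
    (the penalty is an assumption added for numerical stability)."""
    n, k = len(X), len(X[0])
    b0, b = 0.0, [0.0] * k
    for _ in range(steps):
        g0, g = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            z = b0 + sum(bj * xj for bj, xj in zip(b, xi))
            z = max(-30.0, min(30.0, z))        # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            g0 += err
            for j in range(k):
                g[j] += err * xi[j]
        b0 -= lr * g0 / n
        b = [bj - lr * (gj / n + l2 * bj) for bj, gj in zip(b, g)]
    return b0, b

# Hypothetical data: a balanced grid of three variables for each entity.
# Variables 0 and 1 are shifted upward for the investigated entity; variable 2 is not.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
noise = [-1.5, -0.5, 0.5, 1.5]
reference = [(a, b, c) for a, b, c in product(levels, levels, noise)]
investigated = [(a + 1.0, b + 1.0, c) for a, b, c in product(levels, levels, noise)]

rows, target = augment(investigated, reference)
X = standardize(rows)
intercept, coefs = fit_logistic(X, target)      # target is the dependent variable

threshold = 0.5                                 # assumed significance cutoff
ranking = sorted(range(len(coefs)), key=lambda j: -abs(coefs[j]))
significant = [j for j in range(len(coefs)) if abs(coefs[j]) > threshold]

# Interaction variable between the first two significant variables.
first, second = significant[0], significant[1]
interaction = [row[first] * row[second] for row in X]

# One literal reading of the last step: values of the second significant
# variable on rows where the interaction variable exceeds the threshold.
flagged = sorted({round(row[second], 3)
                  for row, v in zip(X, interaction) if v > threshold})
```

Because the unshifted variable is exactly balanced across both entities, its fitted coefficient stays near zero and only the two shifted variables pass the threshold. In practice a statistics package that reports standard errors would let the ranking use a Wald or chi-square statistic instead of raw coefficient size.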
