Method for identifying outliers in large data sets

  • US 6,643,629 B2
  • Filed: 11/18/1999
  • Issued: 11/04/2003
  • Est. Priority Date: 11/18/1999
  • Status: Expired due to Fees
First Claim

1. A computer implemented method embedded in a recordable media for identifying a predetermined number of outliers of interest in a data set, comprising the steps of:

  • partitioning a plurality of data points in the data set into a plurality of partitions;

    computing lower and upper bounds for each one of the plurality of partitions to identify those partitions that cannot possibly contain the predetermined number of outliers of interest;

    identifying a plurality of candidate partitions from the plurality of partitions, wherein each one of the plurality of candidate partitions possibly contains at least one of a predetermined number of outliers of interest, wherein the predetermined number of outliers of interest are included within the plurality of data points in the data set; and

    identifying the predetermined number of outliers of interest from the plurality of candidate partitions wherein the data set has N data points and the predetermined number of outliers of interest is n which each have k neighboring data points, a data point p is one of the predetermined number of outliers of interest n if no more than n−1 other points in the data set reside at greater distances from the k neighboring data point than data point p.
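
The outlier criterion in the final clause of the claim can be illustrated with a minimal brute-force sketch. It assumes Euclidean distance and ranks every point by the distance to its k-th nearest neighbor, returning the n points with the largest such distance. It does not attempt the partitioning or bound-computation steps of the claim, and the function names `kth_neighbor_distance` and `top_n_outliers` are hypothetical, chosen only for illustration.

```python
import math


def kth_neighbor_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor (Euclidean, assumed)."""
    p = points[i]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def top_n_outliers(points, k, n):
    """Return the n points with the largest k-th nearest neighbor distance.

    A point p qualifies when no more than n - 1 other points have a
    greater k-th nearest neighbor distance than p. Ties are broken by
    input order, which is an assumption of this sketch.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scored = [(kth_neighbor_distance(points, i, k), i) for i in range(len(points))]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [points[i] for _, i in scored[:n]]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_n_outliers(data, k=2, n=1))
```

This brute-force version compares every pair of points, so its cost grows quadratically with N. The partitioning and bounding steps recited in the claim exist to avoid that cost by discarding partitions that cannot contain any of the n outliers.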
