Methods and apparatus for outlier detection for high dimensional data sets
Abstract
Methods and apparatus are provided for outlier detection in databases by determining sparse low dimensional projections. These sparse projections are used for the purpose of determining which points are outliers. The methodologies of the invention are very relevant in providing a novel definition of exceptions or outliers for the high dimensional domain of data.
30 Claims
1. A computer implemented method of detecting one or more outliers in a data set, comprising the steps of:
determining, via computer, one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
determining, via computer, one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
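The two steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes equal-width ranges over a caller-supplied interval, a brute-force scan over every k-dimensional projection, and a simple count threshold as the test for "sparse in density". The names `range_index`, `sparse_projection_outliers`, `phi`, `k`, and `max_count` are hypothetical and do not appear in the claims.

```python
from collections import defaultdict
from itertools import combinations


def range_index(value, lo, hi, phi):
    """Index (0..phi-1) of the equal-width range of [lo, hi] that holds value."""
    idx = int((value - lo) / (hi - lo) * phi)
    return min(max(idx, 0), phi - 1)  # clamp so value == hi lands in the last range


def sparse_projection_outliers(points, lo, hi, phi=3, k=2, max_count=1):
    """Return (outlier_indices, sparse_cells).

    A cell is a pair (dims, ranges): k dimensions plus one range index per
    dimension. Assumption: a cell counts as sparse when it is occupied by
    at most max_count points. Points inside a sparse cell are outliers.
    """
    n_dims = len(points[0])
    outliers, sparse_cells = set(), []
    for dims in combinations(range(n_dims), k):
        # Step 1: group points by the ranges they fall into on these dimensions.
        members = defaultdict(list)
        for i, p in enumerate(points):
            ranges = tuple(range_index(p[j], lo, hi, phi) for j in dims)
            members[ranges].append(i)
        # Step 2: points that fall in a sparse cell are reported as outliers.
        for ranges, idxs in members.items():
            if len(idxs) <= max_count:
                sparse_cells.append((dims, ranges))
                outliers.update(idxs)
    return sorted(outliers), sparse_cells


# Illustrative data: three clusters along the diagonal plus one point whose
# dimension 1 disagrees with dimensions 0 and 2.
points = [
    (1, 1, 1), (2, 1, 2), (1, 2, 1),
    (4, 5, 4), (5, 4, 5), (4, 4, 5),
    (8, 8, 8), (7, 8, 7), (8, 7, 8),
    (8, 1, 8),
]
outliers, cells = sparse_projection_outliers(points, lo=0, hi=9)
```

In this data the last point looks ordinary in every single dimension and is isolated only in the two-dimensional projections that include dimension 1, which is the situation the low dimensional projections are meant to expose. The brute-force scan grows combinatorially with the number of dimensions, so it is suitable only as an illustration.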
10. A computer implemented method of detecting one or more outliers in a data set, comprising the steps of:
identifying and mining, via computer, one or more patterns in the data set which have abnormally low presence not due to randomness; and
identifying, via computer, one or more records which have the one or more patterns present in them as the one or more outliers.
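Claim 10 does not state how "abnormally low presence not due to randomness" is measured. One possible reading, sketched below, compares the observed number of records matching a pattern with the number expected if the dimensions were independent, and expresses the gap in standard deviations. The sketch assumes each dimension is split into `phi` ranges of equal population, so that a pattern over `k` dimensions is expected to match a fraction `(1/phi)**k` of the records. The function names and the cutoff of -3 are hypothetical.

```python
from math import sqrt


def sparsity_coefficient(count, n_points, phi, k):
    """Standardized deviation of an observed count from its expected value.

    Assumption: under independence, each record matches a given k-dimensional
    pattern with probability p = (1/phi)**k, so the count is binomial with
    mean n_points * p and variance n_points * p * (1 - p).
    """
    p = 1.0 / phi ** k
    expected = n_points * p
    std = sqrt(n_points * p * (1.0 - p))
    return (count - expected) / std


def is_abnormally_sparse(count, n_points, phi, k, cutoff=-3.0):
    """True when the pattern is far rarer than chance alone would suggest."""
    return sparsity_coefficient(count, n_points, phi, k) < cutoff


# 10,000 records, 10 ranges per dimension, patterns over 2 dimensions:
# about 100 matching records are expected for any one pattern.
typical = sparsity_coefficient(100, 10_000, 10, 2)
rare = sparsity_coefficient(1, 10_000, 10, 2)
```

A pattern matched by roughly the expected number of records scores near zero, while a pattern matched by a single record scores strongly negative and would be flagged. A pattern that is rare only because few records are expected to match it in the first place scores close to zero and is not flagged.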
11. Apparatus for detecting, via computer, one or more outliers in a data set, comprising:
at least one processor operative to:
(i) determine, via computer, one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
(ii) determine, via computer, one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19.
20. Apparatus for detecting, via computer, one or more outliers in a data set, comprising:
at least one processor operative to:
(i) identify and mine, via computer, one or more patterns in the data set which have abnormally low presence not due to randomness; and
(ii) identify, via computer, one or more records which have the one or more patterns present in them as the one or more outliers.
21. An article of manufacture for detecting, via computer, one or more outliers in a data set, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
determining, via computer, one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
determining, via computer, one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29.
30. An article of manufacture for detecting, via computer, one or more outliers in a data set, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
identifying and mining, via computer, one or more patterns in the data set which have abnormally low presence not due to randomness; and
identifying, via computer, one or more records which have the one or more patterns present in them as the one or more outliers.
Specification