Methods and Apparatus for Outlier Detection for High Dimensional Data Sets
Abstract
Methods and apparatus are provided for outlier detection in databases by determining sparse low dimensional projections. These sparse projections are used for the purpose of determining which points are outliers. The methodologies of the invention are very relevant in providing a novel definition of exceptions or outliers for the high dimensional domain of data.
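The abstract does not say how "sparse" is measured. One way to quantify it, offered here as an illustrative assumption and not as the definition used in the specification, is to split each dimension into φ ranges holding equal numbers of points, and to compare the observed count in a k-dimensional cell with the count expected if the dimensions were independent. The symbols N, φ, k, n(D), and S(D) are introduced only for this sketch.

```latex
% Assumptions (not taken from the text above):
%   N     = number of points in the data set
%   \phi  = number of equi-depth ranges per dimension
%   k     = number of dimensions in the projection
%   n(D)  = observed number of points in the k-dimensional cell D
\[
  f = \frac{1}{\phi}, \qquad
  \mathbb{E}[n(D)] = N f^{k}, \qquad
  \sigma(D) = \sqrt{N f^{k}\,\bigl(1 - f^{k}\bigr)}
\]
\[
  S(D) = \frac{n(D) - N f^{k}}{\sqrt{N f^{k}\,\bigl(1 - f^{k}\bigr)}}
\]
% Worked example: N = 10000, \phi = 10, k = 3 gives
% E[n(D)] = 10 and \sigma(D) = \sqrt{9.99} \approx 3.16,
% so a cell holding a single point scores S(D) \approx -2.85.
```

Under this reading, a strongly negative S(D) marks a cell that holds far fewer points than chance alone would suggest, and the points inside such a cell are the candidates for outliers.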
30 Claims
1. A method of detecting one or more outliers in a data set, comprising the steps of:
determining one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
determining one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A method of detecting one or more outliers in a data set, comprising the steps of:
identifying and mining one or more patterns in the data set which have abnormally low presence not due to randomness; and
identifying one or more records which have the one or more patterns present in them as the one or more outliers.
11. Apparatus for detecting one or more outliers in a data set, comprising:
at least one processor operative to:
(i) determine one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
(ii) determine one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19.
20. Apparatus for detecting one or more outliers in a data set, comprising:
at least one processor operative to:
(i) identify and mine one or more patterns in the data set which have abnormally low presence not due to randomness; and
(ii) identify one or more records which have the one or more patterns present in them as the one or more outliers.
21. An article of manufacture for detecting one or more outliers in a data set, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
determining one or more sets of dimensions and corresponding ranges in the data set which are sparse in density; and
determining one or more data points in the data set which contain these sets of dimensions and corresponding ranges, the one or more data points being identified as the one or more outliers in the data set.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29.
30. An article of manufacture for detecting one or more outliers in a data set, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
identifying and mining one or more patterns in the data set which have abnormally low presence not due to randomness; and
identifying one or more records which have the one or more patterns present in them as the one or more outliers.
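The two steps recited in the independent claims (find sparse combinations of dimensions and ranges, then report the points that fall inside them) can be sketched in a few lines of Python. This is a minimal brute-force illustration under stated assumptions, not the claimed implementation: the equi-depth discretization, the normal-approximation score used to judge "abnormally low presence not due to randomness", the exhaustive enumeration of projections, and every name (`equi_depth_bins`, `sparse_cells`, `detect_outliers`, `phi`, `k`, `threshold`) are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

import numpy as np


def equi_depth_bins(X, phi):
    """Assign each value to one of phi ranges holding roughly equal point counts."""
    n, d = X.shape
    bins = np.empty((n, d), dtype=int)
    for j in range(d):
        # Rank of each value within its column (0 .. n-1), ties broken stably.
        ranks = np.argsort(np.argsort(X[:, j], kind="stable"), kind="stable")
        bins[:, j] = ranks * phi // n
    return bins


def sparse_cells(bins, phi, k, threshold):
    """Step 1: find (dimensions, ranges) cells whose occupancy is abnormally low.

    Assumption: a cell is 'sparse' when its count lies at least |threshold|
    standard deviations below the count expected under independent dimensions.
    Only occupied cells are scored, since empty cells contain no points to flag.
    """
    n, d = bins.shape
    p = (1.0 / phi) ** k           # chance that a point lands in a given cell
    expected = n * p
    std = sqrt(n * p * (1.0 - p))
    found = []
    for dims in combinations(range(d), k):
        counts = Counter(map(tuple, bins[:, list(dims)].tolist()))
        for cell, count in counts.items():
            score = (count - expected) / std
            if score <= threshold:
                found.append((dims, cell, score))
    return found


def detect_outliers(X, phi=4, k=2, threshold=-3.0):
    """Step 2: report the indices of points lying in any sparse cell."""
    bins = equi_depth_bins(X, phi)
    outliers = set()
    for dims, cell, _score in sparse_cells(bins, phi, k, threshold):
        mask = np.all(bins[:, list(dims)] == np.array(cell), axis=1)
        outliers.update(np.flatnonzero(mask).tolist())
    return sorted(outliers)


if __name__ == "__main__":
    # 200 points on the diagonal y = x, plus two points that look ordinary in
    # each dimension separately but are unusual in the joint 2-D projection.
    diagonal = [[v, v] for v in range(200)]
    X = np.array(diagonal + [[10.5, 189.5], [189.5, 10.5]], dtype=float)
    print(detect_outliers(X, phi=4, k=2, threshold=-3.0))
```

Exhaustive enumeration grows combinatorially with the number of dimensions and with `k`, so a sketch like this is only practical for small cases; a search heuristic would be needed for genuinely high dimensional data.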
Specification