METHOD AND APPARATUS FOR DISTRIBUTION-INDEPENDENT OUTLIER DETECTION IN STREAMING DATA
Abstract
The present invention relates to an iterative method and an apparatus for distribution-independent detection of intermediate outliers and of outliers in the distribution tail of streamed data. A considerable sequence of streamed data is read sequentially, and each item is assigned to a matching bin. The bins are allocated adaptively: when, where and if they are needed. Each bin range expands concurrently with the distribution range of the items accumulating in the bin, with an added margin. For every N'th read item, overlapping or adjoining bins are merged, whereupon the bins are assessed for insider preclusion. Information regarding outliers is extracted from the remaining outlier bins when the entire data sequence has been processed.
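The adaptive allocation and range expansion described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Bin` and `assign`, the fixed `margin` parameter, and the exact expansion rule are assumptions made for illustration, since the abstract only states that the range expands with the assigned items, "adding a margin".

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """One adaptive bin: a closed range [lo, hi] plus an item count."""
    lo: float
    hi: float
    count: int = 0

    def covers(self, x: float) -> bool:
        return self.lo <= x <= self.hi


def assign(bins: list[Bin], x: float, margin: float) -> Bin:
    """Assign x to a bin whose range comprises x, or allocate a new bin.

    Assumption: the range grows so that every assigned item stays at
    least `margin` inside the bin bounds.
    """
    for b in bins:
        if b.covers(x):
            b.lo = min(b.lo, x - margin)
            b.hi = max(b.hi, x + margin)
            b.count += 1
            return b
    # No existing bin covers x: allocate one where it is needed.
    new_bin = Bin(lo=x - margin, hi=x + margin, count=1)
    bins.append(new_bin)
    return new_bin


bins: list[Bin] = []
assign(bins, 10.0, margin=1.0)   # allocates a bin with range [9.0, 11.0]
assign(bins, 10.5, margin=1.0)   # same bin, upper bound expands to 11.5
assign(bins, 50.0, margin=1.0)   # far from the first bin: a second bin is allocated
```

Under this assumed rule, a bin near a dense region keeps growing outward by one margin at a time, while an isolated item ends up alone in a narrow bin of its own.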
15 Claims
1. A distribution-independent method for iteratively detecting outliers in streamed data comprising the data sequence XW, being sequentially assigned to adaptive bins having expanding range, the method comprising the steps:
reading each item x of XW;
assigning each item x to a bin having a range that comprises item x;
for every N'th read item, merging all bins with overlapping or adjoining ranges;
assessing a bin for insider preclusion;
extracting outlier information when XW has been processed; and
delivering output.
Dependent claims: 2, 3, 4, 5
6. An apparatus (100) for distribution-independent iterative detection of outliers in streamed data comprising the data sequence XW sequentially assigned to adaptive bins having expanding range, said apparatus (100) comprising an input unit (120), a processing unit (140), a cache memory (160) and an output unit (180), where said cache memory (160) is adapted to store and alter data posts representing bins upon request, and where said processing unit (140) is adapted to send such requests to the cache memory (160);
read via the input unit (120) each item x of XW;
assign each item x to a bin having a range that comprises item x;
merge bins with overlapping or adjoining bounds for every N'th read item;
assess bins for insider preclusion;
extract outlier information when the entire XW has been processed; and
deliver the outlier information to the output unit (180).
Dependent claims: 7, 8, 9, 10
11. A computer readable media product including program instructions which when executed by a processor cause the processor to perform a distribution-independent method for iteratively detecting outliers in streamed data comprising the data sequence XW, being sequentially assigned to adaptive bins having expanding range, the method comprising the steps:
reading each item x of XW;
assigning each item x to a bin having a range that comprises item x;
for every N'th read item, merging all bins with overlapping or adjoining ranges;
assessing a bin for insider preclusion;
extracting outlier information when XW has been processed; and
delivering output.
Dependent claims: 12, 13, 14, 15
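The sequence of steps recited in the independent claims (read, assign, merge on every N'th item, assess, extract) can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation. The claims do not define "insider preclusion" in this excerpt, so the sketch assumes that a bin becomes an insider once it holds at least `insider_count` items and stays one thereafter. The function names, the fixed `margin`, the default values, and the final merge at end of stream are likewise hypothetical.

```python
def merge_bins(bins):
    """Merge bins whose ranges overlap or adjoin.

    Each bin is a list [lo, hi, count, insider].
    """
    merged = []
    for lo, hi, count, insider in sorted(bins):
        if merged and lo <= merged[-1][1]:
            prev = merged[-1]
            prev[1] = max(prev[1], hi)
            prev[2] += count
            prev[3] = prev[3] or insider
        else:
            merged.append([lo, hi, count, insider])
    return merged


def assess(bins, insider_count):
    """Assumed preclusion rule: a well-populated bin cannot be an outlier bin."""
    for b in bins:
        if b[2] >= insider_count:
            b[3] = True


def detect_outliers(stream, n=100, margin=1.0, insider_count=10):
    bins = []
    for i, x in enumerate(stream, start=1):
        # Assign x to a bin whose range comprises x, expanding that range.
        for b in bins:
            if b[0] <= x <= b[1]:
                b[0] = min(b[0], x - margin)
                b[1] = max(b[1], x + margin)
                b[2] += 1
                break
        else:
            bins.append([x - margin, x + margin, 1, False])
        # Every n'th read item: merge, then assess for insider preclusion.
        if i % n == 0:
            bins = merge_bins(bins)
            assess(bins, insider_count)
    # Assumption: one last merge and assessment once the stream is exhausted.
    bins = merge_bins(bins)
    assess(bins, insider_count)
    # The remaining non-insider bins carry the outlier information.
    return [(lo, hi, count) for lo, hi, count, insider in bins if not insider]


stream = [float(10 + i % 3) for i in range(30)] + [100.0]
print(detect_outliers(stream, n=10, margin=1.0, insider_count=10))
# prints [(99.0, 101.0, 1)]
```

In this example the thirty items between 10 and 12 accumulate in a single bin that is marked as an insider at the first assessment, while the isolated item 100.0 remains alone in a bin that is reported as an outlier bin.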
Specification