AUTOMATIC ELIMINATION OF NOISE FOR BIG DATA ANALYTICS
First Claim
1. At least one machine readable storage medium having instructions stored thereon, the instructions when executed by a machine to cause the machine to:
identify a set of target features for a plurality of data instances of an input data collection;
determine feature values for the set of target features for the plurality of data instances;
identify a plurality of outlier data instances based on the determined feature values;
identify a plurality of noisy data instances from the outlier data instances based on feature values of the plurality of noisy data instances, wherein a noisy data instance is identified based on a determination that noise is present in noisy data instance; and
provide an indication of the plurality of noisy data instances.
Abstract
A method comprising identifying a set of target features for a plurality of data instances of an input data collection; determining feature values for the set of target features for the plurality of data instances; identifying a plurality of outlier data instances based on the determined feature values; identifying a plurality of noisy data instances from the outlier data instances based on feature values of the plurality of noisy data instances, wherein a noisy data instance is identified based on a determination that noise is present in noisy data instance; and providing an indication of the plurality of noisy data instances.
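The steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not say how outliers are detected or how the presence of noise is determined. The sketch assumes a median/MAD (modified z-score) outlier rule and treats an outlier as noisy when a feature value falls outside a plausible range. The feature names, ranges, threshold, and function names are all hypothetical.

```python
from statistics import median

# Hypothetical target features and plausible ranges; the source names neither.
TARGET_FEATURES = ("temperature", "humidity")
PLAUSIBLE_RANGE = {"temperature": (-50.0, 60.0), "humidity": (0.0, 100.0)}


def feature_values(instance):
    """Determine the value of each target feature for one data instance."""
    return {name: float(instance[name]) for name in TARGET_FEATURES}


def find_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds threshold on any feature.

    The median/MAD rule is an assumption; the claims do not fix a detector.
    """
    outliers = set()
    for name in TARGET_FEATURES:
        column = [v[name] for v in values]
        med = median(column)
        mad = median(abs(x - med) for x in column)
        if mad == 0:
            continue  # no spread on this feature, so nothing stands out
        for i, x in enumerate(column):
            if 0.6745 * abs(x - med) / mad > threshold:
                outliers.add(i)
    return sorted(outliers)


def is_noisy(value):
    """Assumed noise test: some feature value is physically implausible."""
    return any(
        not (PLAUSIBLE_RANGE[n][0] <= value[n] <= PLAUSIBLE_RANGE[n][1])
        for n in TARGET_FEATURES
    )


def find_noisy(collection):
    """Run the steps in order; noisy instances are drawn only from outliers."""
    values = [feature_values(inst) for inst in collection]
    outliers = find_outliers(values)
    noisy = [i for i in outliers if is_noisy(values[i])]
    return outliers, noisy


# Made-up input data collection: index 6 is an unusual but plausible reading,
# index 7 is an implausible one.
temperatures = [20, 21, 19, 22, 20, 21, 45, 999]
humidities = [40, 42, 41, 43, 40, 42, 41, 43]
collection = [
    {"temperature": t, "humidity": h} for t, h in zip(temperatures, humidities)
]

outliers, noisy = find_noisy(collection)
print("outlier indices:", outliers)
print("noisy indices:", noisy)  # the indication of noisy data instances
```

In this sketch the noise test runs only on instances already flagged as outliers, mirroring the order of the recited steps. An outlier that passes the plausibility check is kept as a genuine anomaly and is not reported as noise.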
9 Citations
20 Claims
1. At least one machine readable storage medium having instructions stored thereon, the instructions when executed by a machine to cause the machine to:
identify a set of target features for a plurality of data instances of an input data collection;
determine feature values for the set of target features for the plurality of data instances;
identify a plurality of outlier data instances based on the determined feature values;
identify a plurality of noisy data instances from the outlier data instances based on feature values of the plurality of noisy data instances, wherein a noisy data instance is identified based on a determination that noise is present in noisy data instance; and
provide an indication of the plurality of noisy data instances.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method comprising:
identifying a set of target features for a plurality of data instances of an input data collection;
determining feature values for the set of target features for the plurality of data instances;
identifying a plurality of outlier data instances based on the determined feature values;
identifying a plurality of noisy data instances from the outlier data instances based on feature values of the plurality of noisy data instances, wherein a noisy data instance is identified based on a determination that noise is present in noisy data instance; and
providing an indication of the plurality of noisy data instances.
Dependent claims: 12, 13, 14, 15
16. An apparatus comprising:
a memory to store an input data collection comprising a plurality of data instances; and
a processor coupled to the memory, the processor to:
identify a set of target features for the plurality of data instances of the input data collection;
determine feature values for the set of target features for the plurality of data instances;
identify a plurality of outlier data instances based on the determined feature values;
identify a plurality of noisy data instances from the outlier data instances based on feature values of the plurality of noisy data instances, wherein a noisy data instance is identified based on a determination that noise is present in noisy data instance; and
provide an indication of the plurality of noisy data instances.
Dependent claims: 17, 18, 19, 20
Specification