Methods and apparatus for privacy preserving data mining using statistical condensing approach
Abstract
Methods and apparatus for generating at least one output data set from at least one input data set for use in association with a data mining process are provided. First, data statistics are constructed from the at least one input data set. Then, an output data set is generated from the data statistics. The output data set differs from the input data set but maintains one or more correlations from within the input data set. The correlations may be the inherent correlations between different dimensions of a multidimensional input data set. A significant amount of information from the input data set may be hidden so that the privacy level of the data mining process may be increased.
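The two steps described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the text above does not state: the data statistics are taken to be the mean vector and covariance matrix of a single group of records, and the output records are drawn from a Gaussian with those statistics. The function names (`mean_and_covariance`, `cholesky`, `condense_and_regenerate`, `correlation`) are hypothetical and chosen only for illustration.

```python
import math
import random


def mean_and_covariance(rows):
    """Return the per-column mean vector and the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def cholesky(a):
    """Lower-triangular factor L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def condense_and_regenerate(rows, n_out, rng):
    """Step 1: build statistics from the input. Step 2: sample new records.

    Assumption: Gaussian sampling stands in for whatever generator the
    specification actually uses; only the statistics reach the output.
    """
    mean, cov = mean_and_covariance(rows)
    low = cholesky(cov)
    d = len(mean)
    out = []
    for _ in range(n_out):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        out.append([mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                    for i in range(d)])
    return out


def correlation(rows, i, j):
    """Pearson correlation between columns i and j."""
    _, cov = mean_and_covariance(rows)
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


# Illustrative two-dimensional input whose second column depends on the first.
rng = random.Random(0)
original = []
for _ in range(2000):
    x = rng.gauss(50.0, 10.0)
    original.append([x, 2.0 * x + rng.gauss(0.0, 5.0)])

synthetic = condense_and_regenerate(original, len(original), rng)
print("input correlation: ", correlation(original, 0, 1))
print("output correlation:", correlation(synthetic, 0, 1))
```

In this sketch no input record is copied into the output, yet the correlation between the two dimensions is approximately retained, because the output is generated only from the aggregate statistics.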
25 Claims
1. A method of generating at least one output data set from at least one input data set for use in association with a data mining process, comprising the steps of:
- generating data statistics from the at least one input data set; and
- generating the at least one output data set from the data statistics, wherein the output data set differs from the input data set but maintains one or more correlations from within the input data set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. Apparatus for generating at least one output data set from at least one input data set for use in association with a data mining process, the apparatus comprising:
- a memory; and
- at least one processor coupled to the memory operative to:
(i) generate data statistics from the at least one input data set; and
(ii) generate the at least one output data set from the data statistics, wherein the output data set differs from the input data set but maintains one or more correlations from within the input data set.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24.
25. An article of manufacture for generating at least one output data set from at least one input data set for use in association with a data mining process, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
- generating data statistics from the at least one input data set; and
- generating the at least one output data set from the data statistics, wherein the output data set differs from the input data set but maintains one or more correlations from within the input data set.
Specification