Method and apparatus for privacy preserving data mining by restricting attribute choice
Abstract
Improved techniques for privacy preserving data mining of multidimensional data records are disclosed. For example, a technique for generating at least one output data set from at least one input data set for use in association with a data mining process comprises the following steps/operations. At least one relevant attribute of the at least one input data set is selected through determination of at least one relevance coefficient. The at least one output data set is generated from the at least one input data set, wherein the at least one output data set comprises the at least one relevant attribute of the at least one input data set, as determined by use of the at least one relevance coefficient.
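The two steps in the abstract, selecting attributes by a relevance coefficient and then emitting only those attributes, can be sketched as follows. This is a minimal illustration, not the patented method: this section does not define the relevance coefficient, so the sketch assumes it is the absolute Pearson correlation between a numeric attribute and a numeric class label. The names `relevance`, `select_relevant`, and `threshold` are hypothetical.

```python
from statistics import mean


def relevance(values, labels):
    """Assumed relevance coefficient: |Pearson correlation| of attribute vs. label."""
    mv, ml = mean(values), mean(labels)
    cov = sum((v - mv) * (l - ml) for v, l in zip(values, labels))
    var_v = sum((v - mv) ** 2 for v in values)
    var_l = sum((l - ml) ** 2 for l in labels)
    if var_v == 0 or var_l == 0:
        return 0.0  # a constant attribute or label carries no signal
    return abs(cov) / (var_v * var_l) ** 0.5


def select_relevant(records, labels, threshold):
    """Return (output data set, coefficients), keeping attributes at or above threshold."""
    names = list(records[0].keys())
    # Step 1: determine a relevance coefficient for every attribute.
    coeffs = {n: relevance([r[n] for r in records], labels) for n in names}
    keep = [n for n in names if coeffs[n] >= threshold]
    # Step 2: generate the output data set restricted to the relevant attributes.
    output = [{n: r[n] for n in keep} for r in records]
    return output, coeffs


# Hypothetical input data set: zip_digit is unrelated to the label.
records = [
    {"age": 25, "income": 30, "zip_digit": 7},
    {"age": 35, "income": 50, "zip_digit": 2},
    {"age": 45, "income": 70, "zip_digit": 7},
    {"age": 55, "income": 90, "zip_digit": 2},
]
labels = [0, 0, 1, 1]
output, coeffs = select_relevant(records, labels, threshold=0.5)
```

In this toy input the `zip_digit` attribute is dropped, so the output data set exposes fewer attributes per record than the input does.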
23 Claims
1. A method of generating at least one output data set from at least one input data set for use in association with a data mining process, comprising the steps of:
selecting at least one relevant attribute of the at least one input data set through determination of at least one relevance coefficient; and
generating the at least one output data set from the at least one input data set wherein the at least one output data set comprises the at least one relevant attribute of the at least one input data set, as determined by use of the at least one relevance coefficient.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. Apparatus for generating at least one output data set from at least one input data set for use in association with a data mining process, the apparatus comprising:
a memory; and
at least one processor coupled to the memory and operative to:
(i) select at least one relevant attribute of the at least one input data set through determination of at least one relevance coefficient and (ii) generate the at least one output data set from the at least one input data set wherein the output data set comprises the at least one relevant attribute of the at least one input data set, as determined by use of the at least one relevance coefficient.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 22
21. The method of claim 21, wherein the at least one distance function is more strongly weighted by features which have at least one relevance coefficient.
23. An article of manufacture for generating at least one output data set from at least one input data set for use in association with a data mining process, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
selecting at least one relevant attribute of the at least one input data set through determination of at least one relevance coefficient and generating the at least one output data set from the at least one input data set wherein the at least one output data set comprises only the most relevant attributes of the at least one input data set, as determined by use of the at least one relevance coefficient.
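Claim 21 above refers to a distance function weighted more strongly by features that have a relevance coefficient. One possible reading is sketched below, under stated assumptions: the claims listed here do not specify the distance function, so the sketch uses a weighted Euclidean distance in which each attribute's squared difference is scaled by its coefficient. The name `weighted_distance` and the coefficient values are hypothetical.

```python
import math


def weighted_distance(a, b, coeffs):
    """Weighted Euclidean distance; an attribute with a higher coefficient counts more.

    Assumption: the weights are the relevance coefficients themselves.
    """
    return math.sqrt(sum(w * (a[n] - b[n]) ** 2 for n, w in coeffs.items()))


# Two hypothetical records and illustrative coefficients.
a = {"age": 0, "zip_digit": 0}
b = {"age": 3, "zip_digit": 4}
d_equal = weighted_distance(a, b, {"age": 1.0, "zip_digit": 1.0})
d_age_only = weighted_distance(a, b, {"age": 1.0, "zip_digit": 0.0})
```

With a zero weight on `zip_digit`, only the `age` difference contributes to the distance, so an attribute judged irrelevant does not influence which records are treated as neighbors.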
Specification