System and method for distributed privacy preserving data mining
Abstract
Distributed privacy preserving data mining techniques are provided. A first entity of a plurality of entities in a distributed computing environment exchanges summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to data stored at the entity. The first entity may then mine data based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol. The first entity may obtain, from the second entity via the privacy-preserving data sharing protocol, information relating to the number of transactions in which a particular itemset occurs and/or information relating to the number of transactions in which a particular rule is satisfied.
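The summary information named in the abstract (the number of transactions in which a particular itemset occurs) can be illustrated with a minimal Python sketch. It assumes the usual association-rule-mining reading of "large itemset", namely an itemset whose combined support meets a minimum threshold; this excerpt does not define the term. The function names, the sample transactions, and the `min_support` parameter are all hypothetical. The two counts are added in the clear here purely to show the arithmetic; in the described technique they would be exchanged through the privacy-preserving data sharing protocol instead.

```python
from typing import FrozenSet, Iterable

Transaction = FrozenSet[str]


def local_support_count(transactions: Iterable[Transaction],
                        itemset: FrozenSet[str]) -> int:
    """Count the local transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def is_large(combined_count: int, combined_total: int,
             min_support: float) -> bool:
    """Assumed test: the combined support fraction meets the threshold."""
    return combined_total > 0 and combined_count / combined_total >= min_support


# Hypothetical transactional data stored at two entities.
entity_a = [
    frozenset({"bread", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk", "eggs"}),
]
entity_b = [
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk"}),
]

candidate = frozenset({"bread", "milk"})

# Each entity computes its own summary information locally.
count_a = local_support_count(entity_a, candidate)
count_b = local_support_count(entity_b, candidate)

# Plain addition stands in for the privacy-preserving exchange.
combined_count = count_a + count_b
combined_total = len(entity_a) + len(entity_b)
large = is_large(combined_count, combined_total, min_support=0.5)
```

With this sample data the candidate appears in 3 of 5 transactions, so it passes a 0.5 threshold and would fail a 0.7 threshold.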
19 Claims
1. A method of determining at least one large itemset in a privacy-preserving manner in a distributed computing environment including a plurality of entities, comprising the steps of:
a first entity of the plurality of entities exchanging summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to transactional data stored at the entity; and
the first entity determining at least one large itemset based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol;
wherein the summary information exchanging step further comprises:
the first entity transmitting a first random number to the second entity;
the first entity receiving from the second entity a first result, the first result representing a summation of the first random number and a second random number associated with the second entity;
the first entity transmitting a second result to the second entity, the second result representing a summation of the first result and summary information relating to data stored at the first entity;
the first entity receiving a third result from the second entity, the third result representing a summation of the second result and summary information relating to data stored at the second entity;
the first entity transmitting a fourth result to the second entity, the fourth result representing a subtraction of the first random number from the third result; and
the first entity receiving a fifth result from the second entity, the fifth result representing a subtraction of the second random number from the fourth result.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. Apparatus associated with a first entity in a distributed computing environment, including a plurality of entities, for determining at least one large itemset in a privacy-preserving manner, comprising:
a memory; and
at least one processor coupled to the memory and operative to:
(i) exchange summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to transactional data stored at the entity; and
(ii) determine at least one large itemset based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol;
wherein the summary information exchanging operation further comprises:
the first entity transmitting a first random number to the second entity;
the first entity receiving from the second entity a first result, the first result representing a summation of the first random number and a second random number associated with the second entity;
the first entity transmitting a second result to the second entity, the second result representing a summation of the first result and summary information relating to data stored at the first entity;
the first entity receiving a third result from the second entity, the third result representing a summation of the second result and summary information relating to data stored at the second entity;
the first entity transmitting a fourth result to the second entity, the fourth result representing a subtraction of the first random number from the third result; and
the first entity receiving a fifth result from the second entity, the fifth result representing a subtraction of the second random number from the fourth result.
Dependent claims: 14, 15, 16, 17, 18
19. An article of manufacture for use with a first entity in a distributed computing environment, including a plurality of entities, for determining at least one large itemset in a privacy-preserving manner, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
the first entity exchanging summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to transactional data stored at the entity; and
the first entity determining at least one large itemset based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol;
wherein the summary information exchanging step further comprises:
the first entity transmitting a first random number to the second entity;
the first entity receiving from the second entity a first result, the first result representing a summation of the first random number and a second random number associated with the second entity;
the first entity transmitting a second result to the second entity, the second result representing a summation of the first result and summary information relating to data stored at the first entity;
the first entity receiving a third result from the second entity, the third result representing a summation of the second result and summary information relating to data stored at the second entity;
the first entity transmitting a fourth result to the second entity, the fourth result representing a subtraction of the first random number from the third result; and
the first entity receiving a fifth result from the second entity, the fifth result representing a subtraction of the second random number from the fourth result.
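The exchange recited in the independent claims can be traced with a short Python sketch. It only follows the arithmetic of the recited messages inside a single process, with no networking, and it makes no claim about the security properties of the exchange. The function name `exchange_sum`, the `bound` on the random numbers, and the sample summary values are assumptions made for illustration and are not taken from the claims.

```python
import secrets
from typing import List, Tuple


def exchange_sum(summary_a: int, summary_b: int,
                 bound: int = 2**64) -> Tuple[List[int], int]:
    """Trace the recited messages between a first and a second entity.

    Returns the list of intermediate messages and the fifth result.
    """
    # Hypothetical choice: each random number is drawn below `bound`.
    r1 = secrets.randbelow(bound)  # first random number (first entity)
    r2 = secrets.randbelow(bound)  # second random number (second entity)

    sent_r1 = r1                  # first -> second: first random number
    first = sent_r1 + r2          # second -> first: r1 + r2
    second = first + summary_a    # first -> second: adds first entity's summary
    third = second + summary_b    # second -> first: adds second entity's summary
    fourth = third - r1           # first -> second: first random number removed
    fifth = fourth - r2           # second -> first: second random number removed

    return [sent_r1, first, second, third, fourth], fifth


# Hypothetical local support counts held by the two entities.
messages, total = exchange_sum(summary_a=17, summary_b=25)
```

Because both random numbers are added once and subtracted once, the fifth result equals the sum of the two summary values whatever random numbers are drawn.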
Specification