PRESERVING GEOMETRIC PROPERTIES OF DATASETS WHILE PROTECTING PRIVACY
First Claim
1. A method comprising:
receiving a dataset by a computing device;
applying a transformation to the dataset by the computing device to generate a transformed dataset;
adding noise to the transformed dataset by the computing device; and
providing the transformed dataset with the added noise by the computing device.
Abstract
The privacy of a dataset is protected. A private dataset is received that includes multiple rows of multidimensional data. Each row may correspond to a user, and each dimension may be an attribute of the user. A projection matrix is applied to each row to generate a lower dimensional sketch of the row. Noise is added to each of the lower dimensional sketches. The sketches with the added noise may be published together with the projection matrix. The sketches preserve geometric relationships of the original dataset including clustering, distances, and nearest neighbor, and therefore may be useful for data mining purposes while still protecting the privacy of the users.
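The project-then-perturb step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the Gaussian projection matrix, the Gaussian noise, and the names `make_projection`, `sketch_row`, `publish`, `k`, and `noise_std` are all assumptions introduced here for illustration, and the noise level is not calibrated to any particular privacy guarantee.

```python
import math
import random


def make_projection(d, k, rng):
    """Return a d x k matrix with N(0, 1/k) entries (an assumed choice of projection)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]


def sketch_row(row, projection, noise_std, rng):
    """Project one d-dimensional row to k dimensions, then add independent noise."""
    k = len(projection[0])
    projected = [
        sum(x * p_row[j] for x, p_row in zip(row, projection))
        for j in range(k)
    ]
    # Assumed noise model: zero-mean Gaussian with standard deviation noise_std.
    return [v + rng.gauss(0.0, noise_std) for v in projected]


def publish(dataset, k, noise_std, seed=0):
    """Return the noisy sketches together with the projection matrix."""
    rng = random.Random(seed)
    d = len(dataset[0])
    projection = make_projection(d, k, rng)
    sketches = [sketch_row(row, projection, noise_std, rng) for row in dataset]
    return sketches, projection
```

For example, `publish(rows, k=10, noise_std=0.5)` on a list of 50-dimensional rows would return one 10-dimensional noisy sketch per row plus the 50 x 10 projection matrix. Because the same matrix is applied to every row, differences between rows are projected consistently, which is what allows distances to be roughly preserved before the noise is added.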
20 Claims
1. A method comprising:
receiving a dataset by a computing device;
applying a transformation to the dataset by the computing device to generate a transformed dataset;
adding noise to the transformed dataset by the computing device; and
providing the transformed dataset with the added noise by the computing device.
Dependent claims: 2, 3, 4, 5
6. A method comprising:
receiving a dataset by a computing device, wherein the dataset comprises a plurality of rows and each row has a first number of dimensions;
for each row of the dataset, generating a sketch from the row by the computing device, wherein the sketch has a second number of dimensions that is less than the first number of dimensions;
for each sketch, adding noise to the sketch by the computing device; and
providing the generated sketches with the added noise by the computing device.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A system comprising:
a dataset provider that generates a dataset, wherein the dataset comprises a plurality of rows and each row has a first number of dimensions; and
a privacy protector that:
receives the generated dataset; and
for each row of the generated dataset, generates a sketch from the row, wherein the sketch has a second number of dimensions that is less than the first number of dimensions; and
publishes the generated sketches.
Dependent claims: 18, 19, 20
Specification