Private clustering and statistical queries while analyzing a large database
Abstract
A database has a plurality of entries and a plurality of attributes common to each entry, where each entry corresponds to an individual. A query is received from a querying entity and is passed to the database, and an answer is received in response. An amount of noise is generated and added to the answer to result in an obscured answer, and the obscured answer is returned to the querying entity. The noise is normally distributed around zero with a particular variance R. The variance R may be determined in accordance with R > 8 T log²(T/δ)/ε², where T is the permitted number of queries, δ is the utter failure probability, and ε is the largest admissible increase in confidence. Thus, a level of protection of privacy is provided to each individual represented within the database. Example noise generation techniques, systems, and methods may be used for privacy preservation in such areas as k-means, principal component analysis, statistical query learning models, and perceptron algorithms.
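The noise step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `min_variance` and `obscure`, the parameter values, and the 1% safety margin are hypothetical, and the logarithm in the bound is assumed to be the natural logarithm, which the abstract does not specify.

```python
import math
import random


def min_variance(T, delta, epsilon):
    """Bound 8*T*log^2(T/delta)/epsilon^2 from the abstract.

    Assumption: 'log' is the natural logarithm.
    """
    return 8.0 * T * math.log(T / delta) ** 2 / epsilon ** 2


def obscure(answer, R, rng=random):
    """Add zero-mean normal noise with variance R to a true answer."""
    return answer + rng.gauss(0.0, math.sqrt(R))


# Hypothetical parameters, chosen only for illustration.
T = 1000          # permitted number of queries
delta = 1e-6      # utter failure probability
epsilon = 0.5     # largest admissible increase in confidence

# The abstract requires R to be strictly greater than the bound,
# so a small (arbitrary) margin is applied here.
R = 1.01 * min_variance(T, delta, epsilon)

true_answer = 4200.0  # hypothetical exact answer returned by the database
obscured_answer = obscure(true_answer, R, random.Random(0))
```

Because the standard deviation of the noise grows roughly with the square root of T, a sketch like this only yields useful answers when the true answers are large compared with that noise, which is consistent with the large-database setting in the title.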
20 Claims
1. A method for preserving privacy of data in a database, comprising:
providing a query to a database;
receiving an answer from the database in response to the query;
generating an amount of noise using a centered normal distribution, defined by the variance R; and
adding the amount of noise to the answer to result in an obscured answer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 20
11. A method of preserving privacy, comprising:
receiving a portion of an application that references a database;
reformulating the portion of the application as a query; and
running the query pursuant to a sub-linear queries (SuLQ) primitive.
Dependent claims: 12, 13, 14, 15, 16
17. A computer readable medium having computer executable instructions stored thereon for performing a method comprising:
receiving an answer in response to the query;
generating an amount of noise using a centered normal distribution, defined by the variance R; and
adding the amount of noise to the answer to result in an obscured answer.
Dependent claims: 18, 19
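To make claim 11 more concrete, the following Python sketch shows one possible reading of reformulating part of an application as queries to a SuLQ-style primitive. Everything here is an assumption for illustration: the class `SuLQ`, the helper `noisy_mean`, the interpretation of the primitive as a noisy sum of a per-row function with values in [0, 1], and the enforcement of a budget of T queries are not taken from the claims themselves.

```python
import math
import random


class SuLQ:
    """Hypothetical SuLQ-style primitive: noisy sums under a query budget."""

    def __init__(self, rows, R, T, seed=None):
        self.rows = rows            # database entries, one per individual
        self.R = R                  # noise variance
        self.remaining = T          # permitted number of queries
        self.rng = random.Random(seed)

    def query(self, g):
        """Return sum of g(row) over all rows plus N(0, R) noise.

        Assumption: g maps each row to a value in [0, 1].
        """
        if self.remaining <= 0:
            raise RuntimeError("query budget exhausted")
        self.remaining -= 1
        exact = sum(g(row) for row in self.rows)
        return exact + self.rng.gauss(0.0, math.sqrt(self.R))


def noisy_mean(db, in_group, value):
    """Reformulate 'mean of value over a group' as two noisy queries."""
    count = db.query(lambda r: 1.0 if in_group(r) else 0.0)
    total = db.query(lambda r: value(r) if in_group(r) else 0.0)
    return total / count if count > 0 else float("nan")
```

A group mean of this kind is the sort of step a k-means update needs: the application never reads individual rows, only the obscured count and the obscured sum, and each call consumes part of the query budget.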
Specification