Privacy preserving statistical analysis for distributed databases
First Claim
1. A method for securely determining aggregate statistics on private data, comprising the steps of:
- randomizing, in a client, firstly and independently data X and Y to obtain randomized data $\hat{X}$ and $\hat{Y}$, respectively, wherein the randomizing firstly preserves a privacy of the data X and Y, wherein the randomizing operates directly on the data X and Y, wherein the data X are produced by a first data source, and the data Y are produced by a second data source, and the data X and Y are produced independently in a distributed manner;
- randomizing, in the client, secondly independently the randomized data $\hat{X}$ and $\hat{Y}$ to obtain randomized data $\tilde{X}$ and $\tilde{Y}$ for a server, and helper information $T_{\tilde{X}|\hat{X}}$ and $T_{\tilde{Y}|\hat{Y}}$ for the client, respectively, wherein T represents an empirical distribution, and wherein the randomizing secondly preserves the privacy of the aggregate statistics of the data X and Y;
- determining, at the server, $T_{\tilde{X},\tilde{Y}}$;
- applying, by the client, the helper information $T_{\tilde{X}|\hat{X}}$ and $T_{\tilde{Y}|\hat{Y}}$ to $T_{\tilde{X},\tilde{Y}}$ to obtain an estimated $\dot{T}_{X,Y}$, wherein “|” and “,” between X and Y represent a conditional and joint distribution, respectively.
Abstract
Aggregate statistics are determined by first randomizing independently data X and Y to obtain randomized data $\hat{X}$ and $\hat{Y}$. The first randomizing preserves the privacy of the data X and Y. Then, the randomized data $\hat{X}$ and $\hat{Y}$ are randomized a second time to obtain randomized data $\tilde{X}$ and $\tilde{Y}$ for a server, and helper information $T_{\tilde{X}|\hat{X}}$ and $T_{\tilde{Y}|\hat{Y}}$ for a client, wherein T represents an empirical distribution, and wherein the second randomizing preserves the privacy of the aggregate statistics of the data X and Y. The server then determines $T_{\tilde{X},\tilde{Y}}$. Last, the client applies the helper information $T_{\tilde{X}|\hat{X}}$ and $T_{\tilde{Y}|\hat{Y}}$ to $T_{\tilde{X},\tilde{Y}}$ to obtain an estimated $\dot{T}_{X,Y}$, where “|” and “,” between X and Y represent a conditional and joint distribution, respectively.
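The data flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes binary data, uses simple randomized response (independent bit flipping) as a stand-in for both randomization stages, and assumes the first-stage flip probability is known to the client so that stage can also be inverted. All function names and the parameters `p1`, `p2`, `n`, and `seed` are hypothetical. Because the empirical conditionals of the two sources only approximately factorize, the result is an estimate of the joint distribution of X and Y, not an exact recovery.

```python
import random
from collections import Counter

def flip(bits, keep_prob, rng):
    """Randomized response (assumed mechanism): keep each bit with keep_prob, else flip."""
    return [b if rng.random() < keep_prob else 1 - b for b in bits]

def joint_type(a, b):
    """Empirical joint distribution: T[i][j] = fraction of positions where (a, b) == (i, j)."""
    counts, n = Counter(zip(a, b)), len(a)
    return [[counts[(i, j)] / n for j in (0, 1)] for i in (0, 1)]

def conditional_type(out, inp):
    """Empirical conditional distribution: C[o][i] = fraction of inp == i positions with out == o."""
    counts, totals = Counter(zip(out, inp)), Counter(inp)
    return [[counts[(o, i)] / totals[i] for i in (0, 1)] for o in (0, 1)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in (0, 1)) for j in (0, 1)] for i in (0, 1)]

def transpose(m):
    return [[m[j][i] for j in (0, 1)] for i in (0, 1)]

def undo(channel_x, channel_y, joint):
    """Solve joint ~= channel_x * J * channel_y^T for J."""
    return matmul(matmul(inv2(channel_x), joint), transpose(inv2(channel_y)))

def run_demo(n=200_000, seed=0, p1=0.9, p2=0.8):
    rng = random.Random(seed)
    # Hypothetical private data: Y agrees with X 80% of the time.
    x = [rng.randint(0, 1) for _ in range(n)]
    y = [xi if rng.random() < 0.8 else 1 - xi for xi in x]

    # First randomization, applied independently to each source.
    x_hat, y_hat = flip(x, p1, rng), flip(y, p1, rng)
    # Second randomization: outputs go to the server ...
    x_til, y_til = flip(x_hat, p2, rng), flip(y_hat, p2, rng)
    # ... while the empirical conditionals stay with the client as helper information.
    helper_x = conditional_type(x_til, x_hat)
    helper_y = conditional_type(y_til, y_hat)

    # Server side: joint empirical distribution of the doubly randomized data.
    server_joint = joint_type(x_til, y_til)

    # Client side: apply the helper information, then (assumption) invert the known first stage.
    est_hat = undo(helper_x, helper_y, server_joint)
    first = [[p1, 1 - p1], [1 - p1, p1]]
    estimate = undo(first, first, est_hat)
    return joint_type(x, y), estimate

if __name__ == "__main__":
    true_joint, estimate = run_demo()
    print("true joint:     ", true_joint)
    print("estimated joint:", estimate)
```

In this sketch the server only ever sees the doubly randomized sequences, and the helper matrices needed to undo the second stage never leave the client. Matrix inversion amplifies sampling noise, so the estimate degrades as the flip probabilities approach 0.5 or as `n` shrinks.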
21 Citations
7 Claims
Specification