Privacy preserving statistical analysis for distributed databases

US 8,893,292 B2
Filed: 11/14/2012
Issued: 11/18/2014
Est. Priority Date: 11/14/2012
Status: Active Grant

First Claim

1. A method for securely determining aggregate statistics on private data, comprising the steps of:

randomizing, in a client, firstly and independently data X and Y to obtain randomized data X̂ and Ŷ, respectively, wherein the randomizing firstly preserves a privacy of the data X and Y, wherein the randomizing operates directly on the data X and Y, wherein the data X are produced by a first data source, and the data Y are produced by a second data source, and the data X and Y are produced independently in a distributed manner;

randomizing, in the client, secondly independently the randomized data X̂ and Ŷ to obtain randomized data X̃ and Ỹ for a server, and helper information T_{X̃|X̂} and T_{Ỹ|Ŷ} for the client, respectively, wherein T represents an empirical distribution, and wherein the randomizing secondly preserves the privacy of the aggregate statistics of the data X and Y;

determining, at the server, T_{X̃,Ỹ};

applying, by the client, the helper information T_{X̃|X̂} and T_{Ỹ|Ŷ} to T_{X̃,Ỹ} to obtain an estimated Ṫ_{X,Y}, wherein “|” and “,” between X and Y represent a conditional and joint distribution, respectively.
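The steps of claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: both randomizations are assumed to be PRAM-style symmetric channels over a three-symbol alphabet, the first-stage channel matrix is assumed to be known to the client, and the function names (`symmetric_channel`, `pram`, `joint_type`, `conditional_type`) as well as the matrix-inversion estimator are hypothetical choices made here for illustration.

```python
import numpy as np

def symmetric_channel(k, keep):
    """Row-stochastic k x k matrix: keep a symbol with prob. `keep`,
    otherwise replace it with one of the other k-1 symbols uniformly."""
    off = (1.0 - keep) / (k - 1)
    return np.full((k, k), off) + (keep - off) * np.eye(k)

def pram(values, channel, rng):
    """Replace each symbol v independently by a draw from row channel[v]."""
    cdf = np.cumsum(channel, axis=1)
    u = rng.random(values.shape[0])
    out = (u[:, None] > cdf[values]).sum(axis=1)
    return np.minimum(out, channel.shape[0] - 1)  # guard against round-off

def joint_type(a, b, k):
    """Empirical joint distribution T_{a,b} as a k x k matrix."""
    counts = np.bincount(a * k + b, minlength=k * k).reshape(k, k)
    return counts / a.shape[0]

def conditional_type(given, out, k):
    """Empirical conditional distribution T_{out|given}; row = value of `given`."""
    counts = np.bincount(given * k + out, minlength=k * k).reshape(k, k).astype(float)
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)

rng = np.random.default_rng(0)
k, n = 3, 200_000

# Assumed toy joint distribution for the private pairs (X_i, Y_i).
p_xy = np.array([[0.30, 0.05, 0.05],
                 [0.05, 0.20, 0.05],
                 [0.05, 0.05, 0.20]])
idx = rng.choice(k * k, size=n, p=p_xy.ravel())
x, y = idx // k, idx % k

a1 = symmetric_channel(k, 0.8)  # first-stage channel (assumed known to the client)
a2 = symmetric_channel(k, 0.7)  # second-stage channel

# Client: first randomization, applied independently to X and Y.
x_hat, y_hat = pram(x, a1, rng), pram(y, a1, rng)
# Client: second randomization; the outputs go to the server.
x_til, y_til = pram(x_hat, a2, rng), pram(y_hat, a2, rng)
# Client keeps the helper information T_{X~|X^} and T_{Y~|Y^}.
h_x = conditional_type(x_hat, x_til, k)
h_y = conditional_type(y_hat, y_til, k)

# Server: joint type of the doubly randomized data.
t_tilde = joint_type(x_til, y_til, k)

# Client: undo the second stage with the helper information,
# then undo the first stage with the known channel matrix.
t_hat_est = np.linalg.inv(h_x.T) @ t_tilde @ np.linalg.inv(h_y)
t_est = np.linalg.inv(a1.T) @ t_hat_est @ np.linalg.inv(a1)

t_true = joint_type(x, y, k)
print(np.round(t_est, 3))
print(np.round(t_true, 3))
```

In this sketch the helper information is only two k x k matrices, which is small relative to the n-sample sequences; the estimate approaches the true joint type as n grows, and matrix inversion can amplify sampling noise when the channels are close to uniform.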

1 Assignment

0 Petitions

Abstract

Aggregate statistics are determined by first randomizing independently data X and Y to obtain randomized data X̂ and Ŷ. The first randomizing preserves the privacy of the data X and Y. Then, the randomized data X̂ and Ŷ are randomized secondly to obtain randomized data X̃ and Ỹ for a server, and helper information T_{X̃|X̂} and T_{Ỹ|Ŷ} for a client, wherein T represents an empirical distribution, and wherein the randomizing secondly preserves the privacy of the aggregate statistics of the data X and Y. The server then determines T_{X̃,Ỹ}. Last, the client applies the helper information T_{X̃|X̂} and T_{Ỹ|Ŷ} to T_{X̃,Ỹ} to obtain an estimated Ṫ_{X,Y}, where “|” and “,” between X and Y represent a conditional and joint distribution, respectively.

21 Citations

7 Claims

2. The method of claim 1, wherein the randomizing uses a Post RAndomisation Method (PRAM).
3. The method of claim 1, wherein the randomizing firstly and secondly are different.
4. The method of claim 1, wherein the helper information is small compared to the data X and Y.
5. The method of claim 1, wherein data X and Y are random sequences, and data pairs (X_i, Y_i) are independently and identically distributed.
6. The method of claim 1, wherein the randomizing preserves differential and distributional privacy of the data X and Y.
7. The method of claim 1, wherein the randomizing secondly provides distributional privacy that is stronger than the differential privacy provided by the randomizing firstly.
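
The dependent claims refer to PRAM and to differential privacy without giving formulas in this excerpt. As a hedged illustration only, the sketch below computes the standard local differential privacy level ε of a PRAM-style randomization matrix, that is, the largest log-ratio between the probabilities that two different inputs produce the same output. The helper name `pram_epsilon` and both example matrices are assumptions made here; the patent's own privacy definitions (in particular distributional privacy) may differ and are not modeled.

```python
import numpy as np

def pram_epsilon(channel):
    """Smallest eps with channel[x, o] <= exp(eps) * channel[x2, o]
    for all inputs x, x2 and outputs o (rows = inputs, columns = outputs)."""
    channel = np.asarray(channel, dtype=float)
    if np.any(channel <= 0.0):
        return float("inf")  # a zero entry allows some inputs to be ruled out
    ratios = channel.max(axis=0) / channel.min(axis=0)
    return float(np.log(ratios.max()))

# Hypothetical channels: a smaller eps means the output reveals less about the input.
mild = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.1, 0.8]])
heavy = np.array([[0.6, 0.2, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.2, 0.2, 0.6]])

eps_mild = pram_epsilon(mild)    # log(0.8 / 0.1) = log 8
eps_heavy = pram_epsilon(heavy)  # log(0.6 / 0.2) = log 3
print(eps_mild, eps_heavy)
```

Comparing two channels this way shows the usual trade-off: the heavier randomization has the smaller ε, at the cost of more noise to remove when the statistics are estimated.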

Current Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Original Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Inventors: Wang, Ye; Lin, Bing-Rong; Rane, Shantanu
Primary Examiner(s): Armouche, Hadi
Assistant Examiner(s): Khan, Sher A
Application Number: US13/676,528
Publication Number: US 20140137260A1
Time in Patent Office: 734 Days
Field of Search: 726/5, 726/6, 726/14, 726/18, 726/35, 726/26-30, 713/152, 713/153, 713/156, 713/171, 713/187, 713/193, 713/189, 713/161, 380/24, 380/25, 380/46, 380/49, 380/50, 380/28, 380/173, 380/278, 708/5, 708/8, 708/255, 708/270, 708/422, 708/441, 708/671, 708/200
US Class Current: 726/26
CPC Class Codes:
G06F 21/60   Protecting data
G06F 21/6245   Protecting personal data, e...
G06F 21/6254   by anonymising data, e.g. d...
