Privacy Preserving Statistical Analysis on Distributed Databases

US 20140281572A1
Filed: 03/14/2013
Published: 09/18/2014
Est. Priority Date: 03/14/2013
Status: Active Grant

Abstract

Aggregate statistics are securely determined on private data by first sampling independent first and second data at one or more clients to obtain sampled data, wherein a sampling parameter is substantially smaller than a length of the data. The sampled data are encrypted to obtain encrypted data, which are then combined. The combined encrypted data are randomized to obtain randomized data. At an authorized third-party processor, a joint distribution of the first and second data is estimated from the randomized encrypted data, such that a differential privacy requirement on the first and second data is satisfied.
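
The abstract does not spell out the differential privacy requirement. As a point of reference only, the block below states the standard ε-differential-privacy condition and the per-element guarantee of a symmetric k-ary randomized response of the kind named in the dependent claims. The symbols M (mechanism), d and d' (neighboring data sets), S (output set), p (probability of keeping a value), and k (alphabet size) are assumptions introduced here, and the patent may use a different or relaxed definition.

```latex
\Pr[M(d) \in S] \le e^{\varepsilon} \, \Pr[M(d') \in S]
\quad \text{for all neighboring } d, d' \text{ and all output sets } S,
\qquad
\varepsilon_{\mathrm{RR}} = \ln \frac{p\,(k-1)}{1-p}
\quad \left(p \ge \tfrac{1}{k}\right).
```

For example, with k = 4 and p = 0.8 this gives ε = ln 12, roughly 2.48, per randomized element.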

12 Claims

1. A method for securely determining aggregate statistics on private data, comprising the steps of:
- sampling, at one or more clients, data $X^n$ and $Y^n$ to obtain sampled data $\tilde{X}^m$ and $\tilde{Y}^m$, wherein m is a sampling parameter substantially smaller than a length n of the data;
  
  encrypting the sampled data $\tilde{X}^m$ and $\tilde{Y}^m$ to obtain encrypted data $\check{X}^m$ and $\check{Y}^m$;
  
  combining the encrypted data $\check{X}^m$ and $\check{Y}^m$ to obtain combined encrypted data;
  
  randomizing the combined encrypted data to obtain randomized data $X^m$, $Y^m$;
  
  estimating, at an authorized third-party processor, a joint distribution $\hat{T}_{X^n,Y^n}$ of the data $X^n$ and $Y^n$ from the randomized encrypted data $X^m$, $Y^m$, such that a differential privacy requirement on the data $X^n$ and $Y^n$ is satisfied.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 11, 12):
  - 2. The method of claim 1, wherein the sampling and encrypting of the data $X^n$ and $Y^n$ is performed by the one or more client processors, the combining and randomizing are performed by the server processor, and the estimating is performed by the authorized third-party processor.
  - 3. The method of claim 1, wherein the encryption is performed before the sampling at the client processor.
  - 4. The method of claim 1, wherein the randomizing is performed by the client processor.
  - 5. The method of claim 1, wherein the encrypted data $\check{X}^m$ and $\check{Y}^m$ is obtained from the sampled data $\tilde{X}^m$ and $\tilde{Y}^m$ using a stream cipher, and decryption parameters for the stream cipher are provided to the authorized third-party processor.
  - 6. The method of claim 1, wherein the randomized data $X^m$, $Y^m$ is obtained from the encrypted data $\check{X}^m$ and $\check{Y}^m$ using a randomized response mechanism.
  - 7. The method of claim 5, wherein the estimating further comprises:
    - reversing, at the authorized third-party processor, the encryption applied to the randomized data $X^m$, $Y^m$, using the decryption parameters provided to the authorized third-party processor by the one or more client processors.
  - 8. The method of claim 1, wherein the one or more client processors perform the sampling, encrypting and randomizing, and the third-party processor performs the combining and estimating.
  - 9. The method of claim 1, wherein the sampling, randomizing and encrypting can be in any order as long as the encrypting parameters are determined by one or more client processors, and the decrypting parameters are provided to the authorized third-party processor.
  - 11. The method of claim 1, wherein the randomized data is obtained from the encrypted data using a randomized response mechanism, wherein the randomized response mechanism further comprises:
    - independently altering each data element to any other value in the alphabet set with the same probability, or alternatively retaining the value of that element.
  - 12. The method of claim 1, wherein one or more client processors perform the encrypting, and the server processor performs the combining, sampling, and randomizing.
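
The randomized response mechanism of claims 6 and 11 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the `keep_prob` parameter, and the choice to undo the noise by inverting the symmetric channel are all assumptions made for illustration, and the sketch works on a single generic alphabet rather than on encrypted data.

```python
import random
from collections import Counter


def randomized_response(values, alphabet, keep_prob, rng):
    """Retain each element with probability keep_prob; otherwise replace it
    with one of the other alphabet symbols, each chosen with equal probability."""
    out = []
    for v in values:
        if rng.random() < keep_prob:
            out.append(v)
        else:
            others = [a for a in alphabet if a != v]
            out.append(rng.choice(others))
    return out


def estimate_distribution(randomized, alphabet, keep_prob):
    """Estimate the pre-randomization distribution (hypothetical estimator).

    For a symmetric channel the observed frequency q(a) satisfies
    q(a) = flip + (keep_prob - flip) * p(a), which is solved for p(a).
    Requires keep_prob != 1 / len(alphabet).
    """
    k = len(alphabet)
    flip = (1.0 - keep_prob) / (k - 1)  # probability of each specific other value
    counts = Counter(randomized)
    n = len(randomized)
    return {a: (counts[a] / n - flip) / (keep_prob - flip) for a in alphabet}


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = ["a", "b", "c"]
    data = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 1000
    noisy = randomized_response(data, alphabet, 0.7, rng)
    print(estimate_distribution(noisy, alphabet, 0.7))
```

The estimates always sum to one but individual entries can be slightly negative for small samples; a practical estimator would probably clip or project them, which this sketch leaves out.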

10. A method for securely determining aggregate statistics on private data, comprising the steps of:
- sampling, at one or more client processors, first data and second data to obtain sampled data, wherein a sampling parameter is substantially smaller than a length of the data;
  
  encrypting the sampled data to obtain encrypted data;
  
  combining the encrypted data to obtain combined encrypted data;
  
  randomizing the combined encrypted data to obtain randomized data;
  
  estimating, at an authorized third-party processor, a joint distribution of the first data and the second data from the randomized encrypted data such that a differential privacy requirement of the first data and the second data is satisfied.
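
The five claimed steps can be sketched end to end in Python under several loudly stated assumptions that the claims themselves do not fix: symbols are integers 0..K-1, the stream cipher is additive modulo K, both data sets are sampled at the same positions, and the randomization is a symmetric shift so that it commutes with decryption. `random.Random` stands in for a real keystream generator and is not cryptographically secure. All names (`keystream`, `run_protocol`, and so on) are hypothetical.

```python
import random
from collections import Counter

K = 4  # hypothetical alphabet size; symbols are the integers 0..K-1


def keystream(seed, m):
    """Placeholder keystream. random.Random is NOT cryptographically secure."""
    ks = random.Random(seed)
    return [ks.randrange(K) for _ in range(m)]


def encrypt(symbols, key):
    """Additive stream cipher over the alphabet (an assumption of this sketch)."""
    return [(s + v) % K for s, v in zip(symbols, key)]


def decrypt(symbols, key):
    return [(s - v) % K for s, v in zip(symbols, key)]


def randomize(symbols, keep_prob, rng):
    """Keep each symbol with keep_prob, else shift it by a uniform nonzero offset."""
    out = []
    for s in symbols:
        if rng.random() < keep_prob:
            out.append(s)
        else:
            out.append((s + rng.randrange(1, K)) % K)
    return out


def invert_axis(vec, keep_prob):
    """Undo the symmetric noise along one axis of the observed distribution."""
    flip = (1.0 - keep_prob) / (K - 1)
    total = sum(vec)
    return [(v - flip * total) / (keep_prob - flip) for v in vec]


def estimate_joint(pairs, keep_prob):
    """Estimate the K x K joint distribution from independently randomized pairs."""
    counts = Counter(pairs)
    q = [[counts[(x, y)] / len(pairs) for y in range(K)] for x in range(K)]
    rows = [invert_axis(row, keep_prob) for row in q]             # undo noise on Y
    cols = [invert_axis(list(c), keep_prob) for c in zip(*rows)]  # undo noise on X
    return [[cols[y][x] for y in range(K)] for x in range(K)]


def run_protocol(x, y, m, keep_prob, seed=0):
    rng = random.Random(seed)
    # Clients: sample m shared positions (assumed), then encrypt with their own keys.
    idx = sorted(rng.sample(range(len(x)), m))
    key_x, key_y = keystream(seed + 1, m), keystream(seed + 2, m)
    enc_x = encrypt([x[i] for i in idx], key_x)
    enc_y = encrypt([y[i] for i in idx], key_y)
    # Server: combine the ciphertext streams and randomize them.
    combined = list(zip(enc_x, enc_y))
    rand_x = randomize([cx for cx, _ in combined], keep_prob, rng)
    rand_y = randomize([cy for _, cy in combined], keep_prob, rng)
    # Third party: decrypt with the clients' keys, then estimate.
    pairs = list(zip(decrypt(rand_x, key_x), decrypt(rand_y, key_y)))
    return estimate_joint(pairs, keep_prob)
```

Because the shift and the key are both added modulo K, decrypting a randomized ciphertext gives the same result as randomizing the plaintext, which is why in this sketch the server can add noise without ever seeing the data. Calling `run_protocol(x, y, m=2000, keep_prob=0.8)` on two equal-length lists of symbols returns a K-by-K list of estimated joint probabilities.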

Current Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Original Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Inventors: Wang, Ye; Lin, Bing-Rong; Rane, Shantanu

Granted Patent: US 10,146,958 B2
Field of Search
US Class Current: 713/189
CPC Class Codes:
G06F 21/6254   by anonymising data, e.g. d...
H04L 2209/42   Anonymization, e.g. involvi...
H04L 9/00   Cryptographic mechanisms or...
H04L 9/0656   Pseudorandom key sequence c...
