Privacy-Preserving Aggregated Data Mining

US 20140040172A1
Filed: 01/10/2013
Published: 02/06/2014
Est. Priority Date: 01/10/2012
Status: Active Grant

Abstract

An apparatus, system and method are introduced for preserving privacy of data in a dataset in a database with a number n of entries. In one embodiment, the apparatus includes memory including computer program code configured to, with a processor, cause the apparatus to form a random matrix of dimension m by n, wherein m is less than n, operate on the dataset with the random matrix to produce a compressed dataset, form a pseudoinverse of the random matrix, and operate on the dataset with the pseudoinverse of the random matrix to produce a decompressed dataset.
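
The four steps in the abstract can be sketched numerically. The following is a minimal illustration, not the patented implementation. It assumes the dataset is a single numeric column held as a vector, that the random matrix has Gaussian entries, and that the pseudoinverse is applied to the compressed data (the only shape-compatible reading, since the pseudoinverse of an m-by-n matrix is n-by-m). The function name `compress_decompress`, the `seed` parameter, and the sample values are hypothetical.

```python
import numpy as np


def compress_decompress(x, m, seed=0):
    """Return (A, y, x_hat) for a length-n data vector x, with m < n."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    if not 0 < m < n:
        raise ValueError("m must satisfy 0 < m < n")
    A = rng.standard_normal((m, n))  # random m-by-n matrix (Gaussian entries assumed)
    y = A @ x                        # compressed dataset, length m
    A_pinv = np.linalg.pinv(A)       # Moore-Penrose pseudoinverse, shape n-by-m
    x_hat = A_pinv @ y               # decompressed dataset, length n
    return A, y, x_hat


# Hypothetical numeric column with n = 6 entries, compressed to m = 3.
x = np.array([52.0, 61.0, 47.0, 58.0, 66.0, 49.0])
A, y, x_hat = compress_decompress(x, m=3)
print("original:    ", x)
print("decompressed:", np.round(x_hat, 2))
```

Because m is less than n, the compression discards information, so the decompressed vector generally differs from the original entry by entry while remaining consistent with the compressed data (`A @ x_hat` reproduces `y`).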

20 Claims

1. A method of preserving privacy of data in a dataset in a database with a number n of entries, comprising:
- forming a random matrix of dimension m by n, wherein m is less than n;
- operating on said dataset with said random matrix to produce a compressed dataset;
- forming a pseudoinverse of said random matrix; and
- operating on said dataset with said pseudoinverse of said random matrix to produce a decompressed dataset.

Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method as recited in claim 1 further comprising normalizing columns of said random matrix.
  - 3. The method as recited in claim 1 further comprising producing a result of a query on said decompressed dataset.
  - 4. The method as recited in claim 1 wherein elements of said random matrix are formed with random numbers independently generated using a Gaussian probability density.
  - 5. The method as recited in claim 1 wherein said pseudoinverse is a Moore-Penrose pseudoinverse.
  - 6. The method as recited in claim 1 further comprising forming another random matrix if aggregated information of said decompressed dataset is not satisfied.
  - 7. The method as recited in claim 1 further comprising forming another random matrix if a lower privacy bound condition of said decompressed dataset is not satisfied.
  - 8. The method as recited in claim 1 wherein all entries in said random matrix are non-negative.
  - 9. The method as recited in claim 1 wherein said random matrix is a contraction with a Euclidean norm less than unity.
  - 10. The method as recited in claim 1 wherein said dataset is publicly accessible over an Internet.
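
The matrix variants named in claims 2, 4, 8 and 9 can be illustrated with a few small helpers. This is a sketch under stated assumptions, not the patent's own construction: the helper names are hypothetical, taking absolute values of Gaussian draws is just one possible way to obtain non-negative entries, and the "Euclidean norm" of claim 9 is assumed here to mean the spectral norm (largest singular value).

```python
import numpy as np


def gaussian_matrix(m, n, rng):
    """m-by-n matrix with independent standard Gaussian entries (claim 4)."""
    return rng.standard_normal((m, n))


def normalize_columns(A):
    """Scale every column to unit Euclidean length (claim 2)."""
    return A / np.linalg.norm(A, axis=0, keepdims=True)


def nonnegative_matrix(m, n, rng):
    """Random matrix with non-negative entries (claim 8).

    Assumption: absolute values of Gaussian draws; the claim does not
    say how the non-negative entries are generated.
    """
    return np.abs(rng.standard_normal((m, n)))


def make_contraction(A, target_norm=0.9):
    """Rescale A so that its spectral norm equals target_norm < 1 (claim 9)."""
    return A * (target_norm / np.linalg.norm(A, 2))


rng = np.random.default_rng(0)
A = normalize_columns(gaussian_matrix(3, 8, rng))
C = make_contraction(gaussian_matrix(3, 8, rng))
print(np.linalg.norm(A, axis=0))  # column lengths after normalization
print(np.linalg.norm(C, 2))       # spectral norm after rescaling
```

Column normalization and rescaling to a contraction are shown as separate options; applying the second after the first would change the column lengths again.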

11. An apparatus operable to preserve privacy of data in a dataset in a database with a number n of entries, comprising:
- a processor, and
- memory including computer program code, said memory and said computer program code configured to, with said processor, cause said apparatus to perform at least the following:
  - form a random matrix of dimension m by n, wherein m is less than n,
  - operate on said dataset with said random matrix to produce a compressed dataset,
  - form a pseudoinverse of said random matrix, and
  - operate on said dataset with said pseudoinverse of said random matrix to produce a decompressed dataset.

Dependent claims (12, 13, 14, 15):
  - 12. The apparatus as recited in claim 11 wherein said memory and said computer program code are further configured to, with said processor, cause said apparatus to normalize columns of said random matrix.
  - 13. The apparatus as recited in claim 11 wherein said memory and said computer program code are further configured to, with said processor, cause said apparatus to produce a result of a query on said decompressed dataset.
  - 14. The apparatus as recited in claim 11 wherein said memory and said computer program code are further configured to, with said processor, cause said apparatus to form another random matrix if aggregated information of said decompressed dataset is not satisfied.
  - 15. The apparatus as recited in claim 11 wherein said memory and said computer program code are further configured to, with said processor, cause said apparatus to form another random matrix if a lower privacy bound condition of said decompressed dataset is not satisfied.
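
The regenerate-on-failure behaviour in the dependent claims (query the decompressed dataset, and form another random matrix when an aggregate or lower-privacy-bound condition is not met) could look roughly like the loop below. The claims do not define either condition, so both checks here are illustrative assumptions: privacy is approximated by a minimum reconstruction error, and aggregate quality by the drift of a mean query. All names and thresholds are hypothetical.

```python
import numpy as np


def aggregate_query(data):
    """Hypothetical aggregate query: the mean of the column."""
    return float(np.mean(data))


def release_dataset(x, m, min_error, max_query_drift, max_tries=100, seed=0):
    """Draw random matrices until the decompressed data passes both checks."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    for _ in range(max_tries):
        A = rng.standard_normal((m, n))
        A /= np.linalg.norm(A, axis=0, keepdims=True)  # normalize columns
        x_hat = np.linalg.pinv(A) @ (A @ x)            # compress, then decompress
        # Assumed stand-in for the lower privacy bound: individual entries
        # must be perturbed by at least min_error overall.
        error = np.linalg.norm(x_hat - x)
        # Assumed stand-in for the aggregate condition: the query result
        # must stay within max_query_drift of the true value.
        drift = abs(aggregate_query(x_hat) - aggregate_query(x))
        if error >= min_error and drift <= max_query_drift:
            return x_hat
        # Otherwise form another random matrix on the next iteration.
    raise RuntimeError("no acceptable random matrix found")


x = np.array([52.0, 61.0, 47.0, 58.0, 66.0, 49.0])  # hypothetical column
x_hat = release_dataset(x, m=4, min_error=1.0, max_query_drift=15.0)
print("true mean:    ", aggregate_query(x))
print("released mean:", round(aggregate_query(x_hat), 2))
```

Bounding the number of attempts with `max_tries` is a design choice of this sketch, so that unsatisfiable thresholds fail loudly instead of looping forever.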

16. A computer program product operable to preserve privacy of data in a dataset in a database with a number n of entries comprising a program code stored in a computer readable medium configured to:
- form a random matrix of dimension m by n, wherein m is less than n,
- operate on said dataset with said random matrix to produce a compressed dataset,
- form a pseudoinverse of said random matrix, and
- operate on said dataset with said pseudoinverse of said random matrix to produce a decompressed dataset.

Dependent claims (17, 18, 19, 20):
  - 17. The computer program product as recited in claim 16 wherein said program code stored in said computer readable medium is further configured to normalize columns of said random matrix.
  - 18. The computer program product as recited in claim 16 wherein said program code stored in said computer readable medium is further configured to produce a result of a query on said decompressed dataset.
  - 19. The computer program product as recited in claim 16 wherein said program code stored in said computer readable medium is further configured to form another random matrix if aggregated information of said decompressed dataset is not satisfied.
  - 20. The computer program product as recited in claim 16 wherein said program code stored in said computer readable medium is further configured to form another random matrix if a lower privacy bound condition of said decompressed dataset is not satisfied.

Current Assignee: Telcordia Technologies Incorporated (Telefonaktiebolaget LM Ericsson)
Original Assignee: Telcordia Technologies Incorporated (Telefonaktiebolaget LM Ericsson)
Inventors: Ling, Yibei; DiCrescenzo, Giovanni
Granted Patent: US 9,043,250 B2
US Class Current: 706/12
CPC Class Codes:
G06F 21/6254   by anonymising data, e.g. d...
G06N 20/00   Machine learning
G06N 5/022   Knowledge engineering; Know...
