Differentially private linear queries on histograms
First Claim
1. A computer system comprising:
one or more processors; and
one or more computer-readable hardware storage media having stored thereon computer-executable instructions that are executable by the one or more processors to cause the computer system to operate with a computing environment that improves how data is prepared for examination by selectively imposing differential privacy constraints on the data by causing the computer system to:
receive a request to access information included in a dataset that includes confidential information, wherein the dataset has associated therewith a corresponding histogram such that the access request is directed to the dataset's histogram, and wherein the access request is received from a source that is not authorized to view an originating identity of the confidential information;
query the dataset to obtain a response to the access request, wherein the response includes a set of collected information, the set of collected information including at least some of the confidential information;
after determining that the response includes the at least some of the confidential information, apply privacy protections to the set of collected information included in the response, wherein applying the privacy protections to the set of collected information alters the set of collected information and causes the originating identity of the confidential information to be hidden; and
after applying the privacy protections to the set of collected information included in the response, return the response to the source of the access request, and in such a manner that the response is viewable without causing the at least some of the confidential information to be discernable.
Abstract
The privacy of linear queries on histograms is protected. A database containing private data is queried. Base decomposition is performed to recursively compute an orthonormal basis for the database space. Using correlated (or Gaussian) noise and/or least squares estimation, an answer having differential privacy is generated and provided in response to the query. In some implementations, the differential privacy is ε-differential privacy (pure differential privacy) or is (ε,δ)-differential privacy (i.e., approximate differential privacy). In some implementations, the data in the database may be dense. Such implementations may use correlated noise without using least squares estimation. In other implementations, the data in the database may be sparse. Such implementations may use least squares estimation with or without using correlated noise.
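As a point of reference for the (ε,δ) case mentioned in the abstract, the following is a minimal sketch of the classical Gaussian mechanism applied to linear queries on a histogram. It adds independent Gaussian noise to each answer; it does not implement the base decomposition, the correlated noise, or the least squares estimation described in the abstract. The function names, the add-or-remove-one-record neighbor model, and the calibration sigma = Δ2 · sqrt(2 ln(1.25/δ)) / ε (valid for 0 < ε < 1) are assumptions made for illustration.

```python
import math
import random


def l2_sensitivity(queries):
    """Largest column L2 norm of the query matrix.

    Assumes neighboring datasets differ by one record, so exactly one
    histogram bin changes by 1 and the answers move by one column.
    """
    n_bins = len(queries[0])
    return max(
        math.sqrt(sum(row[j] ** 2 for row in queries)) for j in range(n_bins)
    )


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale for the classical Gaussian mechanism (needs 0 < epsilon < 1)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_linear_queries(queries, histogram, epsilon, delta, rng):
    """Return noisy answers to the linear queries `queries` on `histogram`."""
    sigma = gaussian_sigma(l2_sensitivity(queries), epsilon, delta)
    exact = [sum(a * x for a, x in zip(row, histogram)) for row in queries]
    # Independent noise per answer; the abstract's correlated noise is not modeled.
    return [value + rng.gauss(0.0, sigma) for value in exact]


if __name__ == "__main__":
    histogram = [12, 7, 3]              # hypothetical bin counts
    queries = [[1, 1, 0], [0, 1, 1]]    # two range-sum queries over the bins
    print(private_linear_queries(queries, histogram, 0.5, 1e-5, random.Random(0)))
```

Because the noise scale depends only on the query matrix and the privacy parameters, it can be computed once and reused for every histogram answered with the same queries.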
21 Claims
1. A computer system comprising:
one or more processors; and
one or more computer-readable hardware storage media having stored thereon computer-executable instructions that are executable by the one or more processors to cause the computer system to operate with a computing environment that improves how data is prepared for examination by selectively imposing differential privacy constraints on the data by causing the computer system to:
receive a request to access information included in a dataset that includes confidential information, wherein the dataset has associated therewith a corresponding histogram such that the access request is directed to the dataset's histogram, and wherein the access request is received from a source that is not authorized to view an originating identity of the confidential information;
query the dataset to obtain a response to the access request, wherein the response includes a set of collected information, the set of collected information including at least some of the confidential information;
after determining that the response includes the at least some of the confidential information, apply privacy protections to the set of collected information included in the response, wherein applying the privacy protections to the set of collected information alters the set of collected information and causes the originating identity of the confidential information to be hidden; and
after applying the privacy protections to the set of collected information included in the response, return the response to the source of the access request, and in such a manner that the response is viewable without causing the at least some of the confidential information to be discernable.
Dependent claims: 2, 3, 4, 5, 6, 7, 21
8. A method for operating a computing environment that improves how data is prepared for examination by selectively imposing differential privacy constraints on the data, the method being implemented by one or more processors of a computer system that operates with the computing environment, the method comprising:
receiving a request to access information included in a dataset that includes confidential information, wherein the dataset has associated therewith a corresponding histogram such that the access request is directed to the dataset's histogram, and wherein the access request is received from a source that is not authorized to view an originating identity of the confidential information;
querying the dataset to obtain a response to the access request, wherein the response includes a set of collected information, the set of collected information including at least some of the confidential information;
after determining that the response includes the at least some of the confidential information, applying privacy protections to the set of collected information included in the response, wherein applying the privacy protections to the set of collected information alters the set of collected information and causes the originating identity of the confidential information to be hidden; and
after applying the privacy protections to the set of collected information included in the response, returning the response to the source of the access request, and in such a manner that the response is viewable without causing the at least some of the confidential information to be discernable.
Dependent claims: 9, 10, 11, 12, 13, 14
15. One or more hardware storage devices having stored thereon computer-executable instructions that are executable by one or more processors of a computer system to cause the computer system to operate with a computing environment that improves how data is prepared for examination by selectively imposing differential privacy constraints on the data by causing the computer system to:
receive a request to access information included in a dataset that includes confidential information, wherein the dataset has associated therewith a corresponding histogram such that the access request is directed to the dataset's histogram, and wherein the access request is received from a source that is not authorized to view an originating identity of the confidential information;
query the dataset to obtain a response to the access request, wherein the response includes a set of collected information, the set of collected information including at least some of the confidential information;
after determining that the response includes the at least some of the confidential information, apply privacy protections to the set of collected information included in the response, wherein applying the privacy protections to the set of collected information alters the set of collected information and causes the originating identity of the confidential information to be hidden; and
after applying the privacy protections to the set of collected information included in the response, return the response to the source of the access request, and in such a manner that the response is viewable without causing the at least some of the confidential information to be discernable.
Dependent claims: 16, 17, 18, 19, 20
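The receive, query, protect, and return steps recited in the independent claims can be illustrated with a small sketch. It is an assumption-laden illustration, not the claimed implementation: the privacy protection shown is the standard Laplace mechanism for ε-differential privacy on histogram counts, the neighbor model is add-or-remove one record (L1 sensitivity 1), and the names build_histogram and answer_histogram_request are hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_histogram(records, bins):
    """Query step: count how many records fall in each bin."""
    counts = Counter(records)
    return [counts.get(b, 0) for b in bins]


def answer_histogram_request(records, bins, epsilon, rng):
    """Receive a histogram request, query the data, add noise, return the result.

    Assumes one record affects exactly one bin by 1, so Laplace noise with
    scale 1/epsilon on every count gives epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    exact = build_histogram(records, bins)           # response with confidential data
    scale = 1.0 / epsilon
    protected = [c + laplace_noise(scale, rng) for c in exact]  # altered response
    return protected                                 # only the noisy counts leave


if __name__ == "__main__":
    records = ["a", "b", "a", "c", "a"]   # hypothetical confidential records
    bins = ["a", "b", "c", "d"]
    print(answer_histogram_request(records, bins, 0.5, random.Random(0)))
```

The exact counts never leave the function; the requester sees only the perturbed histogram, which mirrors the ordering of the claimed steps (protections are applied before the response is returned).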
Specification