Differentially private processing and database storage
First Claim
1. A method for returning differentially private results in response to a query to a database storing restricted health data for a plurality of patients, the database storing records comprising rows and columns, where the rows are associated with patients having a medical condition, and columns of the rows contain values describing health data for the patients, the method comprising:
- receiving a database query from a client device, the database query requesting a random forest classifier correlating values of columns in a set of records in the database with medical conditions associated with the rows, wherein rows in the database are labeled with medical conditions from a set of two or more medical conditions and the database query specifies a degree of privacy to maintain for the restricted data in terms of a privacy parameter ϵ describing a degree of information released about a set of data stored in the private database system due to the query;
- performing the database query on the set of records to produce a differentially private version of the random forest classifier that maintains the specified degree of privacy for the restricted data, performing the query comprising:
  - training the random forest classifier upon the values of columns in the set of records and the medical conditions of the labeled rows, wherein the random forest classifier comprises a set of decision trees, each decision tree having one or more leaf nodes, and each leaf node indicating a relative proportion of rows labeled with each category in the leaf node; and
  - producing a differentially private version of the random forest classifier by perturbing relative proportions of rows labeled with each category in each leaf node by;
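The claim text above breaks off before it states how the leaf proportions are perturbed, so the following is only an illustrative sketch and not the claimed mechanism. It assumes a commonly used substitute: Laplace noise added to the per-class row counts in each leaf, followed by renormalization into proportions. The forest representation (a list of trees, each a list of leaves, each leaf a dict from label to row count), the function names, and the even split of ϵ across trees are all hypothetical choices made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_leaf(counts, epsilon, rng):
    """Return noisy class proportions for one leaf.

    counts: dict mapping label -> number of training rows in the leaf.
    ASSUMPTION: Laplace noise of scale 1/epsilon is added to each count;
    the source does not say which noise mechanism the claim uses.
    """
    noisy = {
        label: max(0.0, count + laplace_noise(1.0 / epsilon, rng))
        for label, count in counts.items()
    }
    total = sum(noisy.values())
    if total == 0.0:
        # Every noisy count was clipped to zero: fall back to uniform.
        return {label: 1.0 / len(counts) for label in counts}
    return {label: value / total for label, value in noisy.items()}


def perturb_forest(forest, epsilon, rng=None):
    """Perturb every leaf of every tree in a (hypothetical) forest structure.

    ASSUMPTION: the privacy parameter is split evenly across trees, since
    each row may contribute to one leaf in every tree.
    """
    if epsilon <= 0 or not forest:
        raise ValueError("epsilon must be positive and forest non-empty")
    if rng is None:
        rng = random.Random()
    per_tree_epsilon = epsilon / len(forest)
    return [
        [perturb_leaf(leaf, per_tree_epsilon, rng) for leaf in tree]
        for tree in forest
    ]


if __name__ == "__main__":
    # Two trees; labels "A" and "B" stand in for two medical conditions.
    forest = [
        [{"A": 30, "B": 10}, {"A": 2, "B": 18}],
        [{"A": 5, "B": 5}],
    ]
    for tree in perturb_forest(forest, epsilon=1.0, rng=random.Random(0)):
        print(tree)
```

This sketch perturbs only the leaf statistics. It does not account for any information the tree structure itself may reveal about the data, so it should not be read as establishing a formal privacy guarantee on its own.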
Abstract
A hardware database privacy device is communicatively coupled to a private database system. The hardware database privacy device receives a request from a client device to perform a query of the private database system and identifies a level of differential privacy corresponding to the request. The identified level of differential privacy includes privacy parameters (ε,δ) indicating the degree of information released about the private database system. The hardware database privacy device identifies a set of operations to be performed on the set of data that corresponds to the requested query. After the set of data is accessed, the set of operations is modified based on the identified level of differential privacy such that a performance of the modified set of operations produces a result set that is (ε,δ)-differentially private.
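For reference, the standard textbook definition of (ε,δ)-differential privacy is given here. The symbols M (the randomized mechanism), D and D′ (databases differing in one record), and S (a set of possible outputs) are notation assumed for this note; the patent's specification may state the condition in different terms.

```latex
\Pr\big[\, M(D) \in S \,\big] \;\le\; e^{\varepsilon}\, \Pr\big[\, M(D') \in S \,\big] + \delta
\qquad \text{for all neighboring } D, D' \text{ and all measurable } S .
```

Smaller values of ε and δ mean the output distribution changes less when a single record changes, so less information about any one record is released. Setting δ = 0 recovers pure ε-differential privacy, which is the single-parameter form named in the claims.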
20 Claims
1. A method for returning differentially private results in response to a query to a database storing restricted health data for a plurality of patients, the database storing records comprising rows and columns, where the rows are associated with patients having a medical condition, and columns of the rows contain values describing health data for the patients, the method comprising:
- receiving a database query from a client device, the database query requesting a random forest classifier correlating values of columns in a set of records in the database with medical conditions associated with the rows, wherein rows in the database are labeled with medical conditions from a set of two or more medical conditions and the database query specifies a degree of privacy to maintain for the restricted data in terms of a privacy parameter ϵ describing a degree of information released about a set of data stored in the private database system due to the query;
- performing the database query on the set of records to produce a differentially private version of the random forest classifier that maintains the specified degree of privacy for the restricted data, performing the query comprising:
  - training the random forest classifier upon the values of columns in the set of records and the medical conditions of the labeled rows, wherein the random forest classifier comprises a set of decision trees, each decision tree having one or more leaf nodes, and each leaf node indicating a relative proportion of rows labeled with each category in the leaf node; and
  - producing a differentially private version of the random forest classifier by perturbing relative proportions of rows labeled with each category in each leaf node by;
Dependent claims: 2, 3, 4, 5, 6, 7
8. A non-transitory computer-readable storage medium storing computer program instructions executable by a processor to perform operations for returning differentially private results in response to a query to a database storing restricted health data for a plurality of patients, the database storing records comprising rows and columns, where the rows are associated with patients having a medical condition, and columns of the rows contain values describing health data for the patients, the operations comprising:
- receiving a database query from a client device, the database query requesting a random forest classifier correlating values of columns in a set of records in the database with medical conditions associated with the rows, wherein rows in the database are labeled with medical conditions from a set of two or more medical conditions and the database query specifies a degree of privacy to maintain for the restricted data in terms of a privacy parameter ϵ describing a degree of information released about a set of data stored in the private database system due to the query;
- performing the database query on the set of records to produce a differentially private version of the random forest classifier that maintains the specified degree of privacy for the restricted data, performing the query comprising:
  - training the random forest classifier upon the values of columns in the set of records and the medical conditions of the labeled rows, wherein the random forest classifier comprises a set of decision trees, each decision tree having one or more leaf nodes, and each leaf node indicating a relative proportion of rows labeled with each category in the leaf node; and
  - producing a differentially private version of the random forest classifier by perturbing relative proportions of rows labeled with each category in each leaf node by;
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system comprising:
- a processor for executing computer program instructions; and
- a non-transitory computer-readable storage medium storing computer program instructions executable by the processor to perform operations for returning differentially private results in response to a query to a database storing restricted health data for a plurality of patients, the database storing records comprising rows and columns, where the rows are associated with patients having a medical condition, and columns of the rows contain values describing health data for the patients, the operations comprising:
  - receiving a database query from a client device, the database query requesting a random forest classifier correlating values of columns in a set of records in the database with medical conditions associated with the rows, wherein rows in the database are labeled with medical conditions from a set of two or more medical conditions and the database query specifies a degree of privacy to maintain for the restricted data in terms of a privacy parameter ϵ describing a degree of information released about a set of data stored in the private database system due to the query;
  - performing the database query on the set of records to produce a differentially private version of the random forest classifier that maintains the specified degree of privacy for the restricted data, performing the query comprising:
    - training the random forest classifier upon the values of columns in the set of records and the medical conditions of the labeled rows, wherein the random forest classifier comprises a set of decision trees, each decision tree having one or more leaf nodes, and each leaf node indicating a relative proportion of rows labeled with each category in the leaf node; and
    - producing a differentially private version of the random forest classifier by perturbing relative proportions of rows labeled with each category in each leaf node by;
Dependent claims: 16, 17, 18, 19, 20
Specification