Differentially private aggregate classifier for multiple databases

Abstract
Embodiments of the invention disclose a system and a method for determining a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, and wherein the differentially private aggregate classifier preserves the differential data privacy of each database. The differentially private aggregate classifier is a combination of the classifiers of the set of databases modified with the noise value corresponding to a smallest database in the set of databases.
33 Citations
PRIVACY-PRESERVING AGGREGATION OF TIME-SERIES DATA
Patent #
US 20120204026A1
Filed 02/04/2011

Current Assignee
Palo Alto Research Center Inc.

Sponsoring Entity
Palo Alto Research Center Inc.

System and Method of Securing Private Health Information  
Patent #
US 20130086390A1
Filed 10/22/2011

Current Assignee
Aaron Michael Lewis, Todd Michael Kennedy

Sponsoring Entity
Aaron Michael Lewis, Todd Michael Kennedy

Privacy-preserving aggregation of Time-series data
Patent #
US 8,555,400 B2
Filed 02/04/2011

Current Assignee
Palo Alto Research Center Inc.

Sponsoring Entity
Palo Alto Research Center Inc.

METHOD AND APPARATUS FOR NEARLY OPTIMAL PRIVATE CONVOLUTION  
Patent #
US 20150286827A1
Filed 11/27/2013

Current Assignee
Nadia Fawaz, Aleksandar Todorov Nikolov, Thomson Licensing

Sponsoring Entity
Nadia Fawaz, Aleksandar Todorov Nikolov, Thomson Licensing

Method for determining hidden states of systems using privacy-preserving distributed data analytics
Patent #
US 9,246,978 B2
Filed 11/11/2013

Current Assignee
Mitsubishi Electric Research Laboratories

Sponsoring Entity
Mitsubishi Electric Research Laboratories

Method and system for determining hidden states of a machine using privacy-preserving distributed data analytics and a semi-trusted server and a third-party
Patent #
US 9,471,810 B2
Filed 03/09/2015

Current Assignee
Mitsubishi Electric Research Laboratories

Sponsoring Entity
Mitsubishi Electric Research Laboratories

Differentially private linear queries on histograms  
Patent #
US 9,672,364 B2
Filed 03/15/2013

Current Assignee
Microsoft Technology Licensing LLC

Sponsoring Entity
Microsoft Technology Licensing LLC

PRIVACY-AWARE QUERY MANAGEMENT SYSTEM
Patent #
US 20170169253A1
Filed 12/10/2015

Current Assignee
Neustar Incorporated

Sponsoring Entity
Neustar Incorporated

Emoji frequency detection and deep link frequency  
Patent #
US 9,705,908 B1
Filed 09/24/2016

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Emoji frequency detection and deep link frequency  
Patent #
US 9,712,550 B1
Filed 09/24/2016

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Emoji frequency detection and deep link frequency  
Patent #
US 9,894,089 B2
Filed 06/19/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Privacy-aware query management system
Patent #
US 10,108,818 B2
Filed 12/10/2015

Current Assignee
Neustar Incorporated

Sponsoring Entity
Neustar Incorporated

Differentially private linear queries on histograms  
Patent #
US 10,121,024 B2
Filed 05/04/2017

Current Assignee
Microsoft Technology Licensing LLC

Sponsoring Entity
Microsoft Technology Licensing LLC

Learning new words  
Patent #
US 10,133,725 B2
Filed 04/03/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Emoji frequency detection and deep link frequency  
Patent #
US 10,154,054 B2
Filed 06/30/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private processing and database storage  
Patent #
US 10,192,069 B2
Filed 07/07/2016

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Method for differentially private aggregation in a star topology under a realistic adversarial model  
Patent #
US 10,223,547 B2
Filed 10/11/2016

Current Assignee
Palo Alto Research Center Inc.

Sponsoring Entity
Palo Alto Research Center Inc.

Differentially private processing and database storage  
Patent #
US 10,229,287 B2
Filed 10/25/2017

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Efficient implementation for differential privacy using cryptographic functions  
Patent #
US 10,229,282 B2
Filed 09/23/2016

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private processing and database storage  
Patent #
US 10,242,224 B2
Filed 10/25/2017

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Tracking privacy budget with distributed ledger  
Patent #
US 10,380,366 B2
Filed 04/25/2017

Current Assignee
SAP SE

Sponsoring Entity
SAP SE

Differentially private database permissions system  
Patent #
US 10,430,605 B1
Filed 11/29/2018

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Emoji frequency detection and deep link frequency  
Patent #
US 10,454,962 B2
Filed 10/12/2018

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private database queries involving rank statistics  
Patent #
US 10,467,234 B2
Filed 07/19/2018

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Differentially private density plots  
Patent #
US 10,489,605 B2
Filed 04/23/2018

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Efficient implementation for differential privacy using cryptographic functions  
Patent #
US 10,552,631 B2
Filed 03/08/2019

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private processing and database storage  
Patent #
US 10,586,068 B2
Filed 01/02/2019

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

User experience using privatized crowdsourced data  
Patent #
US 10,599,867 B2
Filed 11/07/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

User experience using privatized crowdsourced data  
Patent #
US 10,599,868 B2
Filed 11/07/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private budget tracking using Renyi divergence  
Patent #
US 10,642,847 B1
Filed 05/09/2019

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

Learning new words  
Patent #
US 10,701,042 B2
Filed 10/12/2018

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differential privacy using a multi-bit histogram
Patent #
US 10,726,139 B2
Filed 09/30/2017

Current Assignee
Apple Inc.

Sponsoring Entity
Apple Inc.

Differentially private machine learning using a random forest classifier  
Patent #
US 10,726,153 B2
Filed 09/27/2018

Current Assignee
LeapYear Technologies Inc.

Sponsoring Entity
LeapYear Technologies Inc.

No References
15 Claims
 1. A method for determining a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, and wherein the differentially private aggregate classifier preserves the differential data privacy of each database, comprising the steps of:
combining classifiers to determine an aggregate classifier; and modifying the aggregate classifier with a noise value corresponding to a smallest database in the set of databases to produce the differentially private aggregate classifier, wherein the steps of the method are performed by a processor.  View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
 10. A system for determining a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, and wherein the differentially private aggregate classifier preserves the differential data privacy of each database, comprising:
means for combining classifiers to determine an aggregate classifier; and means for modifying the aggregate classifier with a noise value corresponding to a smallest database in the set of databases to produce the differentially private aggregate classifier.  View Dependent Claims (11, 12, 13, 14)
 15. A computer readable medium storing a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, wherein the differentially private aggregate classifier is a combination of the classifiers of the set of databases modified with the noise value corresponding to a smallest database in the set of databases.
Specification
The invention relates generally to differential data privacy, and more particularly to determining a differentially private aggregate classifier for multiple databases.
Data collection provides information for a wide variety of academic, industrial, business, and government purposes. For example, data collection is necessary for sociological studies, market research, and censuses. To maximize the utility of collected data, all data can be amassed and made available for analysis without any privacy controls. Of course, most people and organizations (“privacy principals”) are unwilling to disclose all of their data, especially when data are easily exchanged and could be accessed by unauthorized persons. Privacy guarantees can improve the willingness of privacy principals to contribute their data, and can reduce fraud, identity theft, extortion, and other problems that arise from sharing data without adequate privacy protection.
A method for preserving privacy is to compute collective results of queries performed over collected data, and disclose such collective results without disclosing the inputs of the participating privacy principals. For example, a medical database might be queried to determine how many people in the database are HIV positive. The total number of people that are HIV positive can be disclosed without disclosing the names of the individuals that are HIV positive. Useful data are thus extracted while ostensibly preserving the privacy of the principals to some extent.
However, adversaries might apply a variety of techniques to predict or narrow down the set of individuals from the medical database who are likely to be HIV positive. For example, an adversary might run another query that asks how many people both have HIV and are not named John Smith. The adversary may then subtract the second query output from the first, and thereby learn the HIV status of John Smith without ever directly asking the database for a name of a privacy principal. With sensitive data, it is useful to provide verifiable privacy guarantees. For example, it would be useful to verifiably guarantee that nothing more can be gleaned about any specific privacy principal than was known at the outset.
Adding noise to a query output can enhance the privacy of the principals. Using the example above, some random number might be added to the disclosed number of HIV positive principals. The noise will decrease the accuracy of the disclosed output, but the corresponding gain in privacy may warrant this loss.
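As an illustrative sketch of this idea (the function name and inverse-CDF sampler are our own, and the 1/ε noise scale assumes a sensitivity-1 counting query, not anything specified in this document):

```python
import math
import random

def noisy_count(true_count, epsilon, rng=random.Random(0)):
    """Return a count perturbed with Laplace noise of scale 1/epsilon.

    A counting query changes by at most 1 when a single record is added
    or removed (sensitivity 1), so Laplace noise with scale 1/epsilon
    suffices for epsilon-differential privacy of this single query.
    """
    scale = 1.0 / epsilon
    # Inverse-CDF sampling of Laplace(0, scale).
    u = rng.random() - 0.5
    noise = -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))
    return true_count + noise
```

A larger ε adds less noise (more accuracy, weaker privacy); a smaller ε adds more, which is exactly the accuracy/privacy trade-off described above.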
The concept of adding noise to a query result to preserve the privacy of the principals is generally known. One method uses differentially private classifiers for protecting the privacy of individual data instances using added noise. A classifier evaluated over a database is said to satisfy differential privacy if the probability of the classifier producing a particular output is almost the same regardless of the presence or absence of any individual data instance in the database.
However, conventional differentially private classifiers are determined locally for each database and fail to provide privacy when such classifiers must be used over multiple databases. Accordingly, there is a need to determine such a classifier for a set of databases that preserves the differential data privacy of each database.
Differential privacy provides statistical guarantees that the output of a classifier does not include information about individual data instances. However, in multiparty applications, data for determining a classifier are distributed across several databases, and conventional differential privacy methods do not preserve differential data privacy for multiple contributing parties.
This is because conventional methods are inherently designed for the case where the classifier is determined with access to the entire data of a database and is modified by a noise value computed over those data to produce a differentially private classifier specific to that data. However, in multiparty applications, it is often impossible to access the data of the different databases due to security constraints.
Embodiments of the invention are based on a realization that for a set of databases, a differentially private aggregate classifier preserving the differential data privacy of each database can be determined from the classifiers and the noise values of individual databases in the set of databases, without allowing access to the data of the databases.
However, in multiparty applications, adding the noise value to the classifier is no longer straightforward, because there is no guarantee that the added noise results in the differential data privacy of each database. For example, aggregating all classifiers and noise values, which might seem the logical approach, does not satisfy differential privacy for the combined data.
Embodiments of the invention are based on another realization that a differentially private aggregate classifier can be determined as an aggregation of classifiers of each database modified by a noise value corresponding to a smallest database in the set of databases. The smallest database has the smallest number of entries, wherein a data structure of each entry is the same across all databases. The proof for correctness of this realization is provided in the Appendix.
Accordingly, one embodiment of the invention discloses a method for determining a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure differential data privacy of the database, and wherein the differentially private aggregate classifier preserves the differential data privacy of each database, comprising the steps of: combining classifiers of each database to determine an aggregate classifier; and modifying the aggregate classifier with a noise value corresponding to a smallest database, wherein the smallest database has the smallest number of entries, wherein a data structure of each entry is the same for all databases.
Moreover, various embodiments of the invention determine the differentially private aggregate classifier securely using cryptographic protocols. Those embodiments ensure that the data of each database are not shared with any other party and the differentially private aggregate classifier cannot be reverse engineered to learn about any individual data instances of any database.
Another embodiment discloses a system for determining a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, and wherein the differentially private aggregate classifier preserves the differential data privacy of each database, comprising: means for combining classifiers to determine an aggregate classifier; and means for modifying the aggregate classifier with a noise value corresponding to a smallest database in the set of databases to produce the differentially private aggregate classifier.
Yet another embodiment discloses a computer readable medium storing a differentially private aggregate classifier for a set of databases, wherein each database in the set of databases is associated with a classifier and a noise value, wherein the classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensure a differential data privacy of the database, wherein the differentially private aggregate classifier is a combination of the classifiers of the set of databases modified with the noise value corresponding to a smallest database in the set of databases.
In describing embodiments of the invention, the following definitions are applicable throughout (including above).
A “computer” refers to any apparatus that is capable of accepting a structured input, processing the structured input according to prescribed rules, and producing results of the processing as output. Examples of a computer include a computer; a general-purpose computer; a supercomputer; a mainframe; a super minicomputer; a minicomputer; a workstation; a microcomputer; a server; an interactive television; a hybrid combination of a computer and an interactive television; and application-specific hardware to emulate a computer and/or software. A computer can have a single processor or multiple processors, which can operate in parallel and/or not in parallel. A computer also refers to two or more computers connected together via a network for transmitting or receiving information between the computers. An example of such a computer includes a distributed computer system for processing information via computers linked by a network.
A “central processing unit (CPU)” or a “processor” refers to a computer or a component of a computer that reads and executes software instructions.
A “memory” or a “computer-readable medium” refers to any storage for storing data accessible by a computer. Examples include a magnetic hard disk; a floppy disk; an optical disk, like a CD-ROM or a DVD; a magnetic tape; a memory chip; a carrier wave used to carry computer-readable electronic data, such as those used in transmitting and receiving email or in accessing a network; and a computer memory, e.g., random-access memory (RAM).
“Software” refers to prescribed rules to operate a computer. Examples of software include software; code segments; instructions; computer programs; and programmed logic. Software of intelligent systems may be capable of self-learning.
A “module” or a “unit” refers to a basic component in a computer that performs a task or part of a task. It can be implemented by either software or hardware.
As shown in
Each database 120-130 in the set of databases 110 is associated with a classifier, e.g., the classifiers 121 and 131, and a noise value, e.g., the noise values 122 and 132. For example, the databases are denoted as D_{1}, . . . , D_{K}, where D_{i}=(x, y)_{i} includes a set of entries x and corresponding binary labels y. The classifier and the noise value are determined locally for each database, such that a combination of the classifier and the noise value ensures the differential data privacy 125 or 135 of the database. Determined locally means that the classifiers and the noise values are determined independently by each owner of the database before or concurrently with the execution of the method 100.
Typically, to ensure the differential privacy, the noise value is determined over the entire set of data entries of the database, depending on the size 123 or 133 of the database. As used herein, the size of the database is based on the number of entries. An entry can have any data structure; however, the data structure of each entry is the same across all databases. Examples of an entry are a field, a row in a table, a table itself, a file, or another database.
Embodiments of the invention are based on a realization that a differentially private aggregate classifier from the union of the databases D_{1}∪D_{2 }. . . ∪D_{K }can be determined as an aggregation of classifiers of each database modified by a noise value corresponding to the smallest database in the set of databases. The smallest database has the smallest number of entries, wherein a data structure of each entry is the same for all databases. The proof for correctness of this realization is provided in the Appendix A.
Accordingly, one embodiment combines 140 the classifiers of each database to determine an aggregate classifier 145. Additionally, the embodiment determines 150 the noise value 155 corresponding to the smallest database. Next, the aggregate classifier 145 is modified 160 with the noise value 155 to produce the differentially private aggregate classifier 170. The differentially private aggregate classifier is published, e.g., stored in a memory 175 or distributed over the Internet.
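The combine-and-perturb steps can be sketched in plaintext as follows; this is only an illustration with our own function names, using the Lap(2/(n_{(1)}ελ)) noise scale stated later in the specification, and it omits the cryptographic protocol that keeps each party's classifier and database size hidden:

```python
import math
import random

def aggregate_classifier(classifiers, sizes, epsilon, lam, rng=random.Random(0)):
    """Average locally trained weight vectors and add Laplace noise
    calibrated to the *smallest* database.

    classifiers: list of weight vectors (one per database).
    sizes: number of entries n_j in each database.
    Each output dimension gets noise ~ Lap(2 / (n_min * epsilon * lam)).
    """
    K = len(classifiers)
    d = len(classifiers[0])
    scale = 2.0 / (min(sizes) * epsilon * lam)

    def lap():
        # Inverse-CDF sample from Laplace(0, scale).
        u = rng.random() - 0.5
        return -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))

    return [sum(w[i] for w in classifiers) / K + lap() for i in range(d)]
```

With very large databases the noise scale shrinks and the output approaches the plain average of the local classifiers.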
Differential Privacy
According to the definition of a differential privacy model, given any two databases D and D′ differing by one element, i.e., adjacent databases, a classifier defined by a randomized query function M is differentially private, if the probability that the function M produces a response S on the database D is similar to the probability that the function M produces the same response S on the database D′. As the query output is almost the same in the presence or absence of an individual entry with high probability, almost nothing can be learned about any individual entry from the output.
The randomized function M with a well-defined probability density P satisfies ε-differential privacy if, for all adjacent databases D and D′ and for any S ∈ range(M), P(M(D)=S) ≤ exp(ε)·P(M(D′)=S).
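The bound can be checked numerically for the standard Laplace mechanism on a sensitivity-1 count (a textbook instance; the helper below is ours, not from this document): for adjacent counts 100 and 101 and scale 1/ε, the density ratio never exceeds exp(ε).

```python
import math

def laplace_pdf(x, mu, b):
    """Density of the Laplace distribution with mean mu and scale b."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

eps = 0.5
b = 1.0 / eps  # scale for a sensitivity-1 counting query
# Adjacent databases yield counts 100 and 101; the privacy-loss ratio
# exp((|x-101| - |x-100|) / b) is at most exp(1/b) = exp(eps) for all x.
worst = max(laplace_pdf(x, 100, b) / laplace_pdf(x, 101, b)
            for x in [-5.0, 0.0, 99.5, 100.0, 107.3])
```

The ratio attains its maximum exp(ε) for outputs on the far side of both counts, which is exactly the ε-differential privacy guarantee.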
Accordingly, the differentially private classifier guarantees that no additional details about the individual entries can be obtained with certainty from the output of the learning algorithm, beyond the a priori background knowledge. Differential privacy provides an ad omnia guarantee, as opposed to most other models that provide ad hoc guarantees against a specific set of attacks and adversarial behaviors. By evaluating the differentially private classifier over a large number of entries, an adversary cannot learn the exact form of the data.
A differential diameter 240 and privacy parameter ε can be used in calculating each of the distributions in
Typically, the classifiers are made differentially private by adding a noise value to the weights of the classifier, where the noise value is selected from the distribution described above. Further, the parameters of the distribution depend on the degree of desired privacy expressed by the parameter ε, which usually depends on the size of the database, and on the type of the function of the classifier, e.g., an average, maximum, or logarithm function. In one embodiment, the noise values have a Laplace distribution.
Determining Classifiers Locally on Individual Databases
Each database owner P_{j} uses its database (x, y)_{j} to determine the classifier with weights w_{j}, wherein j is an index of the database. One embodiment uses l_{2}-regularized logistic regression for the classifiers. For example, the classifier can be determined by minimizing the objective function

J(w_{j}) = (1/n_{j}) Σ_{i=1}^{n_{j}} log(1 + exp(−y_{i}w_{j}^{T}x_{i})) + λw_{j}^{T}w_{j},

where λ>0 is a regularization parameter, and T is a transpose operator. However, the classifiers are determined locally for each individual database and no data or information are shared.
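For illustration, a minimal batch gradient-descent trainer for this type of objective (our own sketch; the document does not specify an optimization method, and the λ/2 coefficient on the regularizer here is one common convention rather than anything stated above):

```python
import math

def train_logistic(data, lam=0.1, lr=0.1, iters=500):
    """l2-regularized logistic regression by batch gradient descent.

    data: list of (x, y) pairs, x a feature list, y in {-1, +1}.
    Minimizes (1/n) * sum_i log(1 + exp(-y_i * w.x_i)) + (lam/2) * w.w.
    """
    d, n = len(data[0][0]), len(data)
    w = [0.0] * d
    for _ in range(iters):
        grad = [lam * wi for wi in w]  # gradient of the (lam/2) w.w term
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # d/dw of log(1 + exp(-margin)) is -y*x / (1 + exp(margin)).
            coef = -y / ((1.0 + math.exp(margin)) * n)
            for i in range(d):
                grad[i] += coef * x[i]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w
```

On linearly separable data the learned weights classify every training point correctly, with the regularizer keeping the weights bounded.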
Example of Differentially Private Aggregate Classifier
One embodiment of the invention defines the differentially private aggregate classifier w^{s} 170 according to

w^{s} = (1/K) Σ_{j=1}^{K} w_{j} + η,

where K is the number of databases in the set of databases, j is the index of the database, and η is a d-dimensional random variable sampled from a Laplace (Lap) distribution scaled with the parameter 2/(n_{(1)}ελ), where n_{(1)} is the number of entries in the smallest database, i.e., n_{(1)}=min_{j}n_{j}, λ is the regularization parameter, and ε is the differential privacy parameter.
The differentially private aggregate classifier w^{s} incurs only a well-bounded excess risk over training a classifier directly on the union of all data, while enabling the parties to maintain their privacy. The noise value η ensures that the classifier w^{s} satisfies differential privacy, i.e., that individual data instances cannot be discerned from the classifier.
The definition of the noise value η above is not intuitive, but we have proved that the performance of a differentially private aggregate classifier constructed by aggregating locally trained classifiers is limited by the performance of the classifier trained on the database with the least number of entries.
Some embodiments of the invention are based on the realization that the owners of the databases P_{j} cannot simply take their locally trained classifiers w_{j}, perturb them with a noise vector, and publish the perturbed classifiers, because aggregating such classifiers does not give the correct noise value η ~ Lap(2/(n_{(1)}ελ)) that ensures differential privacy. Also, because individual database owners cannot simply add noise to their classifiers to impose differential privacy for all other classifiers, the actual averaging operation must be performed such that neither the individual classifiers nor the number of entries in each database are exposed. Accordingly, some embodiments use a secure multiparty computation (SMC) method for interacting with a processor to perform the averaging. The outcome of the method is that each of the database owners obtains additive shares of the desired differentially private classifier w^{s}, and these shares are added to obtain the differentially private aggregate classifier.
Secure Multiparty Computation (SMC) Method
The embodiments use asymmetric-key additively homomorphic encryption. A desirable property of such encryption is that operations performed on ciphertext elements map onto known operations on the underlying plaintext elements. For an additively homomorphic encryption function ξ(•), this means that for any a and b, ξ(a)ξ(b)=ξ(a+b) and ξ(a)^{b}=ξ(ab).
The additively homomorphic encryption is semantically secure, i.e., repeated encryption of the same plaintext results in different ciphertexts. For the SMC method, encryption keys are considered public and decryption keys are privately owned by the specified database owners.
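As a concrete toy instance of such a scheme, the sketch below implements textbook Paillier encryption with deliberately tiny primes, purely to demonstrate the two homomorphic identities above; real deployments use large primes and a vetted cryptographic library, and the function names here are our own.

```python
import math
import random

def paillier_keygen(p=10007, q=10009):
    """Toy Paillier keypair; primes are far too small for real security."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    # With generator g = n + 1, decryption needs mu = lam^-1 mod n.
    return (n, n + 1), (lam, pow(lam, -1, n), n)

def encrypt(pk, m, rng=random.Random(1)):
    n, g = pk
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:       # r must be invertible mod n
        r = rng.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    lam, mu, n = sk
    # L(x) = (x - 1) // n applied to c^lam mod n^2 yields m * lam mod n.
    return ((pow(c, lam, n * n) - 1) // n * mu) % n
```

Multiplying two ciphertexts adds the plaintexts, and raising a ciphertext to a constant multiplies the plaintext, i.e. ξ(a)ξ(b)=ξ(a+b) and ξ(a)^b=ξ(ab) as stated above.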
Determining an Obfuscated Index of the Smallest Database
The processor determines 310 an obfuscated index 315 of a smallest database based on permuted indexes 320 resulting from a permutation of indexes of the databases. For example, each database owner, i.e., a party P_{j}, computes n_{j}=a_{j}+b_{j}, where a_{j }and b_{j }are integers representing additive shares of the database lengths n_{j }for j=1, 2, . . . , K. The Klength vectors of additive shares are defined as a and b, respectively.
The parties P_{j }mutually agree on a permutation π_{1 }on the index vector (1, 2, . . . , K). This permutation is unknown to the processor. Then, each party P_{j }transmits its share a_{j }to a representative party P_{π}_{1}_{(j)}, and transmits its share b_{j }to the processor with the index changed according to the permutation. Thus, after this step, the parties have permuted additive shares given by π_{1}(a) while the processor has permuted additive shares π_{1}(b).
The parties P_{j }generate a key pair (pk, sk) where pk is a public key for homomorphic encryption and sk is the secret decryption key known only to the parties, but not to the processor. The elementwise encryption of a is defined as ξ(a). The parties send ξ(π_{1}(a))=π_{1}(ξ(a)) to the processor.
The processor generates a random vector r=(r_{1}, r_{2}, . . . , r_{K}) where the elements r_{i }are integers selected uniformly at random and are equally likely to be positive or negative. Then, the processor computes ξ(π_{1}(a_{j}))ξ(r_{j})=ξ(π_{1}(a_{j})+r_{j}). In vector notation, the processor computes ξ(π_{1}(a)+r).
Similarly, by subtracting the same random integers in the same order from the received additive shares, the processor obtains π_{1}(b)−r, selects a permutation π_{2} at random and obtains a signal
π_{2}(ξ(π_{1}(a)+r))=ξ(π_{2}(π_{1}(a)+r)),
and a signal π_{2}(π_{1}(b)−r). The processor transmits the signal ξ(π_{2}(π_{1}(a)+r)) to the individual parties in order, e.g., the first element to the first party P_{1}, the second element to the second party P_{2}, . . . , and the K^{th} element to the party P_{K}.
Each party decrypts the signal received from the processor, i.e., the parties P_{1}, P_{2}, . . . , P_{K }respectively possess the elements of the vector π_{2}(π_{1}(a)+r) while the processor possesses the vector π_{2}(π_{1}(b)−r). Because π_{1 }is unknown to the processor and π_{2 }is unknown to the parties, the indices in both vectors are obfuscated.
If π_{2}(π_{1}(a)+r)=ã and π_{2}(π_{1}(b)−r)=b̃, then n_{i}>n_{j} ⇔ ã_{i}+b̃_{i}>ã_{j}+b̃_{j} ⇔ ã_{i}−ã_{j}>b̃_{j}−b̃_{i}.
For each (i, j) pair with i, j ∈ {1, 2, . . . , K}, these comparisons can be solved by any implementation of a secure millionaire protocol. When all the comparisons are done, the processor determines the index j̃ 325 such that ã_{j̃}+b̃_{j̃}=min_{j}n_{j}. However, the true index corresponding to the smallest database is obfuscated.
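A plaintext simulation of this permute-and-mask step (no actual encryption or millionaire protocol; the function name and share ranges are our own) shows why the masked, doubly permuted shares still order the true database sizes while hiding which party holds the smallest one:

```python
import random

def obfuscated_argmin(sizes, rng=random.Random(7)):
    """Simulate the share-permute-mask pipeline and recover the true
    index of the smallest database.

    Each size n_j is split as a_j + b_j; both share vectors are permuted
    by pi1 (chosen by the parties), masked with +r / -r by the processor,
    and permuted again by pi2 (chosen by the processor). The masks cancel
    in a~_i + b~_i, so comparisons reveal only an obfuscated index.
    """
    K = len(sizes)
    a = [rng.randrange(-10**6, 10**6) for _ in range(K)]
    b = [n - ai for n, ai in zip(sizes, a)]      # additive shares of sizes
    pi1 = list(range(K)); rng.shuffle(pi1)       # parties' permutation
    r = [rng.randrange(-10**6, 10**6) for _ in range(K)]
    a1 = [a[pi1[i]] + r[i] for i in range(K)]    # masked party shares
    b1 = [b[pi1[i]] - r[i] for i in range(K)]    # masked processor shares
    pi2 = list(range(K)); rng.shuffle(pi2)       # processor's permutation
    at = [a1[pi2[i]] for i in range(K)]
    bt = [b1[pi2[i]] for i in range(K)]
    # In the real protocol these pairwise comparisons run as a secure
    # millionaire protocol on (at, bt); here we compare in the clear.
    j_obf = min(range(K), key=lambda i: at[i] + bt[i])
    return pi1[pi2[j_obf]]                       # undo both permutations
```

Because ã_i + b̃_i equals the size of the database at position π_1(π_2(i)), the comparisons are exact, yet neither side alone can map the winning position back to a party.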
Selecting Obliviously First Additive Share of Noise Value of Smallest Database
Based on the obfuscated index 315, the processor selects 330 obliviously from additive shares 340 of all noise values a first additive share 335 of a noise value associated with the smallest database. A second additive share 360 of the noise value associated with the smallest database is stored by one or more databases.
For example, the processor constructs an indicator vector u of length K such that u_{j̃}=1 and all other elements are 0. Then the processor permutes the indicator vector to produce a permuted vector π_{2}^{−1}(u), where π_{2}^{−1} inverts π_{2}. Next, the processor generates a key pair (pk′, sk′) for an additively homomorphic function ζ(•), where only the encryption key pk′ is publicly available to the parties P_{j}, and transmits ζ(π_{2}^{−1}(u))=π_{2}^{−1}(ζ(u)) to the parties P_{j}.
The parties mutually obtain a permuted vector π_{1}^{−1}(π_{2}^{−1}(ζ(u)))=ζ(v) where π_{1}^{−1 }inverts the permutation π_{1 }originally applied by the parties P_{j }and v is the underlying vector. Now that both permutations have been removed, the index of the nonzero element in the indicator vector v corresponds to the true index of the smallest database. However, since the parties P_{j }cannot decrypt ζ(•), the parties cannot find out this index.
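The permutation bookkeeping of this step can be sketched as follows (the encryption ζ is omitted; the permutations and the obfuscated index are example values):

```python
# Sketch of the permutation removal only: undoing pi2 and then pi1 moves the
# single 1 in the indicator vector from the obfuscated index j~ to the true
# index of the smallest database.  Encryption is omitted; in the protocol the
# parties perform this on ciphertexts and therefore never see where the 1 is.
K = 5
pi1 = [2, 0, 4, 1, 3]          # parties' permutation (example values)
pi2 = [3, 1, 0, 4, 2]          # processor's permutation (example values)

def permute(perm, xs):
    return [xs[k] for k in perm]

def invert(perm):
    inv = [0] * len(perm)
    for pos, k in enumerate(perm):
        inv[k] = pos
    return inv

j_tilde = 1                    # obfuscated index of the smallest database
u = [1 if k == j_tilde else 0 for k in range(K)]

v = permute(invert(pi1), permute(invert(pi2), u))
assert v.index(1) == pi1[pi2[j_tilde]]   # the 1 now sits at the true index
```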
For j=1, . . . , K, party P_j selects the noise value η_j. In one embodiment, the noise value is a d-dimensional noise vector sampled from a Laplace distribution
with parameter
In another embodiment, the noise value is selected from a different distribution. In yet another embodiment, the noise value is predetermined. Then, party P_j obtains a d-dimensional vector ψ_j where, for i=1, . . . , d, ψ_j(i)=ζ(v(j))^{η_j(i)}=ζ(v(j)η_j(i)).
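Sampling the d-dimensional Laplace noise can be sketched as follows; the helper name is ours, and the scale parameter β is kept generic, since the particular value is given elsewhere in the specification. A vector with density proportional to exp(−∥η∥/β) is obtained by drawing its norm from a Gamma(d, β) distribution and its direction uniformly at random:

```python
import math
import random

# Hypothetical helper (name is ours): draws a d-dimensional vector with
# density proportional to exp(-||eta|| / beta).  The norm of such a vector
# follows Gamma(d, beta); the direction is uniform on the unit sphere.
def sample_laplace_vector(d, beta, rng):
    norm = rng.gammavariate(d, beta)                 # ||eta|| ~ Gamma(d, beta)
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]      # isotropic direction
    scale = norm / math.sqrt(sum(x * x for x in g))
    return [x * scale for x in g]

rng = random.Random(0)
eta = sample_laplace_vector(d=10, beta=0.5, rng=rng)
```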
All parties P_j compute a d-dimensional noise vector ψ such that, for i=1, . . . , d, ψ(i)=Π_j ψ_j(i)=Π_j ζ(v(j)η_j(i))=ζ(Σ_j v(j)η_j(i)).
By construction, the above equation selects only the noise value for the smallest database, while rejecting the noise values of all other databases. This is because v has an element with value 1 at the index corresponding to the smallest database and zeroes everywhere else.
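The homomorphic selection ψ(i)=Π_j ζ(v(j)η_j(i))=ζ(Σ_j v(j)η_j(i)) can be sketched with a toy Paillier cryptosystem standing in for ζ. The patent does not prescribe a particular scheme; the tiny primes and the integer noise values below are purely illustrative (real noise values would be fixed-point encoded):

```python
import math
import random

# Toy Paillier cryptosystem standing in for the additively homomorphic
# function zeta: Enc(x)*Enc(y) decrypts to x+y, and Enc(x)^k to k*x.
# Insecure demo parameters; do not reuse.
p, q = 999983, 1000003           # two small known primes
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

rng = random.Random(0)

def enc(m):
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

# Indicator vector v: 1 at the (secret) index of the smallest database.
# The parties only ever see the encryptions enc_v.
K, smallest = 4, 2
enc_v = [enc(1 if j == smallest else 0) for j in range(K)]

# Party j computes Enc(v[j])^eta_j = Enc(v[j]*eta_j); the running product
# over j is Enc(sum_j v[j]*eta_j), an encryption of the selected noise alone.
eta = [17, 23, 31, 47]           # example integer noise values, one per party
psi = 1
for j in range(K):
    psi = psi * pow(enc_v[j], eta[j], n2) % n2

assert dec(psi) == eta[smallest]
```

Because the parties cannot decrypt ζ(•), they learn neither which entry of v is 1 nor whose noise was selected, matching the obliviousness claimed in the text.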
One of the parties, e.g., P_1, generates a d-dimensional random integer vector s to produce the first additive share 335 as ψ(i)·ζ(s(i)) for all i=1, . . . , d, and transmits the first additive share to the processor. Also, the party P_1 stores a second additive share 360 of the noise value, e.g., by computing w_1−Ks, wherein w_1 is the classifier of the party. Additionally or alternatively, the second additive share is stored by multiple databases.
The processor decrypts ψ(i)·ζ(s(i)) to obtain η(i)+s(i) for i=1, . . . , d. Accordingly, the processor stores the first additive share of the noise value associated with the smallest database as K(η+s), the selected party P_1 stores the second additive share of the noise value and the classifier as w_1−Ks, and all other parties P_j, j=2, . . . , K, store their classifiers w_j.
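The resulting bookkeeping can be checked numerically: the processor's share, P_1's share, and the remaining classifiers are additive shares of Kw^s=K(w+η). A sketch with scalars standing in for the d-dimensional vectors and example values throughout:

```python
# Numeric check of the share bookkeeping: the three kinds of stored values
# sum to K * (w + eta) = K * w^s, with w the unperturbed aggregate classifier.
K = 4
w = [0.8, -1.2, 0.4, 2.1]        # local classifiers w_1..w_K (example values)
eta = 0.35                        # noise of the smallest database (example)
s = 7.0                           # random value generated by party P_1

processor_share = K * (eta + s)          # stored by the processor
p1_share = w[0] - K * s                  # stored by party P_1
other_shares = w[1:]                     # w_j for j = 2..K

total = processor_share + p1_share + sum(other_shares)
w_bar = sum(w) / K                        # unperturbed aggregate classifier
assert abs(total - K * (w_bar + eta)) < 1e-9
```

The random value s cancels between the processor's share and P_1's share, so publishing the shares reveals only the perturbed aggregate, not η or s individually.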
Obliviously Combining the Classifiers and the First and Second Additive Shares
In various embodiments, the processor and the K database-owning parties execute a secure function evaluation protocol, such that each of the K+1 participants obtains an additive share of the differentially private aggregate classifier Kw^s. In some embodiments, the additive shares are generated using a computationally secure protocol. In other embodiments, the additive shares are generated using an unconditionally secure protocol. The resulting K+1 shares form the differentially private aggregate classifier and are published, e.g., stored in the memory 175.
The embodiments of the invention determine a differentially private aggregate classifier to ensure differential privacy of multiple databases. The invention is based on the realization that, to achieve multiparty differential privacy, it is sufficient to select the stochastic component based on the size of the smallest database. Some embodiments further recognize that, because of this realization, the selection of the noise value can be performed securely via secure multiparty computation (SMC).
However, unlike conventional methods, the embodiments do not use SMC to construct the classifier itself. Therefore, our embodiments are significantly less complex than any SMC method that computes the classifier on the combined data.
It is to be understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.
It is sufficient to select the stochastic component based on the size of the smallest database. Below, a theoretical proof is given that this is indeed the case.
We show that the perturbed aggregate classifier in this invention satisfies differential privacy. We use the following bound on the sensitivity of the regularized regression classifier:
Theorem 1 Given a set of n data instances lying in a ball of radius 1, the sensitivity of the regularized logistic regression function is at most 2/(nλ). If w_1 and w_2 are functions (classifiers) trained on adjacent databases of size n with regularization parameter λ, then ∥w_1−w_2∥ ≤ 2/(nλ).
This bound has been proved by Kamalika Chaudhuri and Claire Monteleoni in Privacy-preserving logistic regression, Neural Information Processing Systems, pages 289–296, 2008, and is incorporated herein by reference. To show that the perturbed function or classifier in our invention satisfies differential privacy, we proceed as follows:
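The sensitivity bound of Theorem 1 can be checked empirically; a sketch (not part of the patent) that trains l_2-regularized logistic regression by plain gradient descent on two adjacent databases and compares the minimizers, with all data and constants illustrative:

```python
import math
import random

# Empirical check of the sensitivity bound ||w1 - w2|| <= 2/(n*lambda):
# train regularized logistic regression on two databases differing in one
# instance and compare the resulting minimizers.
def train(data, lam, iters=4000, lr=0.5):
    # Minimizes (1/n) sum log(1 + exp(-y w.x)) + (lam/2)||w||^2 in 2-D.
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(iters):
        g = [lam * w[0], lam * w[1]]
        for (x1, x2), y in data:
            m = y * (w[0] * x1 + w[1] * x2)
            c = -y / (1.0 + math.exp(m)) / n     # d/dw of log(1 + e^{-y w.x})
            g[0] += c * x1
            g[1] += c * x2
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w

rng = random.Random(0)
n, lam = 100, 1.0

def point():
    a = rng.uniform(0.0, 2.0 * math.pi)
    rad = rng.random()
    return (rad * math.cos(a), rad * math.sin(a))   # inside the unit ball

data = [(point(), rng.choice([-1, 1])) for _ in range(n)]
adjacent = data[:-1] + [(point(), rng.choice([-1, 1]))]  # one instance changed

w1, w2 = train(data, lam), train(adjacent, lam)
gap = math.dist(w1, w2)
assert gap <= 2.0 / (n * lam) + 1e-6    # Theorem 1's bound holds
```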
Theorem 2 The classifier w^s preserves ε-differential privacy. For any two adjacent databases D and D′, exp(−ε) ≤ P(w^s|D)/P(w^s|D′) ≤ exp(ε).
Proof Consider the case where one instance of the training database D is changed, resulting in an adjacent database D′. This implies a change in one element of the training database of one party and thereby a change in the corresponding learned vector. Assuming that the change is in the database of the party P_j, the change in the learned vectors is only in w_j; let the new classifier be denoted by w_j′. By Theorem 1, the sensitivity of w_j is bounded as ∥w_j−w_j′∥ ≤ 2/(n_j λ).
Considering that we obtain the same perturbed classifier w^s using either of the training databases D and D′, we have P(w^s|D)/P(w^s|D′) ≤ exp(ε) by the bound on the function sensitivity. Similarly, we can lower bound the ratio by exp(−ε).
As expected, adding a perturbation noise term introduces an error in the function evaluation. For functions used as classifiers, this error is called the excess error or excess risk. It is the price paid for differential privacy. For a differentially private classifier to be useful, the excess risk must be small. In other words, it is desirable that adding noise does not degrade the classification performance by too much.
In the following discussion, we consider how much excess error is introduced when using the perturbed aggregate classifier w^s of this invention (satisfying differential privacy) as opposed to the non-private unperturbed classifier w^x trained on the entire training data. We also consider how much excess error is introduced with respect to the (non-private) unperturbed aggregate classifier w.
We first establish a bound on the l_2 norm of the difference between the aggregate classifier w and the classifier trained over the entire training data. To prove the bound, we apply the following Lemma.
Lemma 1 Let G(w) and g(w) be two differentiable, convex functions of w. If w_1=arg min_w G(w) and w_2=arg min_w G(w)+g(w), then
∥w_1−w_2∥ ≤ g_1/G_2,
where g_1=max_w ∥∇g(w)∥ and G_2=min_v min_w v^T ∇²G(w) v for any unit vector v ∈ R^d.
Lemma 1 is obtained from Kamalika Chaudhuri and Claire Monteleoni, Privacy-preserving logistic regression, Neural Information Processing Systems, pages 289–296, 2008, incorporated herein by reference. First, consider the following theorem bounding the excess risk between the non-private unperturbed aggregate classifier w and the non-private classifier w^x trained on the entire database.
Theorem 3 Given the aggregate classifier w, the classifier w^x trained over the entire training data, and n_(1), the size of the smallest training database,
We formulate the problem of estimating the individual classifiers w_{j }and the classifier w^{x }trained over the entire training data in terms of minimizing the two differentiable and convex functions g(w) and G(w).
Substituting the bounds on g_1 and G_2 into Lemma 1,
Applying triangle inequality,
The bound is inversely proportional to the number of instances in the smallest database. This indicates that when the databases are of disparate sizes, w will differ substantially from w^x. The largest possible value of n_(1) is n/K, in which case all parties have an equal amount of training data and w will be closest to w^x. In the one-party case, K=1, the bound indicates that the norm of the difference is upper bounded by zero, which is a valid sanity check, as the aggregate classifier w is then the same as w^x.
We use this result to establish a bound on the empirical risk of the perturbed aggregate classifier w^s=w+η relative to the empirical risk of the unperturbed classifier w^x in the following theorem.
Theorem 4 If all data instances x_i lie in a unit ball, then with probability at least 1−δ, the empirical regularized excess risk of the perturbed aggregate classifier w^s over the classifier w^x trained over the entire training data is
We use the Taylor series expansion of the function J to obtain
J(w^s)=J(w^x)+(w^s−w^x)^T ∇J(w^x)+½(w^s−w^x)^T ∇²J(w)(w^s−w^x)
for some w ∈ R^d. By definition, ∇J(w^x)=0.
Taking the l_2 norm of both sides and applying the Cauchy-Schwarz inequality,
J(w^s)−J(w^x) ≤ ½∥w^s−w^x∥² ∥∇²J(w)∥. (8)
The second gradient of the regularized loss function for logistic regression is
Since the logistic function term is always less than one and all x_i lie in a unit ball, ∥∇²J(w)∥ ≤ λ+1. Substituting this into Equation 8 and using the fact that J(w^x) ≤ J(w) for all w ∈ R^d,
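The Hessian behind this bound can be written out; a sketch of the standard computation for the regularized logistic loss, with σ denoting the logistic function (our notation, consistent with the text):

```latex
% Each weight sigma(1-sigma) is less than one and each ||x_i|| is at most
% one, which gives the spectral-norm bound lambda + 1 used above.
\nabla^2 J(w) \;=\; \lambda I \;+\; \frac{1}{n}\sum_{i=1}^{n}
   \sigma(w^T x_i)\bigl(1-\sigma(w^T x_i)\bigr)\, x_i x_i^T,
\qquad
\|\nabla^2 J(w)\| \;\le\; \lambda
   + \frac{1}{n}\sum_{i=1}^{n}\|x_i\|^2
   \;\le\; \lambda + 1.
```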
The classifier w^s is the perturbed aggregate classifier, i.e., w^s=w+η, with the noise term
To bound ∥η∥ with probability at least 1−δ, we apply the following Lemma from Kamalika Chaudhuri and Claire Monteleoni, Privacy-preserving logistic regression, Neural Information Processing Systems, pages 289–296, 2008, incorporated herein by reference.
Lemma 2 Given a d-dimensional random variable η ~ Lap(β), i.e., with density h(η) ∝ exp(−∥η∥/β), with probability at least 1−δ, the l_2 norm of the random variable is bounded as ∥η∥ ≤ d β log(d/δ).
Substituting this into Equation 9, we have
Using the CauchySchwarz inequality on the last term,
The bound suggests an error because of two factors: aggregation and perturbation. The bound increases for smaller values of ε implying a tighter definition of differential privacy, indicating a clear tradeoff between privacy and utility. The bound is also inversely proportional to n_{(1)}^{2 }implying an increase in excess risk when the parties have training databases of disparate sizes.
In the limiting case ε→∞, we add a perturbation term η sampled from a Laplace distribution of infinitesimally small variance, resulting in the perturbed classifier being almost the same as the unperturbed aggregate classifier w and satisfying a very loose definition of differential privacy. With such a value of ε, our bound becomes
Similar to the analysis of Theorem 3, the excess error of the aggregate classifier is inversely proportional to the size of the smallest database n_(1), and in the one-party case, K=1, the bound becomes zero, as the aggregate classifier w is the same as w^x.
While the previous theorem gives us a bound on the empirical excess risk over a given training database, it is important to consider a bound on the true excess risk of w^s over w^x. Let us denote the true risk of the classifier w^s by J̃(w^s)=E[J(w^s)] and, similarly, the true risk of the classifier w^x by J̃(w^x)=E[J(w^x)].
Theorem 5 If all training data instances x_i lie in a unit ball, then with probability at least 1−δ, the true excess risk of the perturbed aggregate classifier w^s over the classifier w^x trained over the entire training data is
Let w^r be the classifier minimizing J̃(w). By rearranging the terms,
J̃(w^s)=J̃(w^x)+[J̃(w^s)−J̃(w^r)]+[J̃(w^r)−J̃(w^x)] ≤ J̃(w^x)+[J̃(w^s)−J̃(w^r)],
since J̃(w^r) ≤ J̃(w^x) by the definition of w^r.
To proceed, we first need to bound the true excess risk of any classifier in terms of the regularized empirical risk of that classifier and of the classifier minimizing the regularized empirical risk. With probability at least 1−δ,
Substituting the bound from Theorem 4,
Substituting this bound into Equation 10 gives us a bound on the true excess risk of the classifier w^s over the classifier w^x.