Data anonymization based on guessing anonymity
First Claim
1. A method, implemented by a processing device, to perform anonymization of data through an application of noise to the data, the anonymization assessed according to a first anonymization parameter and a second anonymization parameter, the method comprising:
determining, by the processing device, a desired level for the first anonymization parameter, the first anonymization parameter relating to privacy, and the desired level relating to a minimum privacy;
determining, by the processing device, a range of noise parameter values constrained by the desired level for the first anonymization parameter;
determining, by the processing device and from the range of noise parameter values, a value for a noise parameter of a noise model that optimizes the second anonymization parameter, the second anonymization parameter relating to distortion;
applying, by the processing device, noise, generated by the noise model based on the determined value of the noise parameter, to the data to provide noise perturbed data, the noise, when applied to the data, ensuring anonymization performance, according to the first anonymization parameter, that satisfies the desired level; and
providing, by the processing device, the noise perturbed data to another processing device.
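Read as a pipeline, the claim amounts to a constrained optimization: from all noise-parameter values that satisfy a minimum-privacy floor, pick the one that minimizes distortion, then perturb the data with noise from the tuned model. A minimal sketch of that shape (hypothetical names throughout; a single Gaussian noise parameter sigma, squared error as distortion, and a monotone placeholder standing in for the patent's guessing-anonymity privacy measure):

```python
import numpy as np

def privacy(sigma):
    # Placeholder privacy score: stands in for an expected-guessing-anonymity
    # bound, which grows as the noise level grows (monotone in sigma).
    return 1.0 - np.exp(-sigma)

def distortion(sigma):
    # Expected squared error added by zero-mean Gaussian noise.
    return sigma ** 2

def choose_noise_parameter(min_privacy, candidate_sigmas):
    # Steps 1-2: keep only noise parameter values meeting the privacy floor.
    feasible = [s for s in candidate_sigmas if privacy(s) >= min_privacy]
    # Step 3: among the feasible values, minimize distortion.
    return min(feasible, key=distortion)

def anonymize(data, min_privacy, seed=0):
    rng = np.random.default_rng(seed)
    sigma = choose_noise_parameter(min_privacy, np.linspace(0.1, 5.0, 50))
    # Step 4: apply noise drawn from the tuned noise model.
    return data + rng.normal(0.0, sigma, size=data.shape)

records = np.array([[1.0, 2.0], [3.0, 4.0]])
perturbed = anonymize(records, min_privacy=0.9)
```

Because both placeholder functions are monotone in sigma, the optimum sits at the smallest feasible noise level; the patent's actual formulation optimizes over the noise model's parameters against its guessing-anonymity measure rather than these illustrative stand-ins.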
Abstract
Privacy is defined in the context of a guessing game based on the so-called guessing inequality. The privacy of a sanitized record, i.e., guessing anonymity, is defined by the number of guesses an attacker needs to correctly guess an original record used to generate a sanitized record. Using this definition, optimization problems are formulated that optimize a second anonymization parameter (privacy or data distortion) given constraints on a first anonymization parameter (data distortion or privacy, respectively). Optimization is performed across a spectrum of possible values for at least one noise parameter within a noise model. Noise is then generated based on the noise parameter value(s) and applied to the data, which may comprise real and/or categorical data. Prior to anonymization, the data may have identifiers suppressed, whereas outlier data values in the noise perturbed data may be likewise modified to further ensure privacy.
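The "number of guesses" definition above can be made concrete with a small sketch (hypothetical code, not the patent's formulation): rank every original record by how likely it is to have produced the sanitized record, and take the rank at which a best-first attacker reaches the true source. Here likelihood is modeled as closeness in Euclidean distance, an illustrative choice only:

```python
import numpy as np

def guessing_anonymity(sanitized, originals, true_index):
    """Number of guesses a best-first attacker needs to identify the
    original record behind `sanitized` (1 = guessed immediately).

    The attacker guesses originals in order of decreasing likelihood;
    likelihood is modeled as Euclidean closeness for illustration.
    """
    distances = np.linalg.norm(originals - sanitized, axis=1)
    # Position of the true record in the attacker's guessing order.
    order = np.argsort(distances)
    return int(np.where(order == true_index)[0][0]) + 1

# Toy example: three original records, one noise-perturbed copy of record 1.
originals = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
sanitized = np.array([5.2, 4.7])   # record 1 plus a little noise
print(guessing_anonymity(sanitized, originals, true_index=1))  # prints 1
```

With so little noise the perturbed record is still closest to its source, so the attacker succeeds on the first guess; heavier noise pushes the true record further down the guessing order, which is exactly the quantity the optimization trades off against distortion.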
231 Citations
24 Claims
1. A method, implemented by a processing device, to perform anonymization of data through an application of noise to the data, the anonymization assessed according to a first anonymization parameter and a second anonymization parameter, the method comprising:
determining, by the processing device, a desired level for the first anonymization parameter, the first anonymization parameter relating to privacy, and the desired level relating to a minimum privacy;
determining, by the processing device, a range of noise parameter values constrained by the desired level for the first anonymization parameter;
determining, by the processing device and from the range of noise parameter values, a value for a noise parameter of a noise model that optimizes the second anonymization parameter, the second anonymization parameter relating to distortion;
applying, by the processing device, noise, generated by the noise model based on the determined value of the noise parameter, to the data to provide noise perturbed data, the noise, when applied to the data, ensuring anonymization performance, according to the first anonymization parameter, that satisfies the desired level; and
providing, by the processing device, the noise perturbed data to another processing device.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. An apparatus for anonymizing data through an application of noise to the data, the anonymization assessed according to a first anonymization parameter and a second anonymization parameter, the apparatus comprising:
a memory to store instructions; and
at least one processor to execute the stored instructions to:
determine a desired level for the first anonymization parameter, the first anonymization parameter relating to privacy, and the desired level relating to a minimum privacy;
determine a range of noise parameter values constrained by the desired level for the first anonymization parameter;
determine, from the range of noise parameter values, a value for a noise parameter of a noise model that optimizes the second anonymization parameter, the second anonymization parameter relating to distortion;
apply noise, generated by the noise model based on the determined value of the noise parameter, to the data to provide noise perturbed data, the noise, when applied to the data, ensuring anonymization performance, according to the first anonymization parameter, that satisfies the desired level; and
provide the noise perturbed data to at least one other processor.
View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
17. A non-transitory computer readable medium storing instructions, the instructions comprising:
one or more instructions which, when executed by at least one processor, cause the at least one processor to:
determine a desired level for a first anonymization parameter, the first anonymization parameter relating to privacy, and the desired level relating to a minimum privacy;
determine a range of noise parameter values constrained by the desired level for the first anonymization parameter;
determine, from the range of noise parameter values, a value for a noise parameter of a noise model that optimizes a second anonymization parameter, the second anonymization parameter relating to distortion;
apply noise, generated by the noise model based on the determined value of the noise parameter, to data to provide noise perturbed data, the noise, when applied to the data, ensuring anonymization performance, according to the first anonymization parameter, that satisfies the desired level; and
provide the noise perturbed data to at least one other processor.
View Dependent Claims (18, 19, 20, 21, 22, 23, 24)
Specification