DATA ANONYMIZATION BASED ON GUESSING ANONYMITY
Abstract
Privacy is defined in the context of a guessing game based on the so-called guessing inequality. The privacy of a sanitized record, i.e., its guessing anonymity, is defined by the number of guesses an attacker needs to correctly guess the original record used to generate the sanitized record. Using this definition, optimization problems are formulated that optimize a second anonymization parameter (privacy or data distortion) given constraints on a first anonymization parameter (data distortion or privacy, respectively). Optimization is performed across a spectrum of possible values for at least one noise parameter within a noise model. Noise is then generated based on the noise parameter value(s) and applied to the data, which may comprise real and/or categorical data. Prior to anonymization, identifiers in the data may be suppressed, and outlier data values in the noise-perturbed data may likewise be modified to further ensure privacy.
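The guessing game described in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional numeric records and an attacker who guesses original records in order of increasing distance to the sanitized value (the most likely order under symmetric additive noise with a uniform prior). The function name `guess_count` and the sample values are hypothetical and do not come from the patent.

```python
def guess_count(originals, sanitized_value, true_index):
    """Number of guesses the attacker needs to hit the true original record.

    Assumption: the attacker ranks candidates by absolute distance to the
    sanitized value and guesses them closest-first.
    """
    order = sorted(
        range(len(originals)),
        key=lambda i: abs(originals[i] - sanitized_value),
    )
    # Position of the true record in the attacker's guessing order (1-based).
    return order.index(true_index) + 1


originals = [0.0, 10.0, 20.0]

# Small perturbation of record 1: the attacker's first guess is correct.
print(guess_count(originals, 12.0, 1))

# Larger perturbation of record 1: record 2 is now closer, so the attacker
# needs a second guess.
print(guess_count(originals, 16.0, 1))
```

In this sketch, more noise tends to push the true record further down the attacker's guessing order, which is the sense in which a higher guess count corresponds to more privacy.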
47 Claims
1-11. (canceled)
12. A method comprising:
determining, by a processing device, a noise model to be applied to data;
evaluating, by the processing device, a function based on the noise model to provide a first anonymization parameter;
determining, by the processing device, a particular level for the first anonymization parameter;
optimizing, by the processing device and based on evaluating the first anonymization parameter using the determined particular level, a second anonymization parameter;
determining, by the processing device and using the optimized second anonymization parameter, a parameter value of a parameter of the noise model; and
generating, by the processing device, noise based on the noise model and the determined parameter value.
Dependent claims: 13, 14, 15, 16, 20
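The sequence of steps in claim 12 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the noise model is taken to be zero-mean Gaussian with standard deviation `sigma`, the first anonymization parameter is privacy (estimated by Monte Carlo as the mean number of attacker guesses), the second is distortion (taken to grow with `sigma`), and the optimization is a simple scan over a grid of candidate `sigma` values. All function names are hypothetical.

```python
import random


def guess_count(originals, sanitized_value, true_index):
    # Assumed attacker: guesses originals closest-first to the sanitized value.
    order = sorted(
        range(len(originals)),
        key=lambda i: abs(originals[i] - sanitized_value),
    )
    return order.index(true_index) + 1


def mean_guesses(originals, sigma, trials, rng):
    # Evaluate the privacy function for one noise parameter value by
    # averaging the attacker's guess count over random noise draws.
    total = 0
    for _ in range(trials):
        for i, x in enumerate(originals):
            total += guess_count(originals, x + rng.gauss(0.0, sigma), i)
    return total / (trials * len(originals))


def choose_sigma(originals, privacy_level, sigma_grid, trials=200, seed=0):
    # Scan candidate values in increasing order. Under the assumption that
    # distortion grows with sigma, the first value that meets the privacy
    # level is the one with the least distortion.
    rng = random.Random(seed)
    for sigma in sorted(sigma_grid):
        if mean_guesses(originals, sigma, trials, rng) >= privacy_level:
            return sigma
    return None  # no candidate reaches the requested privacy level


def generate_noise(n, sigma, seed=1):
    # Generate noise from the (assumed Gaussian) noise model.
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


originals = [1.0, 2.0, 3.0, 4.0]
sigma = choose_sigma(originals, privacy_level=1.5, sigma_grid=[0.1, 0.5, 1.0, 2.0])
if sigma is not None:
    noise = generate_noise(len(originals), sigma)
    perturbed = [x + e for x, e in zip(originals, noise)]
    print(sigma, perturbed)
```

The roles of the two parameters could be swapped (a distortion constraint with privacy maximized), as the abstract notes; the grid scan would then keep the largest `sigma` whose distortion stays within the allowed level.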
17-19. (canceled)
21-23. (canceled)
24. An apparatus comprising:
a memory to store instructions; and
a processor to execute the instructions to:
receive a first anonymization parameter;
receive information associated with a desired level for the first anonymization parameter;
optimize, using the desired level for the first anonymization parameter, a second anonymization parameter;
determine, using the optimized second anonymization parameter, a parameter value for a parameter of a noise model;
provide, based on the noise model and the determined parameter value, noise; and
apply the provided noise to data to generate noise perturbed data, the noise, when applied to the data, providing anonymization performance according to the first anonymization parameter and satisfying the desired level.
Dependent claims: 25, 26, 27, 28, 29, 30, 31
32-41. (canceled)
42. A non-transitory computer readable medium storing instructions, the instructions comprising:
one or more instructions which, when executed by a processor, cause the processor to:
determine a noise model to be applied to data;
evaluate a function based on the noise model to provide a first anonymization parameter;
determine a particular level for the first anonymization parameter;
optimize, based on evaluating the first anonymization parameter using the determined particular level, a second anonymization parameter;
determine, using the optimized second anonymization parameter, a parameter value of a parameter of the noise model; and
generate noise based on the noise model and the determined parameter value.
Dependent claims: 43, 44, 45, 46, 47
Specification