K-anonymity and L-diversity data anonymization in an in-memory database
First Claim
1. A method comprising:
- receiving an indication to perform data anonymization based on quasi attributes of a data set, wherein the data set includes both the quasi attributes and one or more sensitive attributes;
- recursively performing partitioning of the data set based on one or more of the quasi attributes until both a first anonymization threshold corresponding to the quasi attributes is satisfied, wherein the first anonymization threshold is based on K-anonymity and indicates from how many other records that each record in one of the sub-partitions of the resultant data set is indistinguishable and a second anonymization threshold corresponding to the one or more sensitive attributes is satisfied for each of a plurality of sub-partitions produced as a result of the partitioning, wherein the second anonymization threshold is based on L-diversity and indicates a minimum number of sensitive values that exist in each sub-partition of the resultant data set; and
- providing a resultant data set including a plurality of records of the data set corresponding to the plurality of sub-partitions that satisfy both the first anonymization threshold and the second anonymization threshold.
Abstract
Disclosed herein are system, method, and computer program product embodiments for data anonymization in an in-memory database. An embodiment operates by receiving an indication to perform data anonymization based on quasi attributes of a data set. Partitioning is recursively performed on the data set based on one or more of the quasi attributes until both a first anonymization threshold corresponding to the quasi attributes is satisfied and a second anonymization threshold corresponding to the one or more sensitive attributes is satisfied for each of a plurality of sub-partitions produced as a result of the partitioning. A resultant data set including a plurality of records of the data set corresponding to the plurality of sub-partitions that satisfy both the first anonymization threshold and the second anonymization threshold is provided.
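The recursive partitioning summarized in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes a simple median split over orderable quasi attributes, tried in order of how many distinct values each has. It also treats the K threshold as a minimum partition size and the L threshold as a minimum count of distinct values per sensitive attribute. The names `satisfies`, `partition`, `generalize`, and the sample columns `age`, `zip`, and `disease` are hypothetical and do not appear in the source.

```python
from typing import Dict, List, Sequence

Record = Dict[str, object]


def satisfies(part: List[Record], sensitive: Sequence[str], k: int, l: int) -> bool:
    """Assumed threshold check: at least k records (K-anonymity) and at least
    l distinct values for every sensitive attribute (L-diversity)."""
    if len(part) < k:
        return False
    return all(len({r[s] for r in part}) >= l for s in sensitive)


def partition(records: List[Record], quasi: Sequence[str],
              sensitive: Sequence[str], k: int, l: int) -> List[List[Record]]:
    """Recursively split while both resulting halves still meet both thresholds."""
    # Assumption: try quasi attributes with the most distinct values first.
    by_spread = sorted(quasi, key=lambda a: len({r[a] for r in records}), reverse=True)
    for attr in by_spread:
        values = sorted(r[attr] for r in records)
        median = values[len(values) // 2]
        left = [r for r in records if r[attr] < median]
        right = [r for r in records if r[attr] >= median]
        if satisfies(left, sensitive, k, l) and satisfies(right, sensitive, k, l):
            return (partition(left, quasi, sensitive, k, l)
                    + partition(right, quasi, sensitive, k, l))
    # No allowable split: this sub-partition is final.
    return [records]


def generalize(parts: List[List[Record]], quasi: Sequence[str]) -> List[Record]:
    """Replace each quasi value with the (min, max) range of its sub-partition."""
    out: List[Record] = []
    for part in parts:
        ranges = {a: (min(r[a] for r in part), max(r[a] for r in part)) for a in quasi}
        out.extend({**r, **ranges} for r in part)
    return out


# Hypothetical sample data: two quasi attributes and one sensitive attribute.
records = [
    {"age": 25, "zip": 10001, "disease": "flu"},
    {"age": 27, "zip": 10002, "disease": "cold"},
    {"age": 31, "zip": 10003, "disease": "flu"},
    {"age": 35, "zip": 10004, "disease": "asthma"},
    {"age": 52, "zip": 10005, "disease": "cold"},
    {"age": 58, "zip": 10006, "disease": "flu"},
    {"age": 61, "zip": 10007, "disease": "asthma"},
    {"age": 66, "zip": 10008, "disease": "cold"},
]
parts = partition(records, ["age", "zip"], ["disease"], k=2, l=2)
result = generalize(parts, ["age", "zip"])
```

With `k=2` and `l=2`, the sample data ends up in four sub-partitions of two records each, and every record in `result` carries range values for `age` and `zip` in place of the exact ones. A split is accepted only when both halves pass both checks, so every final sub-partition meets the two thresholds as long as the full input does.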
24 Citations
16 Claims
1. A method comprising:
- receiving an indication to perform data anonymization based on quasi attributes of a data set, wherein the data set includes both the quasi attributes and one or more sensitive attributes;
- recursively performing partitioning of the data set based on one or more of the quasi attributes until both a first anonymization threshold corresponding to the quasi attributes is satisfied, wherein the first anonymization threshold is based on K-anonymity and indicates from how many other records that each record in one of the sub-partitions of the resultant data set is indistinguishable and a second anonymization threshold corresponding to the one or more sensitive attributes is satisfied for each of a plurality of sub-partitions produced as a result of the partitioning, wherein the second anonymization threshold is based on L-diversity and indicates a minimum number of sensitive values that exist in each sub-partition of the resultant data set; and
- providing a resultant data set including a plurality of records of the data set corresponding to the plurality of sub-partitions that satisfy both the first anonymization threshold and the second anonymization threshold.
Dependent claims: 2, 3, 4, 5, 6
7. A system, comprising:
- a memory; and
- at least one processor coupled to the memory and configured to:
- receive an indication to perform data anonymization based on quasi attributes of a data set, wherein the data set includes both the quasi attributes and one or more sensitive attributes;
- recursively partitioning of the data set based on one or more of the quasi attributes until both a first anonymization threshold corresponding to the quasi attributes is satisfied, wherein the first anonymization threshold is based on K-anonymity and indicates from how many other records that each record in one of the sub-partitions of the resultant data set is indistinguishable and a second anonymization threshold corresponding to the one or more sensitive attributes is satisfied for each of a plurality of sub-partitions produced as a result of the partitioning, wherein the second anonymization threshold is based on L-diversity and indicates a minimum number of sensitive values that exist in each sub-partition of the resultant data set; and
- provide a resultant data set including a plurality of records of the data set corresponding to the plurality of sub-partitions that satisfy both the first anonymization threshold and the second anonymization threshold.
Dependent claims: 8, 9, 10, 11
12. A non-transitory computer-readable device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations comprising:
- receiving an indication to perform data anonymization based on quasi attributes of a data set, wherein the data set includes both the quasi attributes and one or more sensitive attributes;
- recursively performing partitioning of the data set based on one or more of the quasi attributes until both a first anonymization threshold corresponding to the quasi attributes is satisfied, wherein the first anonymization threshold is based on K-anonymity and indicates from how many other records that each record in one of the sub-partitions of the resultant data set is indistinguishable and a second anonymization threshold corresponding to the one or more sensitive attributes is satisfied for each of a plurality of sub-partitions produced as a result of the partitioning, wherein the second anonymization threshold is based on L-diversity and indicates a minimum number of sensitive values that exist in each sub-partition of the resultant data set; and
- providing a resultant data set including a plurality of records of the data set corresponding to the plurality of sub-partitions that satisfy both the first anonymization threshold and the second anonymization threshold.
Dependent claims: 13, 14, 15, 16
Specification