Mechanisms for Privately Sharing Semi-Structured Data
Abstract
Mechanisms are provided for anonymizing data comprising a plurality of graph data sets. The mechanisms receive input data comprising a plurality of graph data sets. Each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets. The mechanisms perform clustering on the graph data sets to generate a plurality of clusters. At least one cluster of the plurality of clusters comprises a plurality of graph data sets. Other clusters in the plurality of clusters comprise one or more graph data sets. The mechanisms also determine, for each cluster in the plurality of clusters, aggregate properties of the cluster. Moreover, the mechanisms generate, for each cluster in the plurality of clusters, pseudo-synthetic data representing the cluster, from the determined aggregate properties of the clusters.
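The four steps named in the abstract (receive, cluster, aggregate, generate) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `Graph` type, the function names, grouping graphs by edge density into clusters of at least `min_size`, using mean node count and mean density as the aggregate properties, and drawing each synthetic graph by including every possible edge independently with probability equal to the aggregate density.

```python
import random
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Graph:
    """One graph data set (hypothetical type): node count plus undirected edges."""
    num_nodes: int
    edges: frozenset  # pairs (u, v) with u < v

    def density(self) -> float:
        possible = self.num_nodes * (self.num_nodes - 1) / 2
        return len(self.edges) / possible if possible else 0.0


def cluster_graphs(graphs, min_size=2):
    """Assumed clustering rule: sort by density, cut into groups of min_size."""
    ordered = sorted(graphs, key=lambda g: g.density())
    clusters = [ordered[i:i + min_size] for i in range(0, len(ordered), min_size)]
    # Fold an undersized trailing group into the previous cluster.
    if len(clusters) > 1 and len(clusters[-1]) < min_size:
        tail = clusters.pop()
        clusters[-1].extend(tail)
    return clusters


def aggregate(cluster):
    """Assumed aggregate properties: mean node count and mean edge density."""
    n = len(cluster)
    return {
        "num_nodes": round(sum(g.num_nodes for g in cluster) / n),
        "density": sum(g.density() for g in cluster) / n,
    }


def synthesize(props, rng):
    """Draw one random graph from the aggregate properties only."""
    n = props["num_nodes"]
    edges = frozenset(
        pair for pair in combinations(range(n), 2)
        if rng.random() < props["density"]
    )
    return Graph(n, edges)


def anonymize(graphs, min_size=2, seed=0):
    """Receive -> cluster -> aggregate -> generate, one synthetic graph per cluster."""
    rng = random.Random(seed)
    return [synthesize(aggregate(c), rng) for c in cluster_graphs(graphs, min_size)]
```

In this sketch the output depends on the input graphs only through the per-cluster aggregates, so an individual graph is not reproduced directly. Whether that gives adequate privacy depends on the cluster sizes and on which aggregates are released, which the sketch does not address.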
20 Claims
1. A method, in a data processing system having a processor, for anonymizing data comprising a plurality of graph data sets, comprising:
receiving, by the processor of the data processing system, input data comprising a plurality of graph data sets, wherein each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets;
performing, by the processor, clustering on the graph data sets to generate a plurality of clusters, wherein at least one cluster of the plurality of clusters comprises a plurality of graph data sets and wherein other clusters in the plurality of clusters comprise one or more graph data sets;
determining, by the processor, for each cluster in the plurality of clusters, an aggregate property of the cluster; and
generating, by the processor, for each cluster in the plurality of clusters, synthetic data representing the cluster, from the determined aggregate properties of the clusters.

11. A computer program product comprising a computer readable storage medium having a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device, causes the computing device to:
receive input data comprising a plurality of graph data sets, wherein each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets;
perform clustering on the graph data sets to generate a plurality of clusters, wherein at least one cluster of the plurality of clusters comprises a plurality of graph data sets and wherein other clusters in the plurality of clusters comprise one or more graph data sets;
determine for each cluster in the plurality of clusters, an aggregate property of the cluster; and
generate for each cluster in the plurality of clusters, synthetic data representing the cluster, from the determined aggregate properties of the clusters.

20. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
receive input data comprising a plurality of graph data sets, wherein each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets;
perform clustering on the graph data sets to generate a plurality of clusters, wherein at least one cluster of the plurality of clusters comprises a plurality of graph data sets and wherein other clusters in the plurality of clusters comprise one or more graph data sets;
determine for each cluster in the plurality of clusters, an aggregate property of the cluster; and
generate for each cluster in the plurality of clusters, synthetic data representing the cluster, from the determined aggregate properties of the clusters.
Specification