Efficient publication of sparse data
First Claim
1. A method comprising:
obtaining, at a computer executing a summarization engine, sparse data comprising a plurality of entries, wherein a majority of the plurality of entries comprise zero-valued entries, and wherein a minority of the plurality of entries comprise non-zero valued entries;
modifying, by the computer, one of the non-zero valued entries to obtain a resulting value;
determining, by the computer, that the resulting value satisfies a threshold;
in response to determining that the resulting value satisfies the threshold, adding, by the computer, the resulting value to a data summary;
sampling, by the computer, one of the zero-valued entries;
adding, by the computer, the one of the zero-valued entries to the data summary; and
publishing, by the computer, the data summary, wherein the data summary comprises an anonymized summary of the sparse data.
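The steps of claim 1 can be illustrated with a minimal Python sketch. Several details here are assumptions, since the claim does not fix them: the modification is modeled as additive Laplace noise, "satisfies a threshold" is read as "greater than or equal to", zero-valued entries are drawn uniformly at random, and the names `summarize`, `laplace_noise`, `scale`, and `zero_samples` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # Difference of two exponentials is Laplace(0, scale) distributed.
    # Assumption: the claim only says "modifying"; noise is one possible choice.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def summarize(sparse, size, threshold, zero_samples, scale=1.0, seed=0):
    """Build a summary of sparse data.

    sparse: dict mapping index -> non-zero value (the minority of entries)
    size:   total number of entries, zero-valued ones included
    """
    rng = random.Random(seed)
    summary = {}

    # Modify each non-zero entry; keep the result only if it meets the threshold.
    for idx, value in sparse.items():
        resulting = value + laplace_noise(scale, rng)
        if resulting >= threshold:
            summary[idx] = resulting

    # Sample zero-valued entries by rejection. Because most entries are zero,
    # each draw succeeds with probability above one half.
    target = min(zero_samples, size - len(sparse))
    chosen = set()
    while len(chosen) < target:
        idx = rng.randrange(size)
        if idx not in sparse and idx not in chosen:
            chosen.add(idx)
    for idx in chosen:
        summary[idx] = 0.0

    return summary


data = {3: 10.0, 7: 0.2, 42: 5.0}
published = summarize(data, size=100, threshold=2.0, zero_samples=4, scale=0.01)
print(sorted(published.items()))
```

Storing only the non-zero entries in a dictionary means the zero-valued majority is never materialized; the rejection loop touches only as many zero positions as the summary needs.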
Abstract
The present disclosure is directed to systems, methods, and computer-readable storage media for publishing data. A data summary summarizing the data can be generated and published according to several publishing schemes. In some embodiments, non-zero entries are selected and modified and zero entries are sampled according to one or more distribution functions. The sampled and modified values are added to a data summary, or a sample of the sampled and modified values is added to the data summary. The data summary is published, released, used, or otherwise output. In other embodiments, a priority value is assigned to each value associated with the data, and a number of entries with the highest priority values are selected and added to the data summary.
20 Claims
1. A method comprising:
obtaining, at a computer executing a summarization engine, sparse data comprising a plurality of entries, wherein a majority of the plurality of entries comprise zero-valued entries, and wherein a minority of the plurality of entries comprise non-zero valued entries;
modifying, by the computer, one of the non-zero valued entries to obtain a resulting value;
determining, by the computer, that the resulting value satisfies a threshold;
in response to determining that the resulting value satisfies the threshold, adding, by the computer, the resulting value to a data summary;
sampling, by the computer, one of the zero-valued entries;
adding, by the computer, the one of the zero-valued entries to the data summary; and
publishing, by the computer, the data summary, wherein the data summary comprises an anonymized summary of the sparse data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method comprising:
obtaining, by a computer executing a summarization engine, sparse data comprising a plurality of entries, wherein over half of the plurality of entries comprise zero-valued entries, and wherein less than half of the plurality of entries comprise non-zero valued entries;
modifying, by the computer, one of the non-zero valued entries to obtain a resulting value;
adding, by the computer, the resulting value to a data summary;
sampling, by the computer, one of the zero-valued entries;
adding, by the computer, the one of the zero-valued entries to the data summary; and
publishing, by the computer, the data summary.
Dependent claims: 12, 13, 14, 15, 16
17. A method comprising:
obtaining, by a computer executing a summarization engine, sparse data comprising a plurality of entries, wherein a majority of the plurality of entries comprise zero-valued entries, and wherein a minority of the plurality of entries comprise non-zero valued entries;
assigning, by the computer, a priority value to each of the plurality of entries;
drawing, by the computer, a sample from the plurality of entries, the sample comprising a plurality of sampled entries;
adding, by the computer, the plurality of sampled entries to a data summary; and
publishing, by the computer, the data summary, wherein the data summary comprises an anonymized summary of the sparse data.
Dependent claims: 18, 19, 20
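The priority-based scheme of claim 17 might look roughly like the following Python sketch. The claim does not say how priorities are computed, so this assumes the classic priority-sampling rule (absolute value divided by a uniform random number) and keeps the `k` entries with the highest priorities, as the abstract describes. The function name `priority_summary` and the parameter `k` are hypothetical.

```python
import heapq
import random


def priority_summary(entries, k, seed=0):
    """Return a dict of the k entries with the highest priority values.

    entries: list of values, most of which are expected to be zero
    """
    rng = random.Random(seed)
    prioritized = []
    for idx, value in enumerate(entries):
        # u lies in (0, 1], so the division is always defined.
        u = 1.0 - rng.random()
        # Assumed priority rule: |value| / u. Zero-valued entries get priority 0.
        prioritized.append((abs(value) / u, idx, value))

    # Keep the k entries with the largest priorities.
    top = heapq.nlargest(k, prioritized)
    return {idx: value for _, idx, value in top}


entries = [0, 0, 5.0, 0, 0, 0, 1.0, 0, 0, 0]
print(priority_summary(entries, k=2))
```

Under this assumed rule a zero-valued entry only enters the summary when `k` exceeds the number of non-zero entries; a scheme that first perturbs the values, as in the other claims, would give zero entries a non-zero chance of selection.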
Specification