SELECTING FILES FOR COMPACTION
Abstract
Methods, systems, and apparatus for identifying two or more files, each of which include multiple entries, determining a respective size of each of the files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries, determining a combined size of the files, where the combined size of the files is an arithmetic sum of the respective sizes of the files, estimating a compacted size of the files, where the estimated compacted size of the files is an estimate of how many distinct entries exist in the files that are not garbage entries, selecting the two or more files for compaction, based at least on a comparison of the combined size of the files to the estimated compacted size of the files, and compacting the two or more selected files.
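As an illustration only, the selection test described in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: each file is modeled as an in-memory dictionary, a value of `None` stands in for a garbage (deleted) entry, exact counts are used in place of estimates, and the names `live_size`, `merge`, `should_compact`, `compact`, and the threshold `max_ratio` are hypothetical.

```python
from typing import Dict, List, Optional

# Assumption: a "file" is a mapping key -> value, and None marks a garbage entry.
File = Dict[str, Optional[str]]


def live_size(f: File) -> int:
    """Size of one file: the number of distinct entries that are not garbage."""
    return sum(1 for value in f.values() if value is not None)


def merge(files: List[File]) -> File:
    """Merge files ordered oldest to newest; newer entries shadow older ones."""
    merged: File = {}
    for f in files:
        merged.update(f)
    return merged


def should_compact(files: List[File], max_ratio: float = 0.8) -> bool:
    """Compare the combined size (arithmetic sum) to the compacted size.

    max_ratio is a hypothetical tuning knob: compaction is selected when the
    merged result would hold at most that fraction of the combined size.
    """
    combined = sum(live_size(f) for f in files)
    compacted = live_size(merge(files))
    return combined > 0 and compacted <= max_ratio * combined


def compact(files: List[File]) -> File:
    """Produce one file holding only the surviving, non-garbage entries."""
    return {k: v for k, v in merge(files).items() if v is not None}


older = {"a": "1", "b": "2", "c": "3"}
newer = {"a": "9", "b": None, "d": "4"}  # overwrites "a", deletes "b"

if should_compact([older, newer]):
    result = compact([older, newer])
```

In this example the combined size is 3 + 2 = 5 while the compacted size is 3, so the files overlap enough to be selected. Two files with disjoint keys would have equal combined and compacted sizes, and merging them would reclaim nothing.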
20 Claims
1. A computer-implemented method comprising:
identifying two or more files, each of which include multiple entries;
determining a respective size of each of the two or more files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries;
determining a combined size of the two or more files, where the combined size of the two or more files is an arithmetic sum of the respective sizes of the two or more files;
estimating a compacted size of the two or more files, where the estimated compacted size of the two or more files is an estimate of how many distinct entries exist in the two or more files that are not garbage entries;
selecting the two or more files for compaction, based at least on a comparison of the combined size of the two or more files to the estimated compacted size of the two or more files; and
compacting the two or more selected files.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
a plurality of computers; and
a non-transitory storage device storing instructions operable to cause the computers to perform operations comprising:
identifying two or more files, each of which include multiple entries;
determining a respective size of each of the two or more files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries;
determining a combined size of the two or more files, where the combined size of the two or more files is an arithmetic sum of the respective sizes of the two or more files;
estimating a compacted size of the two or more files, where the estimated compacted size of the two or more files is an estimate of how many distinct entries exist in the two or more files that are not garbage entries;
selecting the two or more files for compaction, based at least on a comparison of the combined size of the two or more files to the estimated compacted size of the two or more files; and
compacting the two or more selected files.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory storage device storing instructions operable to cause one or more computers to perform operations comprising:
identifying two or more files, each of which include multiple entries;
determining a respective size of each of the two or more files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries;
determining a combined size of the two or more files, where the combined size of the two or more files is an arithmetic sum of the respective sizes of the two or more files;
estimating a compacted size of the two or more files, where the estimated compacted size of the two or more files is an estimate of how many distinct entries exist in the two or more files that are not garbage entries;
selecting the two or more files for compaction, based at least on a comparison of the combined size of the two or more files to the estimated compacted size of the two or more files; and
compacting the two or more selected files.
Dependent claims: 16, 17, 18, 19, 20
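The claims recite sizes that are estimates of distinct non-garbage entries, but the text above does not say how such estimates are produced. One possible approach, offered purely as an assumption, is a k-minimum-values (KMV) sketch: each file keeps the k smallest hash values of its live keys, and sketches from several files can be merged to estimate the compacted size without reading the files in full. The names `sketch`, `merge_sketches`, `estimate`, and the constant `K` are hypothetical.

```python
import hashlib
from typing import Iterable, List

K = 256  # hypothetical sketch size; larger values give tighter estimates


def _unit_hash(key: str) -> float:
    """Map a key to a pseudo-uniform value in [0, 1) using SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def sketch(live_keys: Iterable[str], k: int = K) -> List[float]:
    """Keep the k smallest distinct hash values of a file's non-garbage keys."""
    return sorted({_unit_hash(key) for key in live_keys})[:k]


def merge_sketches(a: List[float], b: List[float], k: int = K) -> List[float]:
    """Sketch of the union of two key sets: the k smallest values overall."""
    return sorted(set(a) | set(b))[:k]


def estimate(s: List[float], k: int = K) -> float:
    """Estimate the distinct count summarized by a sketch.

    With fewer than k values the sketch holds every hash, so the count is
    exact; otherwise use the standard KMV estimator (k - 1) / k-th minimum.
    """
    if len(s) < k:
        return float(len(s))
    return (k - 1) / s[k - 1]


file_a = sketch(["a", "b", "c"])
file_b = sketch(["b", "c", "d"])

combined_size = estimate(file_a) + estimate(file_b)        # arithmetic sum
compacted_size = estimate(merge_sketches(file_a, file_b))  # distinct across both
```

Because keys shared by both files hash to the same value, the merged sketch counts them once, which is what makes the compacted estimate smaller than the arithmetic sum when the files overlap.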
Specification