System and Method for Identifying Locations Within Data
First Claim
1. A computer implemented method of marking data for processing, the method comprising:
determining a rolling summary that identifies a particular pattern of stored data included in each respective region of a plurality of overlapping regions;
comparing at least one proper subset of the rolling summary to a predetermined value; and
recording a location identifier that identifies a location within the data where the at least one proper subset equals the predetermined value.
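The three steps of the claim can be illustrated with a minimal sketch. It assumes a polynomial rolling hash as the "rolling summary", the low bits of that hash as the "proper subset", and zero as the "predetermined value". The claim itself does not name a hash function, window size, or bit mask, so `find_markers`, `WINDOW`, `MASK`, `TARGET`, `BASE`, and `MOD` are all hypothetical names and values chosen for illustration only.

```python
from typing import List

# All constants below are illustrative assumptions, not values from the patent.
WINDOW = 16            # size in bytes of each overlapping region
MASK = (1 << 6) - 1    # selects a proper subset (low 6 bits) of the summary
TARGET = 0             # predetermined value the subset is compared against
BASE = 257             # polynomial hash base
MOD = (1 << 31) - 1    # polynomial hash modulus


def find_markers(data: bytes, window: int = WINDOW,
                 mask: int = MASK, target: int = TARGET) -> List[int]:
    """Return offsets (one past the end of a region) where the masked
    rolling summary of that region equals the target value."""
    if len(data) < window:
        return []

    # Weight of the oldest byte in the window, used to remove it when sliding.
    top = pow(BASE, window - 1, MOD)

    # Summary of the first region, computed directly.
    h = 0
    for b in data[:window]:
        h = (h * BASE + b) % MOD

    markers: List[int] = []
    if (h & mask) == target:
        markers.append(window)

    # Slide by one byte: consecutive regions overlap in window - 1 bytes,
    # so the summary is updated in constant time instead of recomputed.
    for i in range(window, len(data)):
        h = ((h - data[i - window] * top) * BASE + data[i]) % MOD
        if (h & mask) == target:
            markers.append(i + 1)  # record the location identifier
    return markers
```

Because each summary depends only on the bytes inside its own region, the same byte pattern produces a marker at the same relative position wherever it appears in the data. Under these assumptions, that is why inserting bytes earlier in a stream shifts the recorded locations without changing which patterns are marked.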
Abstract
Described are computer-based methods and apparatuses, including computer program products, for removing redundant data from a storage system. In one example, a data delineation process delineates data targeted for de-duplication into regions using a plurality of markers. The de-duplication system determines which of these regions should be subject to further de-duplication processing by comparing metadata representing the regions to metadata representing regions of a reference data set. The de-duplication system identifies an area of data that incorporates the regions that should be subject to further de-duplication processing and de-duplicates this area with reference to a corresponding area within the reference data set.
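As a rough sketch of the region comparison described in the abstract, the following assumes that the "metadata representing the regions" is a SHA-256 digest of each region and that marker offsets have already been computed for both data sets. The abstract does not specify either choice. The names `split_regions`, `region_digests`, and `candidate_areas` are hypothetical.

```python
import hashlib
from typing import List, Tuple


def split_regions(data: bytes, markers: List[int]) -> List[Tuple[int, int]]:
    """Delineate data into (start, end) regions using marker offsets."""
    inner = [m for m in sorted(set(markers)) if 0 < m < len(data)]
    bounds = [0] + inner + [len(data)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if a < b]


def region_digests(data: bytes, markers: List[int]) -> List[str]:
    """Metadata for each region; a SHA-256 digest is an assumed choice."""
    return [hashlib.sha256(data[a:b]).hexdigest()
            for a, b in split_regions(data, markers)]


def candidate_areas(target: bytes, target_markers: List[int],
                    reference: bytes, reference_markers: List[int]
                    ) -> List[Tuple[int, int]]:
    """Return areas of the target made of adjacent regions whose metadata
    also appears in the reference data set."""
    known = set(region_digests(reference, reference_markers))
    areas: List[Tuple[int, int]] = []
    for a, b in split_regions(target, target_markers):
        if hashlib.sha256(target[a:b]).hexdigest() in known:
            if areas and areas[-1][1] == a:
                # Extend the previous area to incorporate this region.
                areas[-1] = (areas[-1][0], b)
            else:
                areas.append((a, b))
    return areas
```

For example, with reference `b"AAAABBBBCCCC"` split at offsets 4 and 8, and target `b"XXAAAABBBBYY"` split at offsets 2, 6, and 10, the two matching regions merge into the single area `(2, 10)`. The abstract's final step, de-duplicating that area against the corresponding area in the reference set, is not shown here.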
20 Claims
1. A computer implemented method of marking data for processing, the method comprising:
determining a rolling summary that identifies a particular pattern of stored data included in each respective region of a plurality of overlapping regions;
comparing at least one proper subset of the rolling summary to a predetermined value; and
recording a location identifier that identifies a location within the data where the at least one proper subset equals the predetermined value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A system configured to mark data for processing, the system comprising:
data storage storing the data, the data including a plurality of overlapping regions; and
a processor coupled to the data storage and configured to:
determine a rolling summary of each respective region of the plurality of overlapping regions based on stored data included in each respective region, the rolling summary identifying a particular pattern of the stored data;
compare at least one proper subset of the rolling summary to a predetermined value; and
record a location identifier that identifies a location within the data where the at least one proper subset equals the predetermined value.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. A non-transitory computer readable medium storing computer readable instructions that, when executed by at least one processor, instruct the at least one processor to perform a method of marking data for processing, the method comprising:
determining a rolling summary that identifies a particular pattern of stored data included in each respective region of a plurality of overlapping regions;
comparing at least one proper subset of the rolling summary to a predetermined value; and
recording a location identifier that identifies a location within the data where the at least one proper subset equals the predetermined value.
Dependent claims: 18, 19, 20.
Specification