DE-DUPLICATION DEPLOYMENT PLANNING
First Claim
1. A method comprising:
dividing an address space of files into multiple containers;
performing a file metadata scan, including obtaining attributes for files in each container;
aggregating the file attributes into characterizations for each attribute dimension, and generating a content feature summary for each container incorporating the characterizations;
measuring a content similarity prediction measurement between containers from the generated content feature summary; and
assigning files from each container to a de-duplication domain based on the computed content similarity prediction measurement.
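The steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify how containers are formed, which attributes are scanned, how similarity is computed, or how domains are chosen, so everything concrete below is an assumption made for illustration: containers are top-level directories, the attribute dimensions are file extension and a size bucket, each characterization is a normalized histogram, similarity is the histogram intersection averaged over dimensions, and domains are formed greedily against a hypothetical threshold of 0.5. The sample file list is invented.

```python
from collections import Counter, defaultdict

# Hypothetical metadata records (path, size in bytes); file contents are never read.
FILES = [
    ("/vol1/a.docx", 20_000),
    ("/vol1/b.docx", 35_000),
    ("/vol1/c.xlsx", 50_000),
    ("/vol2/d.docx", 25_000),
    ("/vol2/e.xlsx", 60_000),
    ("/vol2/f.docx", 30_000),
    ("/vol3/g.mp4", 900_000_000),
    ("/vol3/h.mp4", 700_000_000),
]


def container_of(path):
    # Assumption: a container is the top-level directory of the path.
    return path.split("/")[1]


def attributes(path, size):
    # Assumption: two attribute dimensions, the file extension and a
    # size bucket given by the number of decimal digits in the size.
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    return {"ext": ext, "size_bucket": len(str(size))}


def summarize(files):
    # Metadata scan plus aggregation: one normalized histogram per
    # attribute dimension, collected into a summary per container.
    counts = defaultdict(lambda: defaultdict(Counter))
    for path, size in files:
        for dim, value in attributes(path, size).items():
            counts[container_of(path)][dim][value] += 1
    summaries = {}
    for cont, dims in counts.items():
        summaries[cont] = {
            dim: {v: n / sum(hist.values()) for v, n in hist.items()}
            for dim, hist in dims.items()
        }
    return summaries


def similarity(a, b):
    # Assumption: histogram intersection per dimension, averaged over
    # the shared dimensions, giving a score between 0 and 1.
    dims = a.keys() & b.keys()
    if not dims:
        return 0.0
    total = 0.0
    for dim in dims:
        values = a[dim].keys() | b[dim].keys()
        total += sum(min(a[dim].get(v, 0.0), b[dim].get(v, 0.0)) for v in values)
    return total / len(dims)


def assign_domains(summaries, threshold=0.5):
    # Greedy assignment: a container joins the first domain whose first
    # member is similar enough, otherwise it starts a new domain.
    # Files follow their container into its domain.
    domains = []
    for cont in sorted(summaries):
        for domain in domains:
            if similarity(summaries[cont], summaries[domain[0]]) >= threshold:
                domain.append(cont)
                break
        else:
            domains.append([cont])
    return domains


summaries = summarize(FILES)
domains = assign_domains(summaries)
print(domains)  # [['vol1', 'vol2'], ['vol3']]
```

In this invented data set the two office-document containers have identical summaries and share a domain, while the video container shares no attribute values with them and is placed in its own domain. Only metadata is used, which is the point of the planning step: the similarity score is a prediction of de-duplication benefit, not a measurement of actual duplicate content.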
1 Assignment
0 Petitions
Abstract
Assignment of files to a de-duplication domain. An address space of data files is divided into multiple containers. For each of the containers, a file metadata scan is performed to obtain file system metadata, which is aggregated and summarized in a content feature summary. A content similarity prediction measurement is computed between containers from the generated content feature summaries, and files from each container are assigned to a de-duplication domain based upon the content similarity prediction measurement.
12 Citations
7 Claims
1. A method comprising:
dividing an address space of files into multiple containers;
performing a file metadata scan, including obtaining attributes for files in each container;
aggregating the file attributes into characterizations for each attribute dimension, and generating a content feature summary for each container incorporating the characterizations;
measuring a content similarity prediction measurement between containers from the generated content feature summary; and
assigning files from each container to a de-duplication domain based on the computed content similarity prediction measurement.
Dependent claims: 2, 3, 4, 5, 6, 7
Specification