METHOD AND APPARATUS FOR HARVESTING FILE SYSTEM METADATA
Abstract
A harvester is disclosed for harvesting metadata of managed objects (files and directories) across file systems which are generally not interoperable in an enterprise environment. Harvested metadata may include 1) file system attributes such as size, owner, recency; 2) content-specific attributes such as the presence or absence of various keywords (or combinations of keywords) within documents as well as concepts comprised of natural language entities; 3) synthetic attributes such as mathematical checksums or hashes of file contents; and 4) high-level semantic attributes that serve to classify and categorize files and documents. The classification itself can trigger an action in compliance with a policy rule. Harvested metadata are stored in a metadata repository to facilitate the automated or semi-automated application of policies.
20 Claims
1. A computer program product comprising one or more computer readable storage media storing instructions translatable by one or more processors to perform:
accessing network file systems at one or more physical locations;
collecting file system metadata from the network file systems, wherein the file system metadata comprises one or more pieces of metadata of interest;
applying one or more user-defined heuristics to the one or more pieces of metadata of interest to generate one or more file system statistics of interest;
associating the one or more pieces of metadata of interest and the one or more file system statistics of interest with an identifier for a harvest; and
storing the harvest in a database.
Dependent claims: 2
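As an illustration only, and not part of the claim language, the steps of claim 1 can be sketched in Python. Everything named here is an assumption: local directories stand in for mounted network file systems, the `HEURISTICS` table is a hypothetical set of user-defined heuristics, and SQLite stands in for the database. The sketch collects metadata, derives statistics, tags both with one harvest identifier, and stores them.

```python
import os
import sqlite3
import uuid


def collect_metadata(root):
    """Walk one mounted file system and return the metadata of interest per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            records.append({"path": path, "size": st.st_size, "mtime": st.st_mtime})
    return records


# Hypothetical user-defined heuristics: statistic name -> function over the records.
HEURISTICS = {
    "file_count": lambda recs: len(recs),
    "total_bytes": lambda recs: sum(r["size"] for r in recs),
    "large_files": lambda recs: sum(1 for r in recs if r["size"] > 1024),
}


def run_harvest(roots, db_path):
    """Collect metadata, derive statistics, and store both under one harvest id."""
    harvest_id = str(uuid.uuid4())
    records = [r for root in roots for r in collect_metadata(root)]
    stats = {name: fn(records) for name, fn in HEURISTICS.items()}

    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS metadata "
            "(harvest_id TEXT, path TEXT, size INTEGER, mtime REAL)"
        )
        conn.execute(
            "CREATE TABLE IF NOT EXISTS statistics "
            "(harvest_id TEXT, name TEXT, value REAL)"
        )
        conn.executemany(
            "INSERT INTO metadata VALUES (?, ?, ?, ?)",
            [(harvest_id, r["path"], r["size"], r["mtime"]) for r in records],
        )
        conn.executemany(
            "INSERT INTO statistics VALUES (?, ?, ?)",
            [(harvest_id, name, value) for name, value in stats.items()],
        )
    conn.close()
    return harvest_id, stats
```

Keying both tables on `harvest_id` is one possible way to associate the metadata and the statistics with a single harvest, so that successive harvests of the same file systems remain distinguishable.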
3. A computer program product comprising one or more computer readable storage media storing instructions translatable by one or more processors to perform:
accessing network file systems at one or more physical locations;
collecting file system metadata from the network file systems;
transforming the collected file system metadata into metadata records having a common representation;
utilizing the collected file system metadata in the metadata records to generate synthetic metadata; and
storing the metadata records with the synthetic metadata in a metadata repository.
Dependent claims: 4, 5, 6, 7, 8, 9
10. A method for harvesting file system metadata, comprising:
accessing network file systems at one or more physical locations;
collecting file system metadata from the network file systems;
transforming the collected file system metadata into metadata records having a common representation;
utilizing the collected file system metadata in the metadata records to generate synthetic metadata; and
storing the metadata records with the synthetic metadata in a metadata repository.
Dependent claims: 11, 12, 13, 14, 15
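As an illustration only, and not part of the claim language, the steps shared by claims 3 and 10 might look like the following Python sketch. The `MetadataRecord` class, its field names, the `size_bucket` attribute, and the in-memory dictionary used as the metadata repository are all hypothetical. The content hash follows the abstract's mention of checksums or hashes as synthetic attributes.

```python
import hashlib
import os
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical common representation shared by all source file systems."""
    path: str
    size_bytes: int
    modified_epoch: float
    synthetic: dict = field(default_factory=dict)


def to_record(path):
    """Transform native stat() output into the common representation."""
    st = os.stat(path)
    return MetadataRecord(path=path, size_bytes=st.st_size, modified_epoch=st.st_mtime)


def add_synthetic(record):
    """Derive synthetic metadata: a SHA-256 content hash and a size bucket."""
    digest = hashlib.sha256()
    with open(record.path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    record.synthetic["sha256"] = digest.hexdigest()
    record.synthetic["size_bucket"] = "small" if record.size_bytes < 1024 else "large"
    return record


def harvest(paths, repository):
    """Store each enriched record in the repository, keyed by path."""
    for path in paths:
        record = add_synthetic(to_record(path))
        repository[record.path] = record
    return repository
```

Because every source is mapped to the same record type before enrichment, the synthetic-metadata step does not need to know which file system a record came from.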
16. A system for harvesting file system metadata, comprising:
a grazer module executing on the system and capable of:
    accessing network file systems at one or more physical locations;
    collecting file system metadata from the network file systems;
    transforming the collected file system metadata into metadata records having a common representation; and
    placing the metadata records in a first queue;
an improver module executing on the system and capable of:
    reading the metadata records from the first queue;
    generating synthetic metadata from the file system metadata in the metadata records; and
    placing the metadata records with the synthetic metadata in a second queue; and
a populator module executing on the system and capable of:
    reading the metadata records from the second queue; and
    storing the metadata records with the synthetic metadata in a metadata repository.
Dependent claims: 17, 18, 19, 20
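As an illustration only, and not part of the claim language, the grazer, improver, and populator arrangement of claim 16 can be sketched as three threads joined by two queues. The function names mirror the claim, but the dictionary records, the `_DONE` sentinel, the SHA-256 hash used as synthetic metadata, and the in-memory repository are assumptions made for this sketch.

```python
import hashlib
import os
import queue
import threading

_DONE = object()  # sentinel marking the end of a stream of records


def grazer(paths, first_queue):
    """Collect metadata, put it in a common dict form, and enqueue it."""
    for path in paths:
        st = os.stat(path)
        first_queue.put({"path": path, "size": st.st_size, "mtime": st.st_mtime})
    first_queue.put(_DONE)


def improver(first_queue, second_queue):
    """Read records from the first queue, add synthetic metadata, pass them on."""
    while True:
        record = first_queue.get()
        if record is _DONE:
            second_queue.put(_DONE)
            return
        with open(record["path"], "rb") as fh:
            record["sha256"] = hashlib.sha256(fh.read()).hexdigest()
        second_queue.put(record)


def populator(second_queue, repository):
    """Read enriched records from the second queue and store them."""
    while True:
        record = second_queue.get()
        if record is _DONE:
            return
        repository[record["path"]] = record


def run_pipeline(paths):
    first_queue, second_queue, repository = queue.Queue(), queue.Queue(), {}
    stages = [
        threading.Thread(target=grazer, args=(paths, first_queue)),
        threading.Thread(target=improver, args=(first_queue, second_queue)),
        threading.Thread(target=populator, args=(second_queue, repository)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return repository
```

The queues decouple the stages, so a slow step such as hashing file contents does not block the file system walk, and each stage could in principle be scaled or replaced independently.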
Specification