Information Source Agent Systems and Methods for Distributed Data Storage and Management Using Content Signatures
First Claim
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file; and
determining that the file is an outlier file when the count is below the outlier threshold.
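The counting and threshold steps of the claim can be illustrated with a minimal sketch. It assumes each content signature summary is simply a set of signature strings per client. The function name `find_outlier_files`, the dictionary layout, and the sample client names are hypothetical and are not taken from the patent.

```python
from collections import Counter
from typing import Dict, List, Set


def find_outlier_files(
    summaries: Dict[str, Set[str]], outlier_threshold: int
) -> Dict[str, List[str]]:
    """Return outlier signatures mapped to the clients that hold them.

    `summaries` maps a client id to its content signature summary,
    modeled here (an assumption) as a set of signature strings.
    """
    # Count how many clients in the group contain each signature.
    # Because each summary is a set, a client is counted at most once
    # per signature.
    counts: Counter = Counter()
    for signatures in summaries.values():
        counts.update(signatures)

    # A file is treated as an outlier when its count is below the threshold.
    outliers: Dict[str, List[str]] = {}
    for signature, count in counts.items():
        if count < outlier_threshold:
            outliers[signature] = sorted(
                client for client, sigs in summaries.items() if signature in sigs
            )
    return outliers


# Hypothetical group of interest with three clients.
group = {
    "client-a": {"sig-1", "sig-2"},
    "client-b": {"sig-1"},
    "client-c": {"sig-1", "sig-3"},
}
outliers = find_outlier_files(group, outlier_threshold=2)
```

In this sample, `sig-1` appears on all three clients and is not flagged, while `sig-2` and `sig-3` each appear on a single client and fall below a threshold of 2.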
Abstract
Information source agent systems and methods for distributed content storage and management using content signatures that use file identicality properties are provided. A data management system is provided that includes a content engine for managing the storage of file content, a content signature generator that generates a unique content signature for a file processed by the content engine, a content signature comparator that compares content signatures and a content signature repository that stores content signatures. Information source agents are provided that include content signature generators and content signature comparators. Methods are provided for the efficient management of files using content signatures that take advantage of file identicality properties. Content signature application modules and registries exist within information source clients and centralized servers to support the content signature methods.
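As a rough illustration of how the generator, comparator, and repository described in the abstract could fit together, the sketch below uses a SHA-256 digest as the content signature. The abstract does not name a hashing scheme, so SHA-256 is an assumption, as are all function and class names. A digest is unique only with overwhelming probability, not strictly.

```python
import hashlib
from typing import Dict, List, Set


def generate_content_signature(content: bytes) -> str:
    """Generate a signature from file content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def signatures_match(first: str, second: str) -> bool:
    """Comparator: identical content yields identical signatures."""
    return first == second


class ContentSignatureRepository:
    """Hypothetical in-memory repository mapping signatures to locations."""

    def __init__(self) -> None:
        self._locations: Dict[str, Set[str]] = {}

    def register(self, signature: str, location: str) -> bool:
        """Record a location; return True only if the signature was new."""
        is_new = signature not in self._locations
        self._locations.setdefault(signature, set()).add(location)
        return is_new

    def locations(self, signature: str) -> List[str]:
        """Return every known location of the content, sorted."""
        return sorted(self._locations.get(signature, set()))


# Two clients hold byte-identical files under different paths.
repo = ContentSignatureRepository()
sig_a = generate_content_signature(b"quarterly report")
sig_b = generate_content_signature(b"quarterly report")
first_seen = repo.register(sig_a, "client-a:/docs/report.txt")
second_seen = repo.register(sig_b, "client-b:/tmp/copy.txt")
```

Because both files have the same bytes, the second registration reports the signature as already known. A system could use that result to avoid storing the content twice.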
20 Claims
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file; and
determining that the file is an outlier file when the count is below the outlier threshold.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system, comprising:
at least one processor; and
a memory operatively coupled to the at least one processor, the at least one processor configured to:
identify a group of interest including one or more information source clients within an enterprise environment;
generate a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determine a count of the one or more information source clients that contain the content signature for the file;
compare the count to an outlier threshold to determine whether the file is an outlier file; and
determine that the file is an outlier file when the count is below the outlier threshold.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A tangible computer-readable device having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file; and
determining that the file is an outlier file when the count is below the outlier threshold.
Dependent claims: 16, 17, 18, 19, 20
Specification