Information source agent systems and methods for distributed data storage and management using content signatures
First Claim
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file that is inappropriate for the one or more information source clients of the group of interest;
determining that the file is an outlier file when the count is below the outlier threshold;
determining a respective count of outlier files contained within each of the one or more information source clients;
identifying outlier devices of the one or more information source clients based on the respective count of outlier files for each of the one or more information source clients compared to an outlier device threshold; and
performing an administrative control action in response to determining that the file is an outlier file.
Abstract
Information source agent systems and methods for distributed content storage and management using content signatures that use file identicality properties are provided. A data management system is provided that includes a content engine for managing the storage of file content, a content signature generator that generates a unique content signature for a file processed by the content engine, a content signature comparator that compares content signatures and a content signature repository that stores content signatures. Information source agents are provided that include content signature generators and content signature comparators. Methods are provided for the efficient management of files using content signatures that take advantage of file identicality properties. Content signature application modules and registries exist within information source clients and centralized servers to support the content signature methods.
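As a rough illustration of the components named in the abstract, the following sketch models a content signature generator, a content signature comparator, and a content signature repository. It assumes a SHA-256 digest serves as the content signature; the source does not name a hash function, and `generate_signature`, `signatures_match`, and `SignatureRepository` are hypothetical names chosen for this example only.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict


def generate_signature(content: bytes) -> str:
    """Content signature generator (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()


def signatures_match(sig_a: str, sig_b: str) -> bool:
    """Content signature comparator: equal signatures are treated as identical content."""
    return sig_a == sig_b


class SignatureRepository:
    """Content signature repository: maps each signature to the file names that carry it."""

    def __init__(self) -> None:
        self._names_by_signature = defaultdict(set)

    def register(self, name: str, content: bytes) -> str:
        # Generate the signature and record which file name carries it.
        signature = generate_signature(content)
        self._names_by_signature[signature].add(name)
        return signature

    def names_for(self, signature: str) -> set[str]:
        # .get avoids creating empty entries for unknown signatures.
        return set(self._names_by_signature.get(signature, set()))
```

Under this assumption, two files with the same bytes share one repository entry regardless of their names, which is one way to read the "file identicality properties" the abstract refers to.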
14 Claims
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file that is inappropriate for the one or more information source clients of the group of interest;
determining that the file is an outlier file when the count is below the outlier threshold;
determining a respective count of outlier files contained within each of the one or more information source clients;
identifying outlier devices of the one or more information source clients based on the respective count of outlier files for each of the one or more information source clients compared to an outlier device threshold; and
performing an administrative control action in response to determining that the file is an outlier file.
Dependent claims: 2, 3, 4, 5
6. A system, comprising:
at least one processor; and
a memory operatively coupled to the at least one processor, the at least one processor configured to:
identify a group of interest including one or more information source clients within an enterprise environment;
generate a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determine a count of the one or more information source clients that contain the content signature for the file;
compare the count to an outlier threshold to determine whether the file is an outlier file that is inappropriate for the one or more information source clients of the group of interest;
determine that the file is an outlier file when the count is below the outlier threshold;
determine a respective count of outlier files contained within each of the one or more information source clients;
identify outlier devices of the one or more information source clients based on the respective count of outlier files for each of the one or more information source clients compared to an outlier device threshold; and
perform an administrative control action in response to determining that the file is an outlier file.
Dependent claims: 7, 8, 9, 10
11. A tangible computer-readable device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations comprising:
identifying a group of interest including one or more information source clients within an enterprise environment;
generating a respective content signature summary corresponding to each of the one or more information source clients of the group of interest, wherein each respective content signature summary includes a content signature for a file that appears on the corresponding information source client;
determining a count of the one or more information source clients that contain the content signature for the file;
comparing the count to an outlier threshold to determine whether the file is an outlier file that is inappropriate for the one or more information source clients of the group of interest;
determining that the file is an outlier file when the count is below the outlier threshold;
determining a respective count of outlier files contained within each of the one or more information source clients;
identifying outlier devices of the one or more information source clients based on the respective count of outlier files for each of the one or more information source clients compared to an outlier device threshold; and
performing an administrative control action in response to determining that the file is an outlier file.
Dependent claims: 12, 13, 14
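The counting steps recited in the independent claims can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the content signature summaries are assumed to be already available as a mapping from client identifier to a set of signatures, the function name `find_outliers` and its parameters are hypothetical, and a client is assumed to be an outlier device when its outlier file count exceeds the outlier device threshold (the claims say only that the count is "compared to" that threshold). The administrative control action is left out.

```python
from collections import Counter


def find_outliers(summaries, outlier_threshold, outlier_device_threshold):
    """Flag outlier files and outlier devices within a group of interest.

    summaries: dict mapping client id -> iterable of content signatures
               found on that client (hypothetical input shape).
    """
    # Count how many clients in the group contain each content signature.
    client_counts = Counter()
    for signatures in summaries.values():
        client_counts.update(set(signatures))

    # A file is an outlier when its client count is below the outlier threshold.
    outlier_files = {
        sig for sig, count in client_counts.items() if count < outlier_threshold
    }

    # Count the outlier files contained on each client.
    outliers_per_client = {
        client: len(set(signatures) & outlier_files)
        for client, signatures in summaries.items()
    }

    # Assumption: "exceeds the threshold" marks a client as an outlier device.
    outlier_devices = {
        client
        for client, count in outliers_per_client.items()
        if count > outlier_device_threshold
    }
    return outlier_files, outliers_per_client, outlier_devices


if __name__ == "__main__":
    group = {
        "pc1": {"a", "b"},
        "pc2": {"a", "b"},
        "pc3": {"a", "b", "x", "y"},
    }
    files, per_client, devices = find_outliers(group, 2, 1)
    print(sorted(files), per_client, sorted(devices))
```

In the small example, signatures `x` and `y` appear on only one of three clients, so with an outlier threshold of 2 both are flagged, and `pc3` is the only client whose outlier count exceeds the device threshold of 1.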
Specification