Information source agent systems and methods for distributed data storage and management using content signatures
First Claim
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest comprising one or more information source clients within an enterprise environment, wherein the one or more information source clients of the group of interest run the same operating system or share a job function;
generating respective content signature summaries corresponding to the one or more information source clients of the group of interest, wherein the respective content signature summaries include content signatures for files that appear on the corresponding one or more information source clients;
determining a count of the one or more information source clients that contain a content signature for a file of the files that appear on the corresponding one or more information source clients;
characterizing the file as an outlier file that is inappropriate for the one or more information source clients of the group of interest upon determining that the count is below an outlier threshold, wherein the outlier threshold is a predetermined number greater than one;
determining a respective count of outlier files contained within each of the one or more information source clients; and
generating a usage report for each of the one or more information source clients of the group of interest, the usage report including the respective count and identification of outlier files contained within each of the one or more information source clients.
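The steps recited in claim 1 can be illustrated with a minimal sketch. It assumes each content signature summary is simply a set of signature strings keyed by client name. The function name `find_outlier_files`, the report layout, and the sample data are hypothetical and chosen only for illustration; the claim does not prescribe any of them.

```python
from collections import Counter


def find_outlier_files(summaries, outlier_threshold):
    """Build a per-client usage report of outlier files.

    summaries: dict mapping a client name to the set of content
    signatures found on that client (an assumed representation).
    """
    # The claim requires a predetermined threshold greater than one.
    if outlier_threshold <= 1:
        raise ValueError("outlier threshold must be greater than one")

    # Count how many clients in the group hold each signature.
    counts = Counter()
    for signatures in summaries.values():
        counts.update(set(signatures))

    # A file is an outlier when too few clients in the group hold it.
    outliers = {sig for sig, n in counts.items() if n < outlier_threshold}

    # Usage report: outlier count and identification for each client.
    report = {}
    for client, signatures in summaries.items():
        found = sorted(set(signatures) & outliers)
        report[client] = {"outlier_count": len(found), "outlier_files": found}
    return report


# Hypothetical group of interest: three clients sharing an operating system.
summaries = {
    "client-a": {"sig-os", "sig-office", "sig-game"},
    "client-b": {"sig-os", "sig-office"},
    "client-c": {"sig-os", "sig-office"},
}
report = find_outlier_files(summaries, outlier_threshold=2)
```

In this sample, only `client-a` holds `sig-game`, so with a threshold of 2 that signature is the only one characterized as an outlier.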
Abstract
Information source agent systems and methods for distributed content storage and management using content signatures that use file identicality properties are provided. A data management system is provided that includes a content engine for managing the storage of file content, a content signature generator that generates a unique content signature for a file processed by the content engine, a content signature comparator that compares content signatures and a content signature repository that stores content signatures. Information source agents are provided that include content signature generators and content signature comparators. Methods are provided for the efficient management of files using content signatures that take advantage of file identicality properties. Content signature application modules and registries exist within information source clients and centralized servers to support the content signature methods.
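As a rough illustration of how a content signature generator, comparator, and repository might fit together, the sketch below derives a signature from file content and uses it to detect identical files. The use of SHA-256 is an assumption made here for illustration; the abstract does not name a signature algorithm. The class `SignatureRepository` and its methods are likewise hypothetical.

```python
import hashlib


def content_signature(data: bytes) -> str:
    """Generate a signature from file content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def same_content(sig_a: str, sig_b: str) -> bool:
    """Comparator: files with equal signatures are treated as identical."""
    return sig_a == sig_b


class SignatureRepository:
    """Hypothetical in-memory store mapping signatures to file names."""

    def __init__(self):
        self._store = {}

    def add(self, name: str, data: bytes):
        """Record a file; return its signature and whether content is new."""
        sig = content_signature(data)
        is_new = sig not in self._store
        self._store.setdefault(sig, []).append(name)
        return sig, is_new

    def names_for(self, sig: str):
        """List every file name recorded under a signature."""
        return list(self._store.get(sig, []))


repo = SignatureRepository()
sig_first, first_is_new = repo.add("report.txt", b"quarterly numbers")
sig_copy, copy_is_new = repo.add("report-copy.txt", b"quarterly numbers")
```

Because both files have identical content, the second `add` call finds the signature already present, which is the kind of file identicality property a content engine could exploit to avoid storing the same content twice.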
14 Claims
1. A method for identifying information source clients that have unique file distribution characteristics, comprising:
identifying a group of interest comprising one or more information source clients within an enterprise environment, wherein the one or more information source clients of the group of interest run the same operating system or share a job function;
generating respective content signature summaries corresponding to the one or more information source clients of the group of interest, wherein the respective content signature summaries include content signatures for files that appear on the corresponding one or more information source clients;
determining a count of the one or more information source clients that contain a content signature for a file of the files that appear on the corresponding one or more information source clients;
characterizing the file as an outlier file that is inappropriate for the one or more information source clients of the group of interest upon determining that the count is below an outlier threshold, wherein the outlier threshold is a predetermined number greater than one;
determining a respective count of outlier files contained within each of the one or more information source clients; and
generating a usage report for each of the one or more information source clients of the group of interest, the usage report including the respective count and identification of outlier files contained within each of the one or more information source clients.
Dependent claims: 2, 3, 4, 5
6. A system, comprising:
at least one processor; and
a memory operatively coupled to the at least one processor, the at least one processor configured to:
identify a group of interest including one or more information source clients within an enterprise environment, wherein the one or more information source clients of the group of interest run the same operating system or share a job function;
generate respective content signature summaries corresponding to the one or more information source clients of the group of interest, wherein the respective content signature summaries include content signatures for files that appear on the corresponding one or more information source clients;
determine a count of the one or more information source clients that contain a content signature for a file of the files that appear on the corresponding one or more information source clients;
characterize the file as an outlier file that is inappropriate for the one or more information source clients of the group of interest upon determining that the count is below an outlier threshold, wherein the outlier threshold is a predetermined number greater than one;
determine a respective count of outlier files contained within each of the one or more information source clients; and
generate a usage report for each of the one or more information source clients of the group of interest, the usage report including the respective count and identification of outlier files contained within each of the information source clients.
Dependent claims: 7, 8, 9, 10
11. A non-transitory computer-readable device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations comprising:
identifying a group of interest comprising one or more information source clients within an enterprise environment, wherein the one or more information source clients of the group of interest run the same operating system or share a job function;
generating respective content signature summaries corresponding to the one or more information source clients of the group of interest, wherein the respective content signature summaries include content signatures for files that appear on the corresponding one or more information source clients;
determining a count of the one or more information source clients that contain a content signature for a file of the files that appear on the corresponding one or more information source clients;
characterizing the file as an outlier file that is inappropriate for the one or more information source clients of the group of interest upon determining that the count is below an outlier threshold, wherein the outlier threshold is a predetermined number greater than one; and
determining a respective count of outlier files contained within each of the one or more information source clients; and
generating a usage report for each of the one or more information source clients of the group of interest, the usage report including the respective count and identification of outlier files contained within each of the one or more information source clients.
Dependent claims: 12, 13, 14
Specification