System and method for the intelligent management of archival data in a computer network
Abstract
A system for intelligently managing data in the archival and restoration processes of computer workstations located in a computer network. The system allows quicker, smaller back-ups of the computer network while facilitating the restoration process. The system identifies redundant data found in a pre-determined number of the computer workstations and copies them to a common storage area located on a high-speed storage medium. Placeholders are then placed into the archive file of any workstation that contained the redundant data. The system also identifies unique data present on the computer workstations and places them in the archive file as is the conventional practice. The system also provides, as part of an Inclusion/Exclusion Engine, intelligent rules for MIS personnel on the server-level to exclude files and directories deemed unique garbage and to include files deemed critical data in the archival process. These same rules can be used on the client-level to include or exclude the files and directories that users select. The intelligent inclusion/exclusion rules utilize an inclusion and an exclusion list, which are consulted during the archival process to further reduce the amount of data archived and reduce network resources associated with the archival process.
51 Claims
-
1. A method for intelligently managing data while creating an archive of computer workstations in a computer network, comprising the steps of:
-
(1) creating a signature of a file;
(2) determining whether said file is a redundant file present in a pre-determined number of computer workstations within the computer network based on said signature; and
(3) if said file is a redundant file, then;
(a) copying said redundant file to a common storage area;
(b) updating a commonality list with said signature of said redundant file;
(c) assigning said redundant file a commonality list identification number; and
(d) placing said commonality list identification number, but not said redundant file, into an archive storage area which is separate and distinct from said common storage area.
-
2. The method of claim 1, further comprising the step of:
(4) determining whether said file is a unique file present on one of the computer workstations within the computer network based on said signature and copying said unique file into said archive storage area.
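The archival steps recited above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `BackupState` and `archive_file`, the in-memory dictionaries standing in for the storage areas, and the use of sequential integers as commonality list identification numbers are all assumptions. It also simplifies by archiving early copies in full until the threshold is reached.

```python
from dataclasses import dataclass, field


@dataclass
class BackupState:
    """Hypothetical in-memory stand-ins for the storage areas in the claims."""
    threshold: int                                    # the "pre-determined number"
    seen_on: dict = field(default_factory=dict)       # signature -> set of workstations
    commonality: dict = field(default_factory=dict)   # signature -> commonality list ID
    common_store: dict = field(default_factory=dict)  # ID -> file contents
    archive: dict = field(default_factory=dict)       # workstation -> archive entries


def archive_file(state, workstation, signature, contents):
    """Archive one file, storing a placeholder ID if the file is redundant."""
    state.seen_on.setdefault(signature, set()).add(workstation)
    entries = state.archive.setdefault(workstation, [])

    if len(state.seen_on[signature]) >= state.threshold:
        # Redundant file: copy it to the common store only once.
        if signature not in state.commonality:
            cid = len(state.commonality) + 1  # assumed ID scheme
            state.commonality[signature] = cid
            state.common_store[cid] = contents
        # Only the ID, not the file, goes into the archive.
        entries.append(("placeholder", state.commonality[signature]))
    else:
        # Unique (so far) file: archive it conventionally.
        entries.append(("file", contents))


state = BackupState(threshold=2)
archive_file(state, "ws1", "sigA", b"data")
archive_file(state, "ws2", "sigA", b"data")
archive_file(state, "ws2", "sigB", b"only here")
```

In this sketch the second workstation's archive holds a placeholder for the shared file and a full copy of the file found only on that workstation.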
-
-
3. The method of claim 1, wherein step (1) comprises the steps of:
-
(a) creating a data structure containing the name, size, creation date, and creation time of said file;
(b) creating a checksum based on the name, size, creation date, and creation time of said file; and
(c) placing said checksum into said data structure.
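A minimal sketch of such a signature follows. The claim does not name a checksum algorithm, so the use of CRC-32 here is an assumption, as are the names `FileSignature` and `make_signature`, the date and time formats, and the `|` separator. For brevity the sketch builds the data structure and stores the checksum in a single step.

```python
import zlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileSignature:
    """Data structure holding the file metadata plus its checksum."""
    name: str
    size: int
    creation_date: str
    creation_time: str
    checksum: int


def make_signature(name: str, size: int, created: datetime) -> FileSignature:
    date_str = created.strftime("%Y-%m-%d")
    time_str = created.strftime("%H:%M:%S")
    # Checksum over name, size, creation date, and creation time only;
    # the file contents are not read in this sketch.
    payload = f"{name}|{size}|{date_str}|{time_str}".encode("utf-8")
    return FileSignature(name, size, date_str, time_str, zlib.crc32(payload))


stamp = datetime(1998, 1, 2, 3, 4, 5)
sig_a = make_signature("setup.exe", 1024, stamp)
sig_b = make_signature("setup.exe", 1024, stamp)
sig_c = make_signature("setup.exe", 2048, stamp)
```

Because the dataclass is frozen, equal metadata yields equal, hashable signatures, so they can serve directly as dictionary keys when counting occurrences across workstations.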
-
-
4. The method of claim 3, further comprising the steps of:
-
(4) determining when said signature of said redundant file has been previously encountered more than said pre-determined number of times; and
(5) determining whether said signature of said redundant file is already on said commonality list to ensure said redundant file is placed on said commonality list, copied to said common storage area, and assigned a commonality list identification number only once.
-
-
5. The method of claim 1, wherein said common storage area comprises a low latency, high-speed storage medium.
-
6. The method of claim 1, wherein said archive storage area comprises a magnetic tape storage medium.
-
7. The method of claim 1, wherein said commonality list is stored on a common storage area in the computer network.
-
8. The method of claim 7, further comprising the step of copying said commonality list onto each of the computer workstations in the computer network.
-
9. The method of claim 2, further comprising the steps of:
-
(5) dividing the plurality of computer workstations into a plurality of back-up sets;
(6) assigning a common storage area, a commonality list and a pre-determined number to each of said plurality of back-up sets.
-
-
10. The method of claim 1, further comprising the steps of:
-
(4) directly placing the signature of a new file on said commonality list, wherein said new file is known to be present in at least said pre-determined number of the computer workstations; and
(5) assigning said new file a commonality list identification number.
-
-
11. The method of claim 1, further comprising the step of:
(4) allowing a first user on one of the computer workstations and a second user on a server within the computer network to include a file set on an inclusion list or an exclusion list based on a set of defined rules.
-
12. The method of claim 11, wherein step (4) comprises storing said inclusion list on said common storage area.
-
13. The method of claim 11, further comprising the steps of:
-
(5) identifying said file set present on one of the computer workstations within the computer network;
(6) determining when said file set is present on said inclusion list, wherein said inclusion list specifies which files to include in said archive storage area;
(7) determining when said file set is present on said exclusion list, wherein said exclusion list specifies which files not to include in said archive storage area; and
(8) determining whether to copy said file set to said archive storage area based on the determinations of steps (6) and (7).
-
-
14. The method of claim 13, wherein step (8) comprises the step of:
determining whether to copy said file set to said archive storage area, based on a priority level assignment of said set of defined rules, when said file set is on both said inclusion list and said exclusion list.
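The inclusion/exclusion decision, including the priority-based resolution when a file set appears on both lists, might look like the sketch below. The claims do not specify how priority levels are represented, so the numeric `priority` field, the tie-break in favor of inclusion, the default of archiving unlisted files, and the names `Rule` and `should_archive` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    path: str
    priority: int  # higher wins; this numeric scale is hypothetical


def should_archive(path, inclusion, exclusion, default=True):
    """Decide whether to copy `path` into the archive storage area."""
    inc = max((r.priority for r in inclusion if r.path == path), default=None)
    exc = max((r.priority for r in exclusion if r.path == path), default=None)

    if inc is None and exc is None:
        return default      # on neither list
    if exc is None:
        return True         # on the inclusion list only
    if inc is None:
        return False        # on the exclusion list only
    return inc >= exc       # on both lists: compare priorities (tie favors inclusion)


inclusion = [Rule("C:/docs/report.doc", 5)]
exclusion = [Rule("C:/docs/report.doc", 3), Rule("C:/temp/x.tmp", 1)]
```

Here `C:/docs/report.doc` is on both lists and is archived because its inclusion rule carries the higher assumed priority, while `C:/temp/x.tmp` is skipped.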
-
15. A method for restoring files to a computer workstation, within a computer network, from an archive storage area, comprising the steps of:
-
(1) detecting a commonality list identification number in the archive storage area;
(2) locating a file in a common storage area, which is separate and distinct from the archive storage area, that corresponds to said commonality list identification number; and
(3) copying said file from said common storage area to the computer workstation.
-
16. The method of claim 15, further comprising the steps of:
(4) detecting a unique file in the archive storage area; and
(5) copying said unique file from the archive storage area to the computer workstation.
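The restoration steps can be sketched as below. The representation is an assumption made for illustration: each archive entry is a `(kind, value)` tuple, where `"placeholder"` entries carry a commonality list identification number and `"file"` entries carry the contents of a unique file, and the common storage area is modeled as a dictionary keyed by that number. The name `restore_workstation` is likewise hypothetical.

```python
def restore_workstation(archive_entries, common_store):
    """Rebuild a workstation's files from its archive entries."""
    restored = []
    for kind, value in archive_entries:
        if kind == "placeholder":
            # Steps (1)-(3): resolve the ID against the common storage area.
            restored.append(common_store[value])
        else:
            # Steps (4)-(5): unique files come straight from the archive.
            restored.append(value)
    return restored


common_store = {1: b"shared OS file"}
entries = [("placeholder", 1), ("file", b"user document")]
files = restore_workstation(entries, common_store)
```

Only the unique file needs to be read from the slower archive medium in this sketch; the shared file comes from the common store.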
-
-
17. The method of claim 15, wherein said common storage area comprises a low latency, high-speed storage medium.
-
18. The method of claim 15, wherein the archive storage area comprises a magnetic tape storage medium.
-
19. A system for intelligently managing data in an archival process and a restoration process of a plurality of computer workstations located in a computer network, comprising:
-
means for creating a signature of a file;
means for determining whether said file is a redundant file present in a pre-determined number of computer workstations within the computer network based on said signature;
means for copying said redundant file to a common storage area;
means for updating a commonality list with said signature of said redundant file;
means for assigning said redundant file a commonality list identification number; and
means for placing said commonality list identification number into an archive storage area.
-
20. The system of claim 19, further comprising:
second means for determining whether said file is a unique file present on one of the computer workstations within the computer network based on said signature; and
second means for copying said unique file into said archive storage area.
-
-
21. The system of claim 19, further comprising:
-
means for reading said commonality list identification number located in said archive storage area;
means for locating said redundant file in said common storage area based on said commonality list identification number; and
second means for copying said redundant file from said common storage area to one of the plurality of computer workstations.
-
-
22. The system of claim 19, further comprising:
-
means for detecting a unique file in said archive storage area; and
second means for copying said unique file from said archive storage area to one of the plurality of computer workstations.
-
-
23. The system of claim 19, wherein said means for creating comprises:
-
means for creating a data structure containing the name, size, creation date, and creation time of said file;
means for creating a checksum based on the name, size, creation date, and creation time of said file; and
means for placing said checksum into said data structure.
-
-
24. The system of claim 23, further comprising:
-
first means for determining when said signature of said redundant file has been previously encountered more than said pre-determined number of times; and
second means for determining whether said signature of said redundant file is already on said commonality list to ensure said means for copying, said means for updating, and said means for assigning act upon said redundant file only once during the archival process.
-
-
25. The system of claim 19, wherein said common storage area comprises a low latency, high-speed storage medium.
-
26. The system of claim 19, wherein said second storage area comprises a magnetic tape storage medium.
-
27. The system of claim 19, wherein said commonality list is stored on a server in the computer network.
-
28. The system of claim 27, further comprising:
means for copying said commonality list to each of the plurality of computer workstations in the computer network.
-
29. The system of claim 20, wherein the plurality of computer workstations is divided into a plurality of back-up sets and each of said plurality of back-up sets is assigned a common storage area, a commonality list and a pre-determined number.
-
30. The system of claim 19, further comprising:
-
means for directly placing the signature of a new file on said commonality list, wherein said new file is known to be present in at least said pre-determined number of the plurality of computer workstations; and
means for assigning said new file a commonality list identification number.
-
-
31. The system of claim 19, further comprising:
means for allowing a first user on one of the computer workstations and a second user on a server within the computer network to include a file set on an inclusion list or an exclusion list based on a set of defined rules.
-
32. The system of claim 31, wherein said inclusion list is stored on said common storage area.
-
33. The system of claim 31, further comprising:
-
means for identifying said file set present on one of the computer workstations within the computer network;
means for determining when said file set is on said inclusion list, wherein said inclusion list specifies which files to include in said archive storage area;
means for determining when said file set is on said exclusion list, wherein said exclusion list specifies which files not to include in said archive storage area; and
means for determining whether to copy said file set into said archive storage area based on said first and second determining means.
-
-
34. The system of claim 33, wherein said means for copying said file set into said archive storage area comprises:
means for determining whether to copy said file set to said archive storage area based on a priority level assignment of said set of defined rules, when said file set is on both said inclusion list and said exclusion list.
-
35. A computer program product comprising a computer usable medium having computer readable program code means embodied in said medium for causing an application program to execute on a computer that intelligently manages data in an archival process and a restoration process of a plurality of computer workstations located in a computer network, said computer readable program code means comprising:
-
a first computer readable program code means for causing the computer to create a signature of a file;
a second computer readable program code means for causing the computer to determine whether said file is a redundant file present in a pre-determined number of computer workstations within the computer network based on said signature;
a third computer readable program code means for causing the computer to copy said redundant file to a common storage area;
a fourth computer readable program code means for causing the computer to update a commonality list with said signature of said redundant file;
a fifth computer readable program code means for causing the computer to assign said redundant file a commonality list identification number; and
a sixth computer readable program code means for causing the computer to place said commonality list identification number, but not said redundant file, into an archive storage area which is separate and distinct from said common storage area.
-
36. The computer program product of claim 35, further comprising:
a seventh computer readable program code means for causing the computer to determine whether said file is a unique file present on one of the computer workstations within the computer network based on said signature; and
an eighth computer readable program code means for causing the computer to copy said unique file into said archive storage area.
-
-
37. The computer program product of claim 35, further comprising:
-
a seventh computer readable program code means for causing the computer to read said commonality list identification number located in said archive storage area;
an eighth computer readable program code means for causing the computer to locate said redundant file in said common storage area based on said commonality list identification number; and
a ninth computer readable program code means for causing the computer to copy said redundant file from said common storage area to one of the plurality of computer workstations.
-
-
38. The computer program product of claim 35, further comprising:
-
a seventh computer readable program code means for causing the computer to detect a unique file in said archive storage area; and
an eighth computer readable program code means for causing the computer to copy said unique file from said archive storage area to one of the plurality of computer workstations.
-
-
39. The computer program product of claim 35, wherein said first computer readable program code means comprises:
-
a seventh computer readable program code means for causing the computer to create a data structure containing the name, size, creation date, and creation time of said file;
an eighth computer readable program code means for causing the computer to create a checksum based on the name, size, creation date, and creation time of said file; and
a ninth computer readable program code means for causing the computer to place said checksum into said data structure.
-
-
40. The computer program product of claim 39, further comprising:
-
a tenth computer readable program code means for causing the computer to determine when said signature of said redundant file has been previously encountered more than said pre-determined number of times; and
an eleventh computer readable program code means for causing the computer to determine whether said signature of said redundant file is already on said commonality list to ensure said third, fourth and fifth computer readable program code means act upon said redundant file only once during the archival process.
-
-
41. The computer program product of claim 35, further comprising:
-
a seventh computer readable program code means for causing the computer to directly place the signature of a new file on said commonality list, wherein said new file is known to be present in at least said pre-determined number of the plurality of computer workstations; and
an eighth computer readable program code means for causing the computer to assign said new file a commonality list identification number.
-
-
42. The computer program product of claim 35, further comprising:
a seventh computer readable program code means for causing the computer to allow a first user on one of the computer workstations and a second user on a server within the computer network to include a file set on an inclusion list or an exclusion list based on a set of defined rules.
-
43. The computer program product of claim 42, further comprising:
-
an eighth computer readable program code means for causing the computer to identify a file set present on one of the computer workstations within the computer network;
a ninth computer readable program code means for causing the computer to determine when said file set is on an inclusion list, wherein said inclusion list specifies which files to include in said archive storage area;
a tenth computer readable program code means for causing the computer to determine when said file set is on an exclusion list, wherein said exclusion list specifies which files not to include in said archive storage area; and
an eleventh computer readable program code means for causing the computer to determine whether to copy said file set into said archive storage area based on said ninth and tenth computer readable program code means.
-
-
44. The computer program product of claim 43, wherein said eleventh computer readable program code means comprises:
a twelfth computer readable program code means for causing the computer to determine whether to copy said file set to said archive storage area based on a priority level assignment of said set of defined rules, when said file set is on both said inclusion list and said exclusion list.
-
45. The computer program product of claim 42, further comprising:
-
an eighth computer readable program code means for causing the computer to use at least one of the following of said set of defined rules;
an “Always Include” rule;
a Server Explicit File inclusion and exclusion rule;
a Server Wildcard File inclusion and exclusion rule;
a Client Explicit File inclusion and exclusion rule;
a Client Wildcard File inclusion and exclusion rule;
a Server Explicit Directory inclusion and exclusion rule;
a Server Wildcard Directory inclusion and exclusion rule;
a Client Explicit Directory inclusion and exclusion rule;
a Client Wildcard Directory inclusion and exclusion rule;
a Server Explicit Global inclusion and exclusion rule;
a Server Wildcard Global inclusion and exclusion rule;
a Client Explicit Global inclusion and exclusion rule; and
a Client Wildcard Global inclusion and exclusion rule.
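The rule names listed above combine an origin (Server or Client), a matching style (Explicit or Wildcard), and a scope (File, Directory, or Global), plus the standalone “Always Include” rule. The sketch below enumerates them and shows one way explicit and wildcard matching could differ. The claims do not define wildcard syntax, so shell-style globbing via `fnmatchcase` is an assumption, as are the names `RULE_KINDS` and `rule_matches`.

```python
from fnmatch import fnmatchcase
from itertools import product

ORIGINS = ("Server", "Client")
MATCHING = ("Explicit", "Wildcard")
SCOPES = ("File", "Directory", "Global")

# "Always Include" plus every origin/matching/scope combination,
# in the same order as the list above.
RULE_KINDS = ["Always Include"] + [
    f"{origin} {matching} {scope}"
    for scope, origin, matching in product(SCOPES, ORIGINS, MATCHING)
]


def rule_matches(pattern: str, path: str, wildcard: bool) -> bool:
    """Explicit rules require an exact path; wildcard rules use glob patterns."""
    if wildcard:
        return fnmatchcase(path, pattern)  # assumed wildcard semantics
    return path == pattern
```

For example, `rule_matches("*.tmp", "C:/TEMP/cache.tmp", wildcard=True)` matches, whereas the same pattern treated as an explicit rule does not.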
-
-
46. An inclusion-exclusion engine for intelligently managing files and directories in an archival process of a plurality of computer workstations located in a client-server computer network, comprising:
-
means for excluding from the archival process files that are determined to be redundant data;
means for excluding from the archival process the files and directories that a first user on the server defines as unique garbage;
means for including in the archival process files that said first user defines as critical data;
means for excluding from the archival process files and directories selected by a second user on one of the plurality of computer workstations; and
means for including in the archival process files and directories selected by said second user.
-
47. The inclusion-exclusion engine of claim 46, further comprising:
a set of rules which allow said first user and said second user to select the files and directories to include in or exclude from the archival process.
-
-
48. The inclusion-exclusion engine of claim 47, wherein said set of rules comprises:
-
an “Always Include” rule;
a Server Explicit File inclusion and exclusion rule;
a Server Wildcard File inclusion and exclusion rule;
a Client Explicit File inclusion and exclusion rule;
a Client Wildcard File inclusion and exclusion rule;
a Server Explicit Directory inclusion and exclusion rule;
a Server Wildcard Directory inclusion and exclusion rule;
a Client Explicit Directory inclusion and exclusion rule;
a Client Wildcard Directory inclusion and exclusion rule;
a Server Explicit Global inclusion and exclusion rule;
a Server Wildcard Global inclusion and exclusion rule;
a Client Explicit Global inclusion and exclusion rule; and
a Client Wildcard Global inclusion and exclusion rule.
-
-
49. The inclusion-exclusion engine of claim 47, further comprising:
a graphical user interface that allows said first user and said second user to utilize said set of defined rules.
-
50. A system for intelligently managing files and directories in an archival process of a computer network, comprising:
-
a plurality of computer workstations located in the computer network;
a server for initiating and controlling the archival process of said plurality of computer workstations;
a set of rules which allow a first user on one of the plurality of computer workstations and a second user on said server to select the files and directories to include in or exclude from the archival process;
an inclusion list that contains the signature of the files and directories included in the archival process based on a first subset of said set of rules selected by said first and said second user; and
an exclusion list that contains the signature of the files and directories excluded from the archival process based on either a second subset of said set of rules selected by said first and said second user or files determined by said server to be redundant data.
-
51. The system of claim 50, further comprising:
a graphical user interface (GUI) that runs on said server and on said plurality of computer workstations, wherein said GUI allows said first user and said second user to utilize said set of rules to select the files and directories to place on said inclusion list or on said exclusion list.
-
Specification