System and method for relating files in a distributed data storage environment
Abstract
A system and method for relating files in a distributed data storage environment allows for positive identification of membership of a file within a group, even in a loosely coupled environment where files are not available for comparison in real time. In disclosed embodiments, base files of a client are stored on a server and are accompanied by tokens uniquely identifying the base files. The tokens are generated on the client and may be derived from the contents of the base file using a digital signature. Each file transmitted to the server is accompanied by a token. Incremental backups may be used, and may employ file differencing. Accordingly, sub-files related to the base files may be transmitted to the server for backup. The sub-files are related to their respective base files using the tokens and are cross-linked to the base files so that any sub-file can be retrieved together with the base file from which it was derived.
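As an illustration of a token derived from file contents, the following minimal sketch computes a content digest on the client. It is not taken from the specification: the function name `make_token` is hypothetical, and SHA-256 is an assumed stand-in for the "digital signature" the abstract mentions.

```python
import hashlib


def make_token(contents: bytes) -> str:
    """Derive a token from file contents.

    Assumption: a SHA-256 digest stands in for the digital signature
    named in the abstract; the patent does not prescribe an algorithm here.
    """
    return hashlib.sha256(contents).hexdigest()


base = b"quarterly report, version 1"
token = make_token(base)

# Identical contents always yield the same token, so the server can
# recognize a file by its token alone, without reading the file itself.
same = make_token(b"quarterly report, version 1") == token

# Different contents yield a different token.
changed = make_token(b"quarterly report, version 2") == token
```

Because the token depends only on the contents, the client can compute it without contacting the server, which fits the loosely coupled setting the abstract describes.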
23 Claims
1. A method for relating groups of files in a distributed data storage system having a primary storage site and a remote storage site, the method comprising:
assigning a token to a file of the primary storage site;
passing a copy of the file from the primary storage site to the remote storage site;
passing a copy of the token from the primary storage site to the remote storage site;
assigning membership of the file to at least one of a plurality of groups of files residing on the remote storage site by comparing the token with other tokens on the remote storage site without retrieving the group of files corresponding thereto; and
in response to a request from the primary storage site, returning a sub-file and a corresponding base file to the primary storage site from the remote storage site, the relationship of the base file and the sub-file established with use of the token.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method for relating groups of files in a distributed data storage system having a primary storage site and a remote storage site, the method comprising:
assigning a token to a base file of the primary storage site, the token uniquely identifying the base file and comprised of two components, a file identifier comprising attributes of the base file and an identification key derived from the contents of the base file;
passing a copy of the base file from the primary storage site to the remote storage site;
transferring the base file to a storage medium attached to the remote storage site;
passing a copy of the token from the primary storage site to the remote storage site;
storing the token in a token listing of the remote storage site;
deriving a sub-file from the base file, assigning a second token based upon the token of the base file to the sub-file, and passing the sub-file together with the second token to the remote storage site;
determining at the remote storage site the relation of the sub-file to the base file by comparing the second token to the token listing and matching the token of the base file;
creating a cross-linking between the sub-file and the base file; and
in response to a request from the primary storage site, returning the sub-file and the base file substantially together to the primary storage site from the remote storage site.
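The steps of claim 13 can be sketched as follows. This is an illustrative model only, not the patented implementation: the class and field names (`Token`, `SubToken`, `RemoteSite`, `seq`) are hypothetical, the file identifier is reduced to a bare file name, SHA-256 is an assumed choice for the identification key, and the second token is assumed to embed the base file's token together with a sequence number.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    file_id: str  # file identifier; a bare name here (assumption)
    key: str      # identification key derived from the file contents


@dataclass(frozen=True)
class SubToken:
    base: Token   # second token is based on the base file's token
    seq: int      # hypothetical sequence number of the sub-file


def make_base_token(name: str, contents: bytes) -> Token:
    # SHA-256 is an assumed way of deriving the identification key.
    return Token(name, hashlib.sha256(contents).hexdigest())


@dataclass
class RemoteSite:
    token_listing: set = field(default_factory=set)  # tokens only
    storage: dict = field(default_factory=dict)      # token -> stored bytes
    cross_links: dict = field(default_factory=dict)  # base token -> sub-tokens

    def store_base(self, token: Token, contents: bytes) -> None:
        self.storage[token] = contents
        self.token_listing.add(token)
        self.cross_links[token] = []

    def store_sub_file(self, sub_token: SubToken, delta: bytes) -> None:
        # Relate the sub-file to its base by comparing tokens only;
        # the stored base file is never read during this step.
        if sub_token.base not in self.token_listing:
            raise KeyError("no base file matches the sub-file's token")
        self.storage[sub_token] = delta
        self.cross_links[sub_token.base].append(sub_token)

    def restore(self, token: Token):
        # Return the base file together with its cross-linked sub-files.
        subs = [self.storage[t] for t in self.cross_links[token]]
        return self.storage[token], subs


site = RemoteSite()
base = b"version 1"
tok = make_base_token("report.doc", base)
site.store_base(tok, base)
site.store_sub_file(SubToken(tok, 1), b"delta 1->2")
restored_base, restored_subs = site.restore(tok)
```

Keeping the token listing separate from the stored bytes mirrors the claim language: the match is made against the listing, and the files themselves are only touched when the primary site requests a restore.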
14. A system for relating groups of files in a distributed data storage system having a primary storage site and a remote storage site, the system comprising:
a token generation module within the primary storage site, the token generation module configured to generate tokens uniquely identifying files transmitted from the primary storage site to the remote storage site;
a token listing within the remote storage site;
a token comparison module within the remote storage site, the token comparison module configured to receive tokens passed in conjunction with transmission of a file from the primary storage site to the remote storage site and compare the tokens to one or more tokens within the token listing to establish a relationship of the file with other files previously transmitted from the primary storage site to the remote storage site without retrieving the other files; and
more storage devices of the remote storage site, each of the plurality of sub-files cross-linked with a base file resident within the storage devices after grouping the sub-files with the base files with use of the token comparison module.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23
Specification