Large scale file storage in cloud computing
First Claim
1. In a computing environment, a method of storing files, the method comprising:
identifying a file;
determining, based on a header of the file, that the file is a compressed container file that includes a plurality of compressed files;
identifying a hash calculated based on content of the compressed container file;
using a first portion of the hash, identifying a particular file storage account from among a plurality of file storage accounts under which the compressed container file will be stored;
using a second portion of the hash, identifying a particular blob container from among a plurality of blob containers within the particular file storage account into which the compressed container file will be stored;
renaming the compressed container file based on the hash, the compressed container file being renamed a file name that includes one or more portions of the hash;
storing the compressed container file in the particular blob container within the particular file storage account;
decompressing each of the plurality of compressed files from the compressed container file, to obtain a plurality of decompressed files;
for each of the plurality of decompressed files:
identifying a hash corresponding to said decompressed file, the hash calculated based on content of said decompressed file;
using the corresponding hash, identifying an appropriate blob container from among the plurality of blob containers within an appropriate file storage account from among the plurality of file storage accounts into which said decompressed file will be stored;
renaming said decompressed file based on the corresponding hash, said decompressed file being renamed a file name that includes one or more portions of the corresponding hash; and
storing said decompressed file in the appropriate blob container within the appropriate file storage account; and
storing metadata linking each of the plurality of decompressed files to the compressed container file.
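The placement steps of claim 1 can be illustrated with a minimal Python sketch. It rests on assumptions the claim does not state: SHA-256 as the hash, the first two hex characters as the "first portion", the next two as the "second portion", and a hypothetical layout of 4 storage accounts with 16 blob containers each. The names `ACCOUNTS`, `CONTAINERS_PER_ACCOUNT`, and `place` are illustrative only.

```python
import hashlib
from typing import Tuple

# Hypothetical layout: 4 storage accounts, 16 blob containers per account.
ACCOUNTS = [f"account{i}" for i in range(4)]
CONTAINERS_PER_ACCOUNT = 16


def place(content: bytes) -> Tuple[str, str, str]:
    """Return (account, container, blob_name) derived from the content hash."""
    # Assumption: SHA-256; the claim only requires a hash of the content.
    digest = hashlib.sha256(content).hexdigest()
    # First portion of the hash (first two hex characters) selects the account.
    account = ACCOUNTS[int(digest[:2], 16) % len(ACCOUNTS)]
    # Second portion (next two hex characters) selects the blob container.
    container_index = int(digest[2:4], 16) % CONTAINERS_PER_ACCOUNT
    container = f"container{container_index:02d}"
    # The file is renamed so that its new name includes the hash.
    return account, container, digest


account, container, blob_name = place(b"example file content")
```

Because the location is a pure function of the content, identical content always resolves to the same account, container, and name, so a later lookup needs only the hash.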
2 Assignments
0 Petitions
Abstract
Storing and retrieving files based on hashes for the files. One method for storing files includes: identifying a file; identifying a hash calculated based on the file; renaming the file based on the hash; and storing the file in a particular location based on the hash. Another method for retrieving files includes: identifying a hash for a given file; using the hash, traversing a hierarchical file structure to find a location where the given file should be stored; determining that the file is at the location; and as a result, retrieving the file.
19 Citations
20 Claims
1. In a computing environment, a method of storing files, the method comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A computing system for retrieving files, the computing system comprising:
one or more processors; and
one or more computer readable media, wherein the one or more computer readable media comprise computer executable instructions that are executable by at least one of the one or more processors and that configure at least one of the one or more processors to perform the following:
identify a hash for a given file;
access metadata linking the hash to a compressed container file that is stored in a storage system and that includes a plurality of compressed files, and identify a plurality of additional hashes that each correspond to a different one of a plurality of decompressed files that are also stored in the storage system, each of the plurality of decompressed files corresponding to a different one of the plurality of compressed files; and
for each of the plurality of additional hashes:
using said additional hash, traverse a hierarchical file structure of the storage system to find a location where the corresponding decompressed file should be stored, by at least using a first portion of said additional hash to identify a file storage account under which the corresponding decompressed file is stored, and using a second portion of said additional hash to identify a blob container within the file storage account into which the corresponding decompressed file is stored;
determine that the corresponding decompressed file is at the location defined by the blob container within the file storage account; and
as a result, retrieve the corresponding decompressed file from the storage system.
Dependent claims: 10, 11, 12, 13.
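The retrieval flow of claim 9 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: nested dictionaries stand in for storage accounts and blob containers, SHA-256 is assumed as the hash, and the split into two hex characters per portion is arbitrary. The helpers `locate`, `put`, and `retrieve_members` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple

# In-memory stand-in (hypothetical): storage[account][container][blob_name] = bytes
Storage = Dict[str, Dict[str, Dict[str, bytes]]]


def locate(digest: str) -> Tuple[str, str]:
    """Map hash portions to (account, container); the 2+2 hex split is assumed."""
    return f"account-{digest[:2]}", f"container-{digest[2:4]}"


def put(storage: Storage, content: bytes) -> str:
    """Store content under a hash-derived location and name; return the hash."""
    digest = hashlib.sha256(content).hexdigest()
    account, container = locate(digest)
    storage.setdefault(account, {}).setdefault(container, {})[digest] = content
    return digest


def retrieve_members(storage: Storage,
                     metadata: Dict[str, List[str]],
                     container_hash: str) -> List[bytes]:
    """Follow metadata from a container file's hash to its decompressed files."""
    results = []
    for member_hash in metadata[container_hash]:
        # Traverse the hierarchy: account first, then blob container.
        account, container = locate(member_hash)
        blobs = storage.get(account, {}).get(container, {})
        # Determine that the file is at the expected location before retrieving.
        if member_hash in blobs:
            results.append(blobs[member_hash])
    return results


storage: Storage = {}
hash_a = put(storage, b"alpha")
hash_b = put(storage, b"beta")
archive_hash = put(storage, b"stand-in bytes for a compressed container file")
metadata = {archive_hash: [hash_a, hash_b]}
members = retrieve_members(storage, metadata, archive_hash)
```

No search or index scan is needed here: each lookup recomputes the location from the hash itself, which is the property the claim relies on.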
14. A computing system for storing files, the computing system comprising:
one or more processors; and
one or more computer readable media, wherein the one or more computer readable media comprise computer executable instructions that are executable by at least one of the one or more processors and that configure at least one of the one or more processors to perform the following:
identify a file;
determine, based on a header of the file, that the file is a compressed container file that includes a plurality of compressed files;
identify a hash calculated based on content of the compressed container file;
using a first portion of the hash, identify a particular file storage account in a distributed computing environment in which the file will be stored;
using a second portion of the hash, identify a particular blob container hierarchically below the particular file storage account;
rename the compressed container file based on the hash, the compressed container file being renamed a file name that includes one or more portions of the hash;
store the compressed container file in the blob container;
decompress each of the plurality of compressed files from the compressed container file, to obtain a plurality of decompressed files;
for each of the plurality of decompressed files:
identify a hash corresponding to said decompressed file, the hash calculated based on content of said decompressed file;
using the corresponding hash, identify an appropriate blob container within an appropriate file storage account into which said decompressed file will be stored;
rename said decompressed file based on the corresponding hash, said decompressed file being renamed a file name that includes one or more portions of the corresponding hash; and
store said decompressed file in the appropriate blob container within the appropriate file storage account; and
store metadata linking each of the plurality of decompressed files to the compressed container file.
Dependent claims: 15, 16, 17, 18, 19, 20.
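The header check, decompression, and metadata-linking steps of claim 14 might look like the sketch below. It assumes the container is a ZIP archive recognized by its `PK\x03\x04` signature (the claim does not name a format), uses SHA-256 as the hash, and keeps every blob in one flat dictionary so that account and container selection is left out. The function names `is_zip_container` and `ingest` are hypothetical.

```python
import hashlib
import io
import zipfile
from typing import Dict, List

# Local file header signature that begins a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def is_zip_container(data: bytes) -> bool:
    """Decide from the file header whether the data is a ZIP container."""
    return data[:4] == ZIP_MAGIC


def ingest(data: bytes,
           blobs: Dict[str, bytes],
           metadata: Dict[str, List[str]]) -> str:
    """Store a file by hash; if it is a container, also store each member."""
    file_hash = hashlib.sha256(data).hexdigest()
    blobs[file_hash] = data  # the blob name is the hash (the "rename" step)
    if is_zip_container(data):
        member_hashes = []
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                member = archive.read(info)  # decompressed member content
                member_hash = hashlib.sha256(member).hexdigest()
                blobs[member_hash] = member
                member_hashes.append(member_hash)
        # Metadata links each decompressed file back to its container file.
        metadata[file_hash] = member_hashes
    return file_hash


# Build a small archive in memory and ingest it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")

blobs: Dict[str, bytes] = {}
metadata: Dict[str, List[str]] = {}
container_hash = ingest(buffer.getvalue(), blobs, metadata)
```

Keying the metadata by the container's hash means a later request for the container can enumerate its decompressed members without reopening the archive.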
Specification