File deduplication in a file system
First Claim
1. A method for file deduplication in a file system by a processor, comprising:
receiving one of a new file creation instruction, a file copy instruction, and a file update instruction specifying at least a file directory and a file name;
storing or updating inode information for a file upon creation, copying, or update of the file;
acquiring identification information which is newly assigned to the file upon creation, copying, or update of the file and is inherited by the file from a different file if the file is a copy of the different file, to thereby make a content of the file identifiable, wherein the identification information includes world wide unique identification (WWUID);
storing the identification information and an inode information number in the file directory;
storing the file name together with the identification information in an extended directory;
determining whether or not first identification information and second identification information match each other, the first identification information being the identification information acquired by the acquisition unit and assigned to a first file, the second identification information being the identification information acquired by the acquisition unit and assigned to a second file;
if the first identification information is determined to match the second identification information, preventing the first file and the second file from being stored as duplicate files in the file system;
registering, in count information, an increase in the number of pieces of identification information associated with the first file, when the second identification information becomes associated with the first file, the count information indicating the number of pieces of identification information associated with the first file, wherein the count information is a reference count number of the WWUID;
registering, in the count information, a decrease in the number of pieces of identification information associated with the first file, in response to an instruction to delete the first management information; and
deleting the first management information in response to the instruction to delete the first management information, and also deleting the first file if the count information after the registration by the second registration unit indicates that no identification information is associated with the first file.
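The reference-counting behaviour recited in the claim can be illustrated with a minimal in-memory sketch. It is not the patented implementation. The class name DedupStore, its method names, and the use of uuid.uuid4() as a stand-in for a WWUID are all assumptions made for illustration; the claim does not say how a WWUID is generated. A plain dictionary stands in for the file directory, and inode information and the extended directory are not modelled.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DedupStore:
    """Toy model: content is stored once per WWUID (hypothetical names)."""
    directory: dict = field(default_factory=dict)  # file name -> WWUID
    contents: dict = field(default_factory=dict)   # WWUID -> content, stored once
    refcount: dict = field(default_factory=dict)   # WWUID -> number of names using it

    def _attach(self, name, wwuid, data=None):
        old = self.directory.get(name)
        if wwuid not in self.contents:
            # First time this WWUID is seen: store the content.
            self.contents[wwuid] = data
            self.refcount[wwuid] = 0
        # Matching WWUID: only the reference count grows, no second copy.
        self.refcount[wwuid] += 1
        self.directory[name] = wwuid
        if old is not None:
            self._release(old)

    def _release(self, wwuid):
        self.refcount[wwuid] -= 1
        if self.refcount[wwuid] == 0:
            # No name refers to this WWUID any more: delete the content too.
            del self.refcount[wwuid]
            del self.contents[wwuid]

    def create(self, name, data):
        # Creation assigns a fresh identifier (uuid4 is an assumption).
        wwuid = str(uuid.uuid4())
        self._attach(name, wwuid, data)
        return wwuid

    def update(self, name, data):
        # An update also assigns a fresh identifier and drops the old reference.
        return self.create(name, data)

    def copy(self, src, dst):
        # A copy inherits the source's identifier instead of duplicating content.
        wwuid = self.directory[src]
        self._attach(dst, wwuid)
        return wwuid

    def delete(self, name):
        # Remove the directory entry, then decrement the reference count.
        self._release(self.directory.pop(name))
```

In this sketch, copying a.txt to b.txt leaves one stored content with a reference count of 2; deleting a.txt lowers the count to 1, and the content is removed only when the last name referring to that identifier is deleted or updated.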
Abstract
Each file is assigned a WWUID in advance: a new WWUID is assigned to a file when the file is created or updated, and a copied file inherits the WWUID of the file it was copied from. In a backup apparatus, a file name reception unit receives the file name of a backup target file. A WWUID reception unit receives the WWUID corresponding to the file name. A WWUID search unit searches for the same WWUID in the previous day's backup management information stored in a backup destination. Only if the search fails does a file operation instruction unit instruct that the backup target file be stored in the backup destination. Then, an Rcnt update instruction unit instructs an update of the number of references made to the WWUID within the backup destination. A second management information update instruction unit then instructs an update of the current day's backup management information.
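The backup flow in the abstract can be sketched as a single function. This is an illustrative reading of the abstract only. The function name backup_file, its parameters, and the dictionary-based management information are assumptions, not structures defined in the source.

```python
def backup_file(name, wwuid, read_content, prev_day, today, stored, rcnt):
    """Back up one file, copying its content only if its WWUID is new.

    All arguments are hypothetical stand-ins:
      prev_day, today: dict mapping file name -> WWUID (management information)
      stored:          dict mapping WWUID -> content in the backup destination
      rcnt:            dict mapping WWUID -> reference count in the destination
      read_content:    callable returning the content of the named file
    Returns True if the content was copied, False if it was deduplicated.
    """
    # Search the previous day's management information for the same WWUID.
    found = wwuid in set(prev_day.values())

    # Only if the search fails is the file stored in the backup destination.
    if not found:
        stored[wwuid] = read_content(name)

    # Update the number of references made to the WWUID.
    rcnt[wwuid] = rcnt.get(wwuid, 0) + 1

    # Record the file in the current day's management information.
    today[name] = wwuid
    return not found
```

Because an unchanged file keeps its WWUID from one day to the next, the sketch never reads its content again; only the reference count and the current day's management information change.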
16 Citations
4 Claims
Specification