Single instance store for file systems
Abstract
A method and system for storing the data of files having duplicate content, by maintaining a single instance of the data, and providing logically separate links to the single instance. Files of duplicate content have their data stored in a common store file by a single instance store (SIS) facility, which also converts the original file or files to links to that common store file and creates additional links thereto as needed. The SIS facility may reside above a file system as a filter driver. File system requests directed to the link file (e.g., open, write, read, close and delete) reach the SIS filter, which then transparently handles each request as if the link file was a normal file. To preserve logical separation, writes to a SIS link file are to the link file, and the written portion recorded as dirty. The SIS filter intercepts SIS read requests, and reads clean portions from the common store file and any dirty portions from the link file. When the link file is closed, the common store file also may be closed, and, if the link file has been written, the non-dirtied portions of the link file are filled in with clean data from the common store file, and the link file reconverted to a normal file. Security is provided to prevent unauthorized access to the common store files, as is a volume check facility that repairs any inconsistencies in SIS metadata.
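The read, write and close behavior described in the abstract can be illustrated with a small in-memory sketch. This is not the patented implementation: the names `CommonStore` and `LinkFile` are hypothetical, dirty data is tracked per byte for simplicity (a real filter driver would more plausibly track ranges), and no actual file system or filter driver is involved.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CommonStore:
    """Hypothetical stand-in for a common store file (the single instance)."""
    data: bytes


@dataclass
class LinkFile:
    """Hypothetical stand-in for a SIS link file backed by a common store file."""
    store: CommonStore
    size: int
    local: bytearray = field(default_factory=bytearray)  # bytes written via this link
    dirty: set = field(default_factory=set)              # offsets recorded as dirty

    def write(self, offset: int, payload: bytes) -> None:
        # Writes go to the link file only; the common store is never modified.
        end = offset + len(payload)
        if end > len(self.local):
            self.local.extend(b"\0" * (end - len(self.local)))
        self.local[offset:end] = payload
        self.dirty.update(range(offset, end))
        self.size = max(self.size, end)

    def read(self, offset: int, length: int) -> bytes:
        # Dirty bytes come from the link file, clean bytes from the common store.
        end = min(offset + length, self.size)
        out = bytearray()
        for i in range(offset, end):
            if i in self.dirty:
                out.append(self.local[i])
            elif i < len(self.store.data):
                out.append(self.store.data[i])
            else:
                out.append(0)  # unwritten gap past the end of the store data
        return bytes(out)

    def close(self) -> Optional[bytes]:
        # A link that was written is filled in with clean data and handed back
        # as ordinary file content; an unwritten link stays a link (None).
        if not self.dirty:
            return None
        return self.read(0, self.size)


store = CommonStore(b"hello world")
first = LinkFile(store, len(store.data))
second = LinkFile(store, len(store.data))
first.write(0, b"J")
print(first.read(0, 11), second.read(0, 11))
```

In this sketch a write through `first` leaves both `second` and the common store unchanged, which is the logical separation the abstract refers to.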
425 Citations
75 Claims
- 1. In a computer system having a file system of files, a method of storing data of first and second files having duplicate data, comprising the steps of: maintaining a single instance of the data; for the first file, providing a link file to the single instance of the data, the link file representing the first file to provide logically separate file system access to the single instance representation of the file data, the link file logically separate from the second file such that file system actions via the link file do not affect the data of the second file; and reclaiming storage space that was occupied by the duplicate data of the first file. (Dependent claims: 2-31.)
- 32. In a computer system, a system for storing the data of files having at least some duplicate data, comprising, a non-volatile storage for storing data including files of a file system, and a facility for maintaining a single instance of the duplicate data as a common store file of the file system, the facility providing a logically separate link file to the common store file for each file having duplicate data and deallocating storage space that stores the duplicate data, each link file logically separate from each other link file such that file system actions associated with one link file do not affect the data accessible via another link file, and the facility handling input and output requests to each link file to manage the linking of the link file to the common store file.
- 50. In a computer system having a file system, a method of storing the data of a selected plurality of files of the file system, wherein each of the selected plurality of files have at least partially identical contents with one another, comprising the steps of, maintaining a single instance file representing at least part of the file contents that are partially identical in each of the plurality of files, reclaiming at least some of the storage space that was identical in the selected plurality of files and providing a link file to the single instance file for each file having contents represented thereby, each link file logically separate from one another.
- 61. In a computer system having a file system of files, a method of storing the data of files having at least some duplicated data, comprising, maintaining a single instance of the data, for each file having duplicated data, providing a link file to the single instance of the data representing each file, each link file logically separate from each other link file, opening the link file, and associating a context with the link file.
- 68. In a computer system having a file system of files, a method of storing the data of files having at least some duplicated data, comprising, maintaining a single instance of the data, for each file having duplicated data, providing a link to the single instance of the data representing each file, each link logically separate from each other link, and associating a reparse point with the file.
- 71. In a computer system, a system for storing the data of files having at least some duplicate data, comprising, a non-volatile storage for storing data including files of a file system, and a facility for maintaining a single instance of the duplicate data as a common store file of the file system, the facility providing a logically separate link file to the common store file for each file having duplicate data, each link file having a reparse point associated therewith, the reparse point including information identifying the link file as associated with the facility and identifying the common store file pointed to by the link file, and the facility handling input and output requests to each link file to manage the linking of the link file to the common store file.
- 72. In a computer system, a system for storing the data of files having at least some duplicate data, comprising, a non-volatile storage for storing data including files of a file system, a facility for maintaining a single instance of the duplicate data as a common store file of the file system, the facility providing a logically separate link file to the common store file for each file having duplicate data, the facility handling input and output requests to each link file to manage the linking of the link file to the common store file, and a context associated with the link file.
- 74. In a computer system having a file system, a method of storing the data of a selected plurality of files of the file system, wherein each of the selected plurality of files have at least partially identical contents with one another, comprising the steps of, maintaining a single instance file representing at least part of the file contents that are partially identical in each of the plurality of files, providing a link to the single instance file for each file having contents represented thereby, each link logically separate from one another, and associating a context with at least one link.
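The steps common to the independent claims above (maintain a single instance, provide a logically separate link per file, reclaim the duplicate storage) can be sketched over an in-memory mapping of file names to contents. Everything named here is an assumption for illustration: the claims do not say how duplicates are detected, so SHA-256 grouping is a swapped-in technique, and `Link`, `SIS_TAG` and `single_instance` are hypothetical. The `tag` and `store_id` fields loosely mirror the reparse point of claim 71, which identifies the owning facility and the common store file.

```python
import hashlib
from dataclasses import dataclass

SIS_TAG = "SIS"  # hypothetical tag marking a link as owned by the facility


@dataclass(frozen=True)
class Link:
    tag: str       # identifies the facility that owns this link
    store_id: str  # identifies the common store file the link points to
    size: int      # logical size of the file the link represents


def single_instance(files):
    """Return (namespace, common_store, reclaimed_bytes) for a name -> bytes mapping."""
    # Group file names by content digest (assumed duplicate-detection method).
    by_digest = {}
    for name, data in files.items():
        by_digest.setdefault(hashlib.sha256(data).hexdigest(), []).append(name)

    namespace, common_store, reclaimed = {}, {}, 0
    for digest, names in by_digest.items():
        if len(names) < 2:
            # Unique content stays a normal file.
            namespace[names[0]] = files[names[0]]
            continue
        data = files[names[0]]
        common_store[digest] = data  # the single instance of the data
        for name in names:
            namespace[name] = Link(SIS_TAG, digest, len(data))
        # Every copy beyond the single instance no longer needs storage.
        reclaimed += len(data) * (len(names) - 1)
    return namespace, common_store, reclaimed


files = {"a.txt": b"same", "b.txt": b"same", "c.txt": b"other"}
namespace, common_store, reclaimed = single_instance(files)
print(reclaimed, namespace["a.txt"])
```

Each duplicate file keeps its own `Link` entry in the namespace, so deleting or rewriting one name does not require touching the others.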
Specification