Method for allocating files in a file system integrated with a RAID disk sub-system
First Claim
1. A method for storing blocks of data in a RAID array, said RAID array comprising a plurality of storage means, comprising the steps of:
- write allocating storage space in said RAID array for blocks of data to be stored in said RAID array by recording said write allocated storage space in a means for recording allocated storage space, said storage space comprising storage blocks in said plurality of storage means, said write allocation of said storage space in said RAID array comprising the steps of;
selecting a block of data that is to be stored in said RAID array for which storage space has not yet been allocated;
checking said block of data to determine if said block of data is in a different file or in a different read-ahead segment from a preceding block of data for which storage space in a first storage means of said plurality of storage means in said RAID array has been write allocated;
selecting storage space in a second storage means of said plurality of storage means in said RAID array when said block of data is in said different file or is in said different read-ahead segment and selecting storage space in said first storage means when said block of data is not in said different file or in said different read-ahead segment, said storage space comprising a write allocated storage block for said block of data;
assigning said write allocated storage block for said block of data to said block of data by associating an indicator of said write allocated storage block with said block of data;
adding said block of data to a list of writable blocks of data for said storage means of said plurality of storage means in said RAID array;
determining if all blocks of data to be stored in said RAID array have been processed, repeating said steps of selecting a block of data for which storage space has not yet been allocated and selecting storage space for said block of data when all of said blocks of data to be stored in said RAID array have not been processed;
writing all unwritten blocks of data in said list of writable blocks of data to storage space allocated for said blocks of data when all said blocks of data to be stored in said RAID array have been processed.
0 Assignments
0 Petitions
Accused Products
Abstract
The present invention is a method for integrating a file system with a RAID array that exports precise information about the arrangement of data blocks in the RAID subsystem. The file system examines this information and uses it to optimize the location of blocks as they are written to the RAID system. Thus, the system uses explicit knowledge of the underlying RAID disk layout to schedule disk allocation. The present invention uses separate current-write location (CWL) pointers for each disk in the disk array where the pointers simply advance through the disks as writes occur. The algorithm used has two primary goals. The first goal is to keep the CWL pointers as close together as possible, thereby improving RAID efficiency by writing to multiple blocks in the stripe simultaneously. The second goal is to allocate adjacent blocks in a file on the same disk, thereby improving read back performance. The present invention satisfies the first goal by always writing on the disk with the lowest CWL pointer. For the second goal, a new disk is chosen only when the algorithm starts allocating space for a new file, or when it has allocated N blocks on the same disk for a single file. A sufficient number of blocks is defined as all the buffers in a chunk of N sequential buffers in a file. The result is that CWL pointers are never more than N blocks apart on different disks, and large files have N consecutive blocks on the same disk.
-
Citations
61 Claims
-
1. A method for storing blocks of data in a RAID array, said RAID array comprising a plurality of storage means, comprising the steps of:
-
write allocating storage space in said RAID array for blocks of data to be stored in said RAID array by recording said write allocated storage space in a means for recording allocated storage space, said storage space comprising storage blocks in said plurality of storage means, said write allocation of said storage space in said RAID array comprising the steps of; selecting a block of data that is to be stored in said RAID array for which storage space has not yet been allocated; checking said block of data to determine if said block of data is in a different file or in a different read-ahead segment from a preceding block of data for which storage space in a first storage means of said plurality of storage means in said RAID array has been write allocated; selecting storage space in a second storage means of said plurality of storage means in said RAID array when said block of data is in said different file or is in said different read-ahead segment and selecting storage space in said first storage means when said block of data is not in said different file or in said different read-ahead segment, said storage space comprising a write allocated storage block for said block of data; assigning said write allocated storage block for said block of data to said block of data by associating an indicator of said write allocated storage block with said block of data; adding said block of data to a list of writable blocks of data for said storage means of said plurality of storage means in said RAID array; determining if all blocks of data to be stored in said RAID array have been processed, repeating said steps of selecting a block of data for which storage space has not yet been allocated and selecting storage space for said block of data when all of said blocks of data to be stored in said RAID array have not been processed; writing all unwritten blocks of data in said list of writable blocks of data to storage space allocated for said blocks of data when all said blocks of data to be stored in said RAID array have been processed. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method for allocating storage space for files of a file system in a RAID array comprising N data storage means and parity storage means, each of said data storage means comprising a current write location (CWL) pointer referencing a next available writable block of said data storage means, comprising the steps of:
-
selecting an in-core inode referencing one or more dirty buffers from a plurality of inodes referencing dirty buffers; write allocating storage space comprising one or more storage blocks of one or more of said data storage means of said RAID array for data in said one or more dirty buffers referenced by said inode by visiting said one or more dirty buffers in sequential order, said step of write allocating said storage space comprising the steps of; checking said one or more dirty buffers to determine if each dirty buffer of said one or more dirty buffers contains data in a different file or in a different read-ahead segment from a preceding dirty buffer for which storage space in a first data storage means of said N data storage means of said RAID array has been write allocated; selecting as a current data storage means a second data storage means of said N data storage means of said RAID array having a lowest CWL pointer when data in said buffer is in said different file or in said different read-ahead segment;
otherwise retaining said first data storage means as a current data storage means;updating a blockmap of said file system to free any block of any data storage means previously assigned to said dirty buffer; assigning a next available writable block referenced by said CWL pointer of said current data storage means to said dirty buffer; adding said dirty buffer to a buffer queue assigned to said current data storage means of N buffer queues assigned to said N data storage means of said RAID array; writing data in buffers in said buffer queues to one or more stripes of said RAID array when a plurality of dirty buffers stored in said N buffer queues reference one or more write allocated blocks of a stripe below said lowest CWL pointer; determining if all inodes in said list of inodes have been processed, repeating said steps of selecting an in-core inode and write allocating storage space for said one or more dirty buffers referenced by said inode when all of said inodes in said list of inodes have not been processed; (D) writing to one or more stripes of said RAID array data in all unwritten buffers for which storage space has been write allocated in said RAID array. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A method for integrating a file system having a plurality of files at least one of which has at least one data block, with a storage system comprising a plurality of storage devices having a plurality of storage blocks, said method comprising the steps of
providing said file system with current configuration information of said storage system, said current configuration information comprising information regarding which ones of said data blocks are allocated to which ones of said storage devices; writing, responsive to said information, a sequence of said data blocks from said files to selected said storage blocks of selected said storage devices. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45)
-
46. A method for integrating a file system having a plurality of files, with a storage system having a plurality of storage devices having a plurality of storage blocks, said method comprising
providing said file system with information regarding which ones of said storage blocks are allocated to which ones of said storage devices; - and
writing, responsive to said information, a sequence of data blocks from said files to selected said storage blocks of selected said storage devices.
- and
-
47. A method for storing a plurality of data blocks on a plurality of storage devices, each said storage device having a plurality of storage blocks, said method comprising
providing, for each said storage device, a pointer to one of said storage blocks on that storage device; - and
writing a data block to a selected said storage block on a selected said storage device and updating said pointer for said storage device, in such manner as to preserve a predetermined relationship between pairs of said pointers for said storage devices; wherein selected ones of said storage blocks for a first said storage device correspond to selected ones of said storage blocks for a second said storage device; pairs of said storage blocks for a first said storage device comprise a distance relationship; and said predetermined relationship comprises maintaining a least possible value for said distance relationship between said pointer for said first storage device and said pointer for said second storage device, while still allowing up to N blocks from a single file to be written to a single storage device, wherein N is a predetermined integer.
- and
-
48. A method for storing a plurality of data blocks on a plurality of storage devices, each said storage device having a plurality of storage blocks, said method comprising
providing, for each said storage device, a pointer to one of said storage blocks on that storage device; - and
writing a data block to a selected said storage block on a selected said storage device and updating said pointer for said storage device, in such manner as to preserve a predetermined relationship between pairs of said pointers for said storage devices; wherein each said pointer comprises an index value to a selected said storage block on a said storage device; and said relationship comprises maintaining a least possible distance between each pair of said index values, while still allowing up to N blocks from a single file to be written to a single storage device, wherein N is a predetermined integer.
- and
-
49. A method for storing a plurality of data blocks on a plurality of storage devices, each said storage device having a plurality of storage blocks, said method comprising
providing, for each said storage device, a pointer to one of said storage blocks on that storage device; - and
writing a data block to a selected said storage block on a selected said storage device and updating said pointer for said storage device, in such manner as to preserve a predetermined relationship between pairs of said pointers for said storage devices; comprising receiving a set of consecutive data blocks from a first said file; and writing said consecutive data blocks to a set of consecutive free storage blocks on a single said storage device.
- and
-
50. A method for storing a plurality of data blocks on a plurality of storage devices, each said storage device having a plurality of storage blocks, said method comprising
providing, for each said storage device, a pointer to one of said storage blocks on that storage device; - and
writing a data block to a selected said storage block on a selected said storage device and updating said pointer for said storage device, in such manner as to preserve a predetermined relationship between pairs of said pointers for said storage devices; comprising receiving a set of consecutive data blocks from a first said file; writing said consecutive data blocks to a set of consecutive free storage blocks on a first said storage device, up to a selected maximum number of storage blocks; and writing subsequent data blocks to a set of storage blocks on a second said storage device.
- and
-
51. A method for storing a plurality of data blocks on a plurality of storage devices, each said storage device having a plurality of storage blocks, said method comprising
providing, for each said storage device, a pointer to one of said storage blocks on that storage device; -
writing a data block to a selected said storage block on a selected said storage device and updating said pointer for said storage device, in such manner as to preserve a predetermined relationship between pairs of said pointers for said storage devices; receiving a set of data blocks from a first and a second said file; writing said data blocks from said first file to a first said storage device; and writing said data blocks from said second file to a second said storage device.
-
-
52. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; and means responsive to said information for writing data from said file blocks to said storage blocks; wherein said storage devices comprise a RAID storage subsystem.
-
-
53. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; and means responsive to said information for writing data from said file blocks to said storage blocks; wherein each said storage block for a first said storage device corresponds to a storage block for a second said storage device, said system comprising for each said storage device, a pointer representing one said storage block for said storage device; and wherein said means for writing data comprises (a) means for selecting a first storage device for writing to, (b) means for writing to the storage block represented by the pointer for said first storage device, and (c) means for updating said pointer for said first storage device, wherein said means for selecting assures that no two said pointers differ by more than a selected amount. - View Dependent Claims (54, 55, 56)
-
-
57. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; and means responsive to said information for writing data from said file blocks to said storage blocks; wherein each of said storage devices comprises a magnetic disk, a magneto-optical disk, or an optical disk.
-
-
58. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; means responsive to said information for writing data from said file blocks to said storage blocks; means for receiving a first plurality of file blocks in a first order, said first plurality of file blocks to be written to said storage devices; memory for storing said first plurality of file blocks; and means for writing said first plurality of file blocks to a first plurality of storage blocks in a second order, wherein said second order differs from said first order.
-
-
59. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; means responsive to said information for writing data from said file blocks to said storage blocks; means for receiving a first plurality of file blocks in a first order, said first plurality of file blocks to be written to said storage devices; memory for storing said first plurality of file blocks; and means for writing said first plurality of file blocks to first plurality of storage blocks in a second order, wherein said second order differs from said first order; wherein said first plurality of storage blocks comprises a stripe.
-
-
60. A system comprising
a file system having a plurality of files, at least some of said files having at least one file block; -
a plurality of storage devices, each having a plurality of storage blocks; memory or storage having information regarding which ones of said file blocks are stored on which ones of said storage devices; means responsive to said information for writing data from said file blocks to said storage blocks; means for receiving a first plurality of file blocks in a first order, said first plurality of file blocks to be written to said stroage devices; memory for storing said first plurality of file blocks; and means for writing said first plurality of file blocks to first plurality of storage blocks in a second order, wherein said second order differs from said first order; wherein said means for writing comprises means for waiting for file blocks sufficient to complete an entire stripe of corresponding storage blocks for a plurality of storage devices.
-
-
61. In a computer system having memory or storage, a data structure in said memory or storage, said data structure comprising
a current write location pointer for each storage device of a plurality of said storage devices; -
each said current write location pointer comprising an index of one storage block of a plurality of said storage blocks for a corresponding said storage device; each pair of said current write location pointers having a defined maximum distance responsive to a preselected distance function.
-
Specification