Parallel file system and method with extensible hashing

US 5,893,086 A
Filed: 07/11/1997
Issued: 04/06/1999
Est. Priority Date: 07/11/1997
Status: Expired due to Term

Abstract

A computer system having a shared parallel disk file system running on a network for multiple computers, each having its own instance of an operating system, with a protocol that makes disks appear to be locally attached to each file system. This parallel file system in a shared disk environment uses a scalable directory service method, improvements to caching, and cache performance developments that balance pools for multiple accesses. A metadata node manages file metadata, and locking techniques reduce the overhead of a token manager, which is also used in file system recovery if a computer participating in the management of shared disks becomes unavailable or fails. Synchronous and asynchronous takeover of a metadata node occurs for correction of metadata which was under modification and for designation of a new computer node to be the metadata node for that file. Locks are not constantly required to allocate new blocks on behalf of a user. Hash buckets are used, and each hash bucket is stored in a sparse file at an offset given as i*s, where i is the hash bucket number and s is the hash bucket size, and where a directory starts out as an empty file whose size increases by inserting records until it needs to be split, and wherein upon a split, an additional bucket is written, increasing the file size from s to 2*s upon the first split. Lookup operations are performed with a step of computing the hash value of the key being looked up, as well as a hash tree depth as log-base-2 of the file size divided by hash bucket size, and these compute steps are also performed for an insert operation.

10 Claims

1. In a system which is used for storing and indexing a large set of data records and supports fast insert, delete and lookup operations and also sequential retrieval of all data records of said set, a method comprising:
- providing a file system for said system which allows storing and retrieving data by specifying a key that identifies a data record,
- providing for a set of data records an index or directory with a single initial hash bucket, and storing in said initial hash by use of a hash function all records of a set of data records to be stored as long as they fit into said initial hash bucket, and when the initial hash bucket is full, splitting by adding a second hash bucket and adding one bit to said hash function used to place records, whereby those records without said one bit are moved into said initial hash bucket while those records with said one bit are moved into said second hash bucket, and wherein new records are added to the initial bucket or said second bucket depending upon the value of the bit for said hash function, and if a hash bucket fills up again, the bucket is split and two bits for the hash function determine where records from the second bucket are to be placed, while the records in the initial bucket are not affected by a new split of said second bucket, but can be split as well, and
- wherein each hash bucket is stored in a sparse file at an offset given as i*s, where i is the hash bucket number and s is the hash bucket size, and where a directory starts out as an empty file, where the file size increases to the size where it needs to be split by inserting records,
- and wherein upon a split, an additional bucket is written increasing the file size from s to 2*s upon the first split.
  - 2. The method for said system of claim 1 wherein after several splits the several hash buckets are treated as a binary hash tree, whereby a record is found by traversing the tree from a root initial bucket node to a leaf hash bucket node using the hash bucket bits as a key to decide which branch to follow at each inner node of said binary tree.
  - 3. The method for said system of claim 1 wherein a sequential directory scan is accomplished by a depth-first tree traversal.
  - 4. The method for said system of claim 1 wherein several buckets are processed as a binary hash tree represented as a sparse file on disk, where records are relocated when a hash bucket is split, and a sequential directory scan traverses said hash tree such that all existing entries are returned exactly once.
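
The splitting and traversal behavior recited in claims 1 to 4 can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the patented implementation: a Python dict stands in for the sparse file (a missing bucket number models a hole), integer keys serve as their own hash values, and the names `HashDirectory`, `BUCKET_SIZE`, `CAPACITY`, `find_bucket`, `split`, and `scan` are hypothetical. It also assumes that a bucket numbered b at level d splits into buckets b and b + 2^d, which is one numbering consistent with the claim language.

```python
BUCKET_SIZE = 4096   # s: bytes per bucket (assumed value)
CAPACITY = 2         # records per bucket; kept tiny so the example splits quickly


class HashDirectory:
    """Toy model of the directory; dicts stand in for the sparse file."""

    def __init__(self):
        self.levels = {0: 0}     # bucket number -> number of hash bits it uses
        self.records = {0: {}}   # bucket number -> {hash value: record}

    def offset(self, bucket_no):
        return bucket_no * BUCKET_SIZE          # bucket i lives at offset i * s

    def file_size(self):
        return (max(self.levels) + 1) * BUCKET_SIZE

    def find_bucket(self, h):
        # Walk from the root bucket to a leaf, consuming one hash bit per level.
        b, depth = 0, 0
        while self.levels[b] > depth:           # node was split at this depth
            b |= h & (1 << depth)               # follow the branch chosen by bit `depth`
            depth += 1
        return b

    def split(self, b):
        level = self.levels[b]
        new_b = b | (1 << level)                # assumed numbering of the new bucket
        self.levels[b] = self.levels[new_b] = level + 1
        old = self.records[b]
        # Records without the new bit stay; records with it move to the new bucket.
        self.records[b] = {k: v for k, v in old.items() if not k & (1 << level)}
        self.records[new_b] = {k: v for k, v in old.items() if k & (1 << level)}

    def insert(self, h, record):
        b = self.find_bucket(h)
        while h not in self.records[b] and len(self.records[b]) >= CAPACITY:
            self.split(b)
            b = self.find_bucket(h)
        self.records[b][h] = record

    def scan(self, b=0, depth=0):
        # Depth-first traversal of the hash tree; each leaf is visited once.
        if self.levels[b] > depth:
            yield from self.scan(b, depth + 1)
            yield from self.scan(b | (1 << depth), depth + 1)
        else:
            yield from self.records[b].items()


directory = HashDirectory()
for h in (0b00, 0b01, 0b10, 0b11, 0b101):
    directory.insert(h, f"record-{h}")
print(sorted(directory.levels))                 # bucket numbers that exist
print(directory.file_size() // BUCKET_SIZE)     # file length in bucket-sized slots
print([h for h, _ in directory.scan()])         # every record, exactly once
```

In this model the first split grows the file from s to 2*s, and a later split of the second bucket leaves the records in the initial bucket untouched. Bucket numbers that were never written remain holes, which is why a sparse file suits the layout.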

5. In a system which is used for storing and indexing a large set of data records and supports fast insert, delete and lookup operations and also sequential retrieval of all data records of said set, a method comprising:
- providing a file system for said system which allows storing and retrieving data by specifying a key that identifies a data record,
- providing for a set of data records an index or directory with a single initial hash bucket, and storing in said initial hash by use of a hash function all records of a set of data records to be stored as long as they fit into said initial hash bucket, and when the initial hash bucket is full, splitting by adding a second hash bucket and adding one bit to said hash function used to place records, whereby those records without said one bit are moved into said initial hash bucket while those records with said one bit are moved into said second hash bucket, and wherein new records are added to the initial bucket or said second bucket depending upon the value of the bit for said hash function, and if a hash bucket fills up again, the bucket is split and two bits for the hash function determine where records from the second bucket are to be placed, while the records in the initial bucket are not affected by a new split of said second bucket, but can be split as well, and
- wherein lookup operations are performed with a step of computing the hash value of the key being looked up, as well as a hash tree depth as log-base-2 of the file size divided by hash bucket size, and said compute steps are also computed for an insert operation.
  - 6. The method for said system of claim 5 wherein for a sparse file a hole is aligned on file system block barriers and file metadata that contains the location of a file's disk blocks is cached, such that the bucket size is the same as the file system block size.
  - 7. The method for said system of claim 5 wherein a scan operation is provided that can be invoked repeatedly to return the contents of a hash tree as a sequential directory scan, wherein each call returns one or more records plus a content information value that is passed to a next scan call in order to retrieve a next set of records.
  - 8. The method for said system of claim 5 wherein hash bucket merges are handled during a sequential scan.
  - 9. The method for said system of claim 5 wherein said system is a parallel file system.
  - 10. The method for said system of claim 5 wherein said system is used for one or more shared disk file systems implemented for multiple computers interconnected over a communication network with a protocol that makes disks appear to be locally attached to each file system.
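
The lookup step of claim 5 computes the tree depth from the file size alone, so no separate directory table has to be read. A minimal sketch follows, with several assumptions that the claims do not state: the logarithm is rounded up to an integer, the bucket number is taken from the low-order bits of the hash value, and a lookup that lands on a hole retries with one bit fewer. The function name `lookup_bucket` and the `bucket_exists` callback are hypothetical.

```python
def lookup_bucket(h, file_size, bucket_size, bucket_exists):
    """Return the number of the bucket that should hold hash value h.

    bucket_exists(i) reports whether bucket i has been written, i.e. whether
    the sparse file holds data rather than a hole at offset i * bucket_size.
    """
    n_slots = file_size // bucket_size
    # ceil(log2(n_slots)) using integer arithmetic; rounding up is an assumption.
    depth = (n_slots - 1).bit_length() if n_slots > 0 else 0
    while True:
        bucket_no = h & ((1 << depth) - 1)      # low `depth` bits of the hash
        if depth == 0 or bucket_exists(bucket_no):
            return bucket_no
        depth -= 1                              # hole: retry with one bit fewer


# Usage: buckets 0, 1 and 3 exist; bucket 2 is a hole in a file of size 4 * s.
s = 4096
existing = {0, 1, 3}
for h in (0b101, 0b10, 0b111):
    b = lookup_bucket(h, 4 * s, s, existing.__contains__)
    print(h, "-> bucket", b, "at byte offset", b * s)
```

Starting from the maximum depth and backing off on holes means a lookup reads at most one bucket more than the depth of the tree, under the assumption that buckets are never removed by merges between the size computation and the read.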

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Engelsiepen, Thomas E.; Schmuck, Frank B.; Wyllie, James Christopher
Primary Examiner(s): Amsbury, Wayne
Assistant Examiner(s): Pardo, Thuy N
Application Number: US08/893,724
Time in Patent Office: 634 Days
Field of Search: 395/680, 380/30, 707/1, 707/2, 707/10, 707/3
US Class Current: 1/1
CPC Class Codes:
G06F 11/1435   using file system or storag...
G06F 16/137   Hash-based content-based in...
G06F 16/1858   Parallel file systems, i.e....
Y10S 707/99931   Database or file accessing
Y10S 707/99932   Access augmentation or opti...
