Scalable indexing
First Claim
1. A method of adapting a uniform access indexing process with a non-uniform access memory, the method comprising:
- storing a dictionary of index records in the non-uniform access memory, each index record comprising fields for an index key, a reference count and a physical block address, the index keys being uniformly distributed and unique;
maintaining a bucket translation table for mapping logical bucket identifiers to physical bucket locations of the memory including generating a logical bucket identifier by displacement hashing an index key, and the table comprising a mapping of the logical bucket identifier to a physical bucket location of the memory where the associated index record is stored;
collecting in cache a plurality of bucket entries, wherein each bucket entry comprises a set of index records having the same logical bucket identifier;
writing the collection of entries from the cache to contiguous physical bucket locations of the memory as a sequential write; and
updating the bucket translation table with the physical bucket locations for the bucket entries of the collection written from the cache to the memory.
1 Assignment
0 Petitions
Accused Products
Abstract
Method and apparatus for constructing an index that scales to a large number of records and provides a high transaction rate. New data structures and methods are provided to ensure that an indexing algorithm performs in a way that is natural (efficient) to the algorithm, while a non-uniform access memory device sees IO (input/output) traffic that is efficient for the memory device. One data structure, a translation table, is created that maps logical buckets as viewed by the indexing algorithm to physical buckets on the memory device. This mapping is such that write performance to non-uniform access SSD and flash devices is enhanced. Another data structure, an associative cache is used to collect buckets and write them out sequentially to the memory device as large sequential writes. Methods are used to populate the cache with buckets (of records) that are required by the indexing algorithm. Additional buckets may be read from the memory device to cache during a demand read, or by a scavenging process, to facilitate the generation of free erase blocks.
-
Citations
30 Claims
-
1. A method of adapting a uniform access indexing process with a non-uniform access memory, the method comprising:
-
storing a dictionary of index records in the non-uniform access memory, each index record comprising fields for an index key, a reference count and a physical block address, the index keys being uniformly distributed and unique; maintaining a bucket translation table for mapping logical bucket identifiers to physical bucket locations of the memory including generating a logical bucket identifier by displacement hashing an index key, and the table comprising a mapping of the logical bucket identifier to a physical bucket location of the memory where the associated index record is stored; collecting in cache a plurality of bucket entries, wherein each bucket entry comprises a set of index records having the same logical bucket identifier; writing the collection of entries from the cache to contiguous physical bucket locations of the memory as a sequential write; and updating the bucket translation table with the physical bucket locations for the bucket entries of the collection written from the cache to the memory. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
-
-
22. A non-transitory computer readable medium storing instructions executable by a processor, the non-transitory machine readable medium comprising instructions to:
-
store a dictionary of index records in the non-uniform access memory, each index record comprising fields for an index key, a reference count and a physical block address, the index keys being uniformly distributed and unique; maintain a bucket translation table for mapping logical bucket identifiers to physical bucket locations of the memory, wherein a logical bucket identifier being generated by displacement hashing an index key, and the table comprising a mapping of the logical bucket identifier to a physical bucket location of the memory where the associated index record is stored; collect in cache a plurality of bucket entries, wherein each bucket entry comprises a set of index records having the same logical bucket identifier; write the collection of entries from the cache to contiguous physical bucket locations of the memory as a sequential write; and update the bucket translation table with the physical bucket locations for the bucket entries of the collection written from the cache to the memory. - View Dependent Claims (23, 24, 25, 26)
-
-
27. A computer system comprising:
-
a non-uniform access memory containing a dictionary of index records stored in physical bucket locations of the memory, each index record comprising fields for an index key, a reference count and a physical block address, the index keys being uniformly distributed and unique; a processor; and non-transitory machine readable medium storing instructions that, when executed, cause the processor to; maintain a bucket translation table to map a logical bucket identifier, generated by displacement hashing an index key of the dictionary, to a physical bucket location of the memory where an index record associated with the index key is stored; collect bucket entries in a cache, each bucket entry comprising a set of index records having the same logical bucket identifier to be written to the memory; write sequentially a collection of the bucket entries from the cache to contiguous physical bucket locations of the memory; and update the bucket translation table with the physical bucket locations for the bucket entries of the collection. - View Dependent Claims (28, 29, 30)
-
Specification