Storage format for encoded vector indexes
First Claim
1. A method of storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising providing a hash table comprising a plurality of entries, entries of said hash table for storing associated key values and codes, assigning code values to key values from said database, assigning a hash table entry to a key value from said database by performing a hash function upon a binary representation of said key value to produce an index into said hash table identifying a hash table entry, and storing respective key values and respective code values assigned to those key values into respective hash table entries assigned to said respective key values in said assigning step.
1 Assignment
0 Petitions
Accused Products
Abstract
A storage format and methods for improving the performance of the symbol table of an encoded vector index. The symbol table comprises a hash table, entries of the hash table storing associated key values and codes for the encoded vector index. Hash table entries store an accumulated count of occurrences of prior and the current key values, which improves the efficiency of responding to a request for a key range count. A binary radix tree is used to locate entries, which comprises a plurality of nodes, corresponding to binary digits of a binary representation of a key value. Codes are assigned to key values for the encoded vector index in a distributed fashion, so there are available code values between existing code values in the code ordering, that can be assigned to new key values, alleviating the need to reorganize the code values upon an insertion.
37 Citations
33 Claims
-
1. A method of storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
providing a hash table comprising a plurality of entries, entries of said hash table for storing associated key values and codes, assigning code values to key values from said database, assigning a hash table entry to a key value from said database by performing a hash function upon a binary representation of said key value to produce an index into said hash table identifying a hash table entry, and storing respective key values and respective code values assigned to those key values into respective hash table entries assigned to said respective key values in said assigning step.
-
5. A computer system storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
storage including a hash table comprising a plurality of entries, respective entries of said hash table storing respective key values and respective codes assigned to those key values, and computing circuitry assigning code values to key values from said database, and assigning a hash table entry to a key value from said database by performing a hash function upon a binary representation of said key value to produce an index into said hash table identifying a hash table entry.
-
6. A program product for storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
a hash table comprising a plurality of entries, respective entries of said hash table storing respective key values and respective codes assigned to those key values, relational database software assigning code values to key values from said database, and assigning a hash table entry to a key value from said database by performing a hash function upon a binary representation of said key value to produce an index into said hash table identifying a hash table entry, and a signal bearing media bearing the hash table and relational database software.
-
9. A method of identifying locations of data structures relating to key values used in an encoded vector index, comprising
providing a binary radix tree comprising a plurality of nodes, nodes of said tree corresponding to binary digits of a binary representation of a key value, branches in said tree corresponding to values of binary digits of said binary representation, a path through said tree corresponding to a binary representation of a key value ending in a terminal node containing a pointer to a hash table entry storing said key value.
-
10. A computer system identifying locations of data structures relating to key values used in an encoded vector index, comprising
storage including a binary radix tree comprising a plurality of nodes, nodes of said tree corresponding to binary digits of a binary representation of a key value, branches in said tree corresponding to values of binary digits of said binary representation, a path through said tree corresponding to a binary representation of a key value ending in a terminal node containing a pointer to a hash table entry storing said key value, and computing circuitry for scanning said binary radix tree to identify a terminal node and pointer to a hash table entry storing a key value.
-
11. A program product identifying locations of data structures relating to key values used in an encoded vector index, comprising
a binary radix tree comprising a plurality of nodes, nodes of said tree corresponding to binary digits of a binary representation of a key value, branches in said tree corresponding to values of binary digits of said binary representation, a path through said tree corresponding to a binary representation of a key value ending in a terminal node containing a pointer to a hash table entry storing said key value, relational database software for scanning said binary radix tree to identify a terminal node and pointer to a hash table entry storing a key value, and a signal bearing media bearing the binary radix tree and relational database software.
-
14. A method of assigning codes to key values used in a encoded vector index, comprising
providing an encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, assigning codes to each key value in said field of said database, said codes assigned such that when said codes are sorted according to a code ordering, the key values corresponding to said codes are also sorted according to a key value ordering, distributing said codes such that at least one pair of adjacent codes in said code ordering have nonsequential values.
-
19. A computer system assigning codes to key values used in a encoded vector index, comprising
storage including an encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, and computing circuitry assigning codes to each key value in said field of said database, said codes assigned such that when said codes are sorted according to a code ordering, the key values corresponding to said codes are also sorted according to a key value ordering, and distributing said codes such that at least one pair of adjacent codes in said code ordering have nonsequential values.
-
20. A program product for assigning codes to key values used in a encoded vector index, comprising
an encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, relational database software assigning codes to each key value in said field of said database, said codes assigned such that when said codes are sorted according to a code ordering, the key values corresponding to said codes are also sorted according to a key value ordering, and distributing said codes such that at least one pair of adjacent codes in said code ordering have nonsequential values, and a signal bearing media bearing the encoded vector index and relational database software.
-
23. A method of storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
storing in respective storage locations of said symbol table, respective key values and respective code values assigned to those key values, storing in a storage location of said symbol table, an accumulated count of occurrences in said database of a key value stored in sad storage location, and occurrences of key, values that precede, in a key value ordering, said key value stored in said storage location.
-
31. A computer system for storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
storage including a symbol table, and computing circuitry storing in respective storage locations of said symbol table, respective key values and respective code values assigned to those key values, and storing in a storage location of said symbol table, an accumulated count of occurrences in said database of a key value stored in said storage location, and occurrences of key values that precede, in a key value ordering, said key value stored in said storage location.
-
31-1. A program product for storing a symbol table for an encoded vector index, said encoded vector index comprising a plurality of entries, each entry in said encoded vector corresponding to a record of a database, each entry in said encoded vector for storing a code corresponding to a key value in a field of the corresponding record of said database, said symbol table identifying codes corresponding to key values in said field, comprising
a symbol table, relational database software storing in respective storage locations of said symbol table, respective key values and respective code values assigned to those key values, and storing in a storage location of said symbol table an accumulated count of occurrences in said database of a key value stored in said storage location, and occurrences of key values that precede, in a key value ordering, said key value stored in said storage location, and a signal bearing media bearing the binary radix tree and relational database software.
Specification