Fast string searching and indexing using a search tree having a plurality of linked nodes
First Claim
1. A search tree for indexing a plurality of string entries, wherein each one of said plurality of string entries is a string of characters, comprising:
- a plurality of linked nodes consisting of a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes, wherein said each one of said plurality of inner nodes further comprises;
(a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes;
(b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes;
(c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position;
(d) a reference to at least two successor nodes; and
(e) a hash table array containing a predetermined number of hash buckets.
2 Assignments
0 Petitions
Accused Products
Abstract
A fast string indexing method efficiently stores, searches, and removes alphanumeric or binary strings utilizing a compacted search tree. The number of levels in the search tree is minimized by having a node represent more than one character when possible. Each inner node of the tree contains a hash table array for successive hashing, which also minimizes the time required to traverse a given node. Searches may be performed for partial matches, such as wild cards at the character level. Multiple indices may be opened independently and concurrently on the same table of string entries.
264 Citations
48 Claims
-
1. A search tree for indexing a plurality of string entries, wherein each one of said plurality of string entries is a string of characters, comprising:
a plurality of linked nodes consisting of a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
10. A method for searching for a certain string, consisting of a plurality of characters, among a plurality of strings, wherein each one of said plurality of strings is a string of characters, comprising the steps of:
-
creating on an R/3 system a search tree having a plurality of linked nodes consisting of a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of strings, wherein said each one of said plurality of inner nodes comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; forming on an R/3 system a query associated with said certain string; and traversing said search tree utilizing said query and providing an index to any of said plurality of strings that matches said certain string.
-
-
11. A method for indexing a plurality of string entries, wherein each one of said plurality of string entries is a string of characters, comprising the steps of:
-
creating a search tree comprising a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; and applying said search tree to said indexing of said plurality of string entries. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19)
-
-
20. A method for searching for a certain string, consisting of a plurality of characters, among a plurality of strings, wherein each one of said plurality of strings is a string of characters, comprising the steps of:
-
creating a search tree comprising a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; wherein said plurality of strings forms a plurality of string entries; forming a query associated with said certain string; and traversing said search tree utilizing said query and providing an index to any of said plurality of strings that matches said certain string. - View Dependent Claims (21, 22, 23, 24, 25)
-
-
26. A method for collision detection or collision management of requests of several users accessing a database including a first plurality of string entries, comprising maintenance of a dynamic lock table including the requests of the users, comprising the steps of:
- indexing said requests as a second plurality of string entries, wherein each one of said second plurality of string entries is a string of characters, by using a search tree, said search tree comprising;
a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said second plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; and performing a query to the lock table associated with a request of a user to the database before passing the request of the user to the database.
- indexing said requests as a second plurality of string entries, wherein each one of said second plurality of string entries is a string of characters, by using a search tree, said search tree comprising;
-
27. A computer-readable medium having stored thereon a plurality of instructions, said plurality of instructions including instructions which, when executed by a processor, cause said processor to index a plurality of string entries wherein each one of said plurality of string entries is a string of characters, by using a search tree, said search tree comprising:
a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets. - View Dependent Claims (28, 29, 30, 31, 32, 33, 34, 35)
-
36. A computer-readable medium having stored thereon a plurality of instructions, said plurality of instructions including instructions which, when executed by a processor, causes said processor to search for a certain string, consisting of a plurality of characters, among a plurality of strings, wherein each one of said plurality of strings is a string of characters, comprising the steps of:
creating a search tree comprising a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; wherein said plurality of strings forms a plurality of string entries; forming a query associated with said certain string; and traversing said search tree utilizing said query and providing an index to any of said plurality of strings that matches said certain string. - View Dependent Claims (37, 38, 39, 40, 41)
-
42. A computer-readable medium having stored thereon a plurality of instructions, said plurality of instructions including instructions which, when executed by a processor, cause said processor to detect or manage a collision of requests of several users accessing a database including a first plurality of string entries, by:
-
maintaining a dynamic lock table including requests of the users; indexing said requests as a second plurality of string entries, wherein each one of said second plurality of string entries is a string of characters, by using a search tree, said search tree comprising; a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said second plurality of string entries, wherein said each one of said plurality of inner nodes further comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; and performing a query to the lock table associated with a request of a user to the database before passing the request of the user to the database.
-
-
43. A system for searching a certain string, consisting of a plurality of characters, among a plurality of strings, wherein each one of said plurality of strings is a string of characters, comprising:
-
means for creating a search tree having a plurality of linked nodes comprising a root node, a plurality of inner nodes wherein each one of said plurality of inner nodes is associated with a character or substring of characters, and a plurality of leaf nodes associated with said plurality of strings, wherein said each one of said plurality of inner nodes comprises; (a) a reference to a parent node, wherein said parent node is either said root node or another of said plurality of inner nodes; (b) a first data field containing a character comparison position indicating the number of characters in said character or substring of characters associated with said one of said plurality of inner nodes; (c) a second data field containing a comparison character, said comparison character used to determine whether said character or substring of characters associated with said one of said plurality of inner nodes is contained in a string entry at a character position of said string entry associated with said character comparison position; (d) a reference to at least two successor nodes; and (e) a hash table array containing a predetermined number of hash buckets; means for submitting a query associated with said certain string; and tree traversal means for providing an index to any of said plurality of strings that matches said certain string. - View Dependent Claims (44, 45, 46, 47, 48)
-
Specification