Systems and methods for indexing each level of the inner structure of a string over a language having a vocabulary and a grammar
First Claim
1. A method for indexing at least one string over a language having a vocabulary and a grammar, comprising:
- dividing, for each string, that string into a plurality of component parts according to the grammar;
determining a bit index for each component part; and
unioning, for each at least one set of component parts, the bit indices of the component parts within that set of component parts into a bit index for that set of component parts.
3 Assignments
0 Petitions
Accused Products
Abstract
Systems and methods for indexing and searching the inner structure of a string over a language having a vocabulary and a grammar using bit vectors. The index preserves the inner gramatical structure of the string while allowing for a fast search. A single search provides immediate access to every level of a document, without having to re-search a single string to determine which sub-parts of that string match the search string. When a string is indexed, the index maintains a compositional representation and the grammatical relationship between the elements of the vocabulary according to the language. The string is then indexed on different levels by disregarding some of the grammatical relationships of component levels.
-
Citations
34 Claims
-
1. A method for indexing at least one string over a language having a vocabulary and a grammar, comprising:
-
dividing, for each string, that string into a plurality of component parts according to the grammar;
determining a bit index for each component part; and
unioning, for each at least one set of component parts, the bit indices of the component parts within that set of component parts into a bit index for that set of component parts. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 32)
-
-
18. A string indexing system that indexes at least one string over a language having a vocabulary and a grammar, comprising:
-
a dividing circuit, routine, or application that inputs the at least one string and outputs at least one component part;
an element index assigning circuit, routine, or application that assigns an index value to each component part;
an index converting circuit, routine, or application that converts each index value into a corresponding bit index; and
a unioning circuit, routine, or application that inputs two or more bit indices and outputs a bit index representing the union of the input two or more bit indices. - View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33)
-
-
34. A storage medium storing a set of program instructions executable on a data processing device and usable for indexing at least one string over a language having a vocabulary and a grammar, the set of program instructions comprising:
-
instructions for dividing, for each string, that string into a plurality of component parts according to the grammar;
instructions for determining, a bit index for each component part; and
instructions for unioning, for each at least one set of component parts, the bit indices of the component parts within that set of component parts into a bit index for that set of component parts.
-
Specification