SYSTEM AND METHOD FOR SEARCH, INDEX, PARSING DOCUMENT DATABASE INCLUDING SUBJECT DOCUMENT HAVING NESTED FIELDS ASSOCIATED START AND END META WORDS WHERE EACH META WORD IDENTIFY LOCATION AND NESTING LEVEL
Abstract
An indexer indexes a database of documents, and a search engine searches the database of documents. Nesting level information stored in index entries is used to identify, and match together, start and end meta words comprising fields at assorted nesting levels within a document. Based on a query specifying words to be found within fields, spatial criteria are applied to the identified meta words to determine if the specified words are found within the specified fields. A subset of the documents have nested fields, and each nested field has an associated start meta word and end meta word. Each meta word has an associated nesting level. Each document is indexed by parsing the document to determine locations within the document of words and meta words, as well as the nesting level associated with each meta word. An index is generated that has word entries, meta word entries, and generic meta word entries. The meta word entries indicate locations within the documents of an identified meta word, as well as the nesting level of the meta word. The generic meta word entries identify locations within the document of a class of meta words, including meta words at all nesting levels of the meta words within the document. For each identified location within the generic meta word entry, the generic meta word entry also includes nesting level information associated with the meta word at the identified location.
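The three kinds of index entries described above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text: documents are whitespace-separated tokens, a field named `x` is delimited by the hypothetical meta words `<x>` and `</x>`, a location is a token position, and the nesting level is the depth at which the field opens (outermost field = 1). The function name `build_index` is likewise illustrative. Under one possible reading, a meta word entry is specific to a nesting level, while a generic meta word entry covers that meta word at all levels and records the level per location.

```python
from collections import defaultdict


def build_index(text):
    """Parse one document and return (word, meta word, generic meta word) entries.

    Assumed token syntax (hypothetical): "<name>" starts a field, "</name>" ends it.
    """
    word_entries = defaultdict(list)     # word -> [location, ...]
    meta_entries = defaultdict(list)     # (meta word, level) -> [location, ...]
    generic_entries = defaultdict(list)  # meta word -> [(location, level), ...]
    depth = 0
    for loc, tok in enumerate(text.split()):
        is_end = tok.startswith("</") and tok.endswith(">")
        is_start = not is_end and tok.startswith("<") and tok.endswith(">")
        if is_start:
            depth += 1  # a start meta word opens a field one level deeper
        if is_start or is_end:
            # Start and end meta words of the same field share one level.
            meta_entries[(tok, depth)].append(loc)
            generic_entries[tok].append((loc, depth))
        else:
            word_entries[tok.lower()].append(loc)
        if is_end:
            depth -= 1  # leaving the field after recording its end meta word
    return dict(word_entries), dict(meta_entries), dict(generic_entries)


sample = ("<doc> <title> nested fields </title> "
          "<body> <title> inner </title> </body> </doc>")
words, meta, generic = build_index(sample)
```

Keying the generic entries by meta word alone lets a search find every occurrence of, say, `<title>` without knowing its depth in advance, then read the level stored next to the chosen location.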
65 Claims
1. A computer-implemented method of indexing a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the method comprising:
indexing each document containing nested fields by:
parsing the document to determine locations within the document of words and meta words in the document and to determine the nesting level associated with each meta word; and
generating an index including word entries, each word entry identifying locations within the document of an identified word;
meta word entries, each meta word entry identifying locations within the document of an identified meta word and indicating the determined nesting level associated with the meta word; and
generic meta word entries, each generic meta word entry identifying locations within the document of a class of meta words, including meta words at all nesting levels of the meta words found in the document, the generic meta word entry including, for each identified location within the generic meta word entry, information identifying the nesting level associated with the meta word at the identified location.
Dependent claims: 2, 3, 4, 5, 6
7. A computer-implemented method of searching a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the method comprising:
receiving a query that specifies one or more words to be found within a specified field within a document;
determining a start meta word and end meta word associated with the specified field;
searching an index to identify locations of the specified words and locations of a class of meta words that includes at least one of the start meta word and end meta word associated with the specified field;
applying first spatial criteria to the identified locations of the class of meta words with respect to the identified locations of the specified words to select a meta word from the class of meta words;
determining the nesting level of the selected meta word;
identifying a complementary meta word corresponding to the selected meta word;
searching the index to determine a location for the identified complementary meta word; and
applying second spatial criteria to the identified locations of the specified words and to the determined location for the identified complementary meta word to generate a result that indicates whether the specified words are found within a first field associated with the selected meta word and the identified complementary meta word.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15
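The search steps of claim 7 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: meta words use a hypothetical `<name>` / `</name>` syntax, locations are token positions, `GENERIC` maps a meta word to `(location, level)` pairs, and `META` maps `(meta word, level)` to locations. The first spatial criterion is taken to be "the start meta word precedes the word", and the second "the word precedes the matching end meta word". Trying candidates closest first is a choice of this sketch.

```python
from bisect import bisect_right

# Hand-built index for the token sequence (positions 0..10):
# <doc> <title> nested fields </title> <body> <title> inner </title> </body> </doc>
WORDS = {"nested": [2], "fields": [3], "inner": [7]}
META = {
    ("<doc>", 1): [0], ("</doc>", 1): [10],
    ("<title>", 2): [1], ("</title>", 2): [4],
    ("<body>", 2): [5], ("</body>", 2): [9],
    ("<title>", 3): [6], ("</title>", 3): [8],
}
GENERIC = {
    "<doc>": [(0, 1)], "</doc>": [(10, 1)],
    "<title>": [(1, 2), (6, 3)], "</title>": [(4, 2), (8, 3)],
    "<body>": [(5, 2)], "</body>": [(9, 2)],
}


def word_in_field(word, field, words=WORDS, meta=META, generic=GENERIC):
    """Return True if some occurrence of `word` lies inside a `field` field."""
    start_mw, end_mw = f"<{field}>", f"</{field}>"
    starts = generic.get(start_mw, [])  # class of start meta words, all levels
    for w_loc in words.get(word, []):
        # First spatial criterion: candidate start meta words precede the word.
        for s_loc, level in reversed(starts):  # closest candidate first
            if s_loc > w_loc:
                continue
            # Complementary meta word: the end meta word at the same level.
            ends = meta.get((end_mw, level), [])
            i = bisect_right(ends, s_loc)  # first matching end after the start
            # Second spatial criterion: the word precedes that end meta word.
            if i < len(ends) and w_loc < ends[i]:
                return True
    return False
```

For well-formed nesting, the first end meta word at a given level after a start meta word at that level is its partner, which is why the nesting level stored in the index is enough to pair the two without re-parsing the document.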
16. A computer-implemented method of searching a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the method comprising:
receiving a query that specifies one or more words to be found within a first specified field that is found within a second specified field within a document;
determining a first start meta word and first end meta word associated with the first specified field, and a second start meta word and second end meta word associated with the second specified field;
searching an index to identify:
locations of the specified words, locations of a first class of meta words that includes at least one of the first start meta word and first end meta word associated with the first specified field, and locations of a second class of meta words that includes at least one of the second start meta word and second end meta word associated with the second specified field;
applying first spatial criteria, determined at least in part from the received query, to the identified locations of the first and second classes of meta words and the identified locations of the specified words to select a first meta word from the first class of meta words, and a second meta word from the second class of meta words;
determining the nesting levels of the first and second selected meta words;
identifying a first and second complementary meta words, corresponding to the first and second selected meta words;
searching the index to determine a location for the first identified complementary meta word and a location for the second identified complementary meta word; and
applying second spatial criteria, determined from the received query, to the identified locations of the specified words and to the determined locations for the first and second identified complementary meta words to generate a result that indicates whether the specified words are found within a first field, associated with the first selected meta word and the first identified complementary meta word, that is found within a second field, associated with the second selected meta word and the second identified complementary meta word.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
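The field-within-a-field query of claim 16 can be sketched by finding the spans of each field that enclose the word and then testing containment between them. The names `enclosing_spans` and `word_in_nested_field` are hypothetical, as are the index shapes: `generic` maps a meta word such as `<title>` to `(location, level)` pairs and `meta` maps `(meta word, level)` to a sorted list of locations, with locations being token positions.

```python
from bisect import bisect_right


def enclosing_spans(loc, field, meta, generic):
    """Return (start, end) location pairs of `field` fields that contain `loc`."""
    spans = []
    for s_loc, level in generic.get(f"<{field}>", []):
        if s_loc >= loc:
            continue  # the start meta word must precede the word
        # Complementary end meta word: first one at the same level after s_loc.
        ends = meta.get((f"</{field}>", level), [])
        i = bisect_right(ends, s_loc)
        if i < len(ends) and loc < ends[i]:
            spans.append((s_loc, ends[i]))
    return spans


def word_in_nested_field(word, inner, outer, words, meta, generic):
    """True if `word` occurs in an `inner` field that lies inside an `outer` field."""
    for w_loc in words.get(word, []):
        for i_start, i_end in enclosing_spans(w_loc, inner, meta, generic):
            for o_start, o_end in enclosing_spans(w_loc, outer, meta, generic):
                # The outer field must strictly contain the inner field.
                if o_start < i_start and i_end < o_end:
                    return True
    return False
```

Comparing span endpoints rather than nesting levels alone keeps the check valid when the two fields are separated by intermediate levels.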
29. A computer-implemented method for searching an index of a database of documents, the index having entries, each entry including an object identifier and a location list, each object identifier including at least one of a word and a meta word, each location list including one or more locations of the at least one of a word and a meta word of each corresponding object identifier, each entry associated with a meta word also including nesting level information for the meta word, the computer-implemented method comprising:
receiving a query that specifies one or more words to be found within a specified field within a document;
determining a start meta word and end meta word associated with the specified field;
identifying a bounding meta word by selecting one of the start meta word and end meta word;
searching the index to identify a first entry that has an object identifier associated with the specified words;
searching the index to identify a second entry that has an object identifier associated with the bounding meta word;
determining a bounding location from a closest occurrence of the bounding meta word with respect to the specified words, by comparing the location list of the second entry and the location list of the first entry;
identifying nesting level information for the bounding meta word at the bounding location;
identifying a complementary meta word to the bounding meta word having corresponding nesting level information as the identified nesting level information for the bounding meta word;
searching the index to locate a third entry that has an object identifier associated with the complementary meta word;
determining a complementary location from the location list of the third entry; and
generating a result that indicates whether the specified words are within a first field, associated with the bounding meta word and the complementary meta word, by determining whether a location in the location list of the first entry falls between the bounding location and the complementary location.
Dependent claims: 30, 31, 32, 33
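The entry layout and bounding-location steps of claim 29 can be sketched as below. Everything concrete here is an assumption made for illustration: the `Entry` class, the `<name>` / `</name>` meta word syntax, token positions as locations, and the choice of the end meta word as the bounding meta word (the claim allows either the start or the end).

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Entry:
    object_id: str                              # a word or a meta word
    locations: list                             # sorted token positions
    levels: list = field(default_factory=list)  # nesting level per location (meta words only)


def in_field(index, word, field_name):
    """True if `word` falls between the bounding and complementary locations."""
    first = index.get(word)                  # first entry: the specified word
    second = index.get(f"</{field_name}>")   # second entry: bounding meta word
    third = index.get(f"<{field_name}>")     # third entry: complementary meta word
    if not (first and second and third):
        return False
    for w_loc in first.locations:
        # Bounding location: closest end meta word after the word.
        i = bisect_left(second.locations, w_loc)
        if i == len(second.locations):
            continue
        bound_loc, level = second.locations[i], second.levels[i]
        # Complementary location: last start meta word at the same nesting
        # level that precedes the bounding location.
        comp = [loc for loc, lvl in zip(third.locations, third.levels)
                if lvl == level and loc < bound_loc]
        if comp and comp[-1] < w_loc < bound_loc:
            return True
    return False
```

Because this sketch inspects only the closest occurrence of the bounding meta word, it can miss a match when a same-named field at an outer level also encloses the word; a fuller search would fall back to farther occurrences.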
34. A computer program product for use in conjunction with a computer system, the computer system for indexing a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
an indexer for indexing each document containing nested fields by configuring the computer to:
parse the document to determine locations within the document of words and meta words in the document and to determine the nesting level associated with each meta word; and
generate an index including word entries, each word entry identifying locations within the document of an identified word;
meta word entries, each meta word entry identifying locations within the document of an identified meta word and indicating the determined nesting level associated with the meta word; and
generic meta word entries, each generic meta word entry identifying locations within the document of a class of meta words, including meta words at all nesting levels of the meta words found in the document, the generic meta word entry including, for each identified location within the generic meta word entry, information identifying the nesting level associated with the meta word at the identified location.
Dependent claims: 35, 36, 37, 38
39. A computer program product for use in conjunction with a computer system, the computer system for searching a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
instructions for receiving a query that specifies one or more words to be found within a specified field within a document;
instructions for determining a start meta word and end meta word associated with the specified field;
instructions for searching an index to identify locations of the specified words and locations of a class of meta words that includes at least one of the start meta word and end meta word associated with the specified field;
instructions for applying first spatial criteria to the identified locations of the class of meta words with respect to the identified locations of the specified words to select a meta word from the class of meta words;
instructions for determining the nesting level of the selected meta word;
instructions for identifying a complementary meta word corresponding to the selected meta word;
instructions for searching the index to determine a location for the identified complementary meta word; and
instructions for applying second spatial criteria to the identified locations of the specified words and to the determined location for the identified complementary meta word to generate a result that indicates whether the specified words are found within a first field associated with the selected meta word and the identified complementary meta word.
Dependent claims: 40, 41, 42, 43, 44, 45, 46, 47
48. A computer program product for use in conjunction with a computer system, the computer system for searching a database of documents, a subset of the documents containing nested fields, each nested field having an associated start meta word and end meta word, each meta word having an associated nesting level, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
instructions for receiving a query that specifies one or more words to be found within a first specified field that is found within a second specified field within a document;
instructions for determining a first start meta word and first end meta word associated with the first specified field, and a second start meta word and second end meta word associated with the second specified field;
instructions for searching an index to identify:
locations of the specified words, locations of a first class of meta words that includes at least one of the first start meta word and first end meta word associated with the first specified field, and locations of a second class of meta words that includes at least one of the second start meta word and second end meta word associated with the second specified field;
instructions for applying first spatial criteria, determined at least in part from the received query, to the identified locations of the first and second classes of meta words and the identified locations of the specified words to select a first meta word from the first class of meta words, a second meta word from the second class of meta words;
instructions for determining the nesting levels of the first and second selected meta words;
instructions for identifying a first and second complementary meta words, corresponding to the first and second selected meta words, and searching the index to determine a location for the first identified complementary meta word and a location for the second identified complementary meta word; and
instructions for applying second spatial criteria, determined from the received query, to the identified locations of the specified words and to the determined locations for the first and second identified complementary meta words to generate a result that indicates whether the specified words are found within a first field, associated with the first selected meta word and the first identified complementary meta word, that is found within a second field, associated with the second selected meta word and the second identified complementary meta word.
Dependent claims: 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60
61. A computer program product for use in conjunction with a computer system, the computer system for searching an index of a database of documents, the index having entries, each entry including an object identifier and a location list, each object identifier including at least one of a word and a meta word, each location list including one or more locations of the at least one of a word and a meta word of each corresponding object identifier, each entry associated with a meta word also including nesting level information for the meta word, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
instructions for receiving a query that specifies one or more words to be found within a specified field within a document;
instructions for determining a start meta word and end meta word associated with the specified field;
instructions for identifying a bounding meta word by selecting one of the start meta word and end meta word;
instructions for searching the index to identify a first entry that has an object identifier associated with the specified words;
instructions for searching the index to identify a second entry that has an object identifier associated with the bounding meta word;
instructions for determining a bounding location from a closest occurrence of the bounding meta word with respect to the specified words, by comparing the location list of the second entry and the location list of the first entry;
instructions for identifying nesting level information for the bounding meta word at the bounding location;
instructions for identifying a complementary meta word to the bounding meta word having corresponding nesting level information as the identified nesting level information for the bounding meta word;
instructions for searching the index to locate a third entry that has an object identifier associated with the complementary meta word;
instructions for determining a complementary location from the location list of the third entry; and
instructions for generating a result that indicates whether the specified words are within a first field, associated with the bounding meta word and the complementary meta word, by determining whether a location in the location list of the first entry falls between the bounding location and the complementary location.
Dependent claims: 62, 63, 64, 65
Specification