Indexing and querying semi-structured documents
Abstract
A method for indexing a semi-structured document, the method including arranging at least one structure entity of a semi-structured document into at least one node of a context structure tree, associating a unique context identifier with any of the structure entities, creating, for any value of any of the structure entities, a context-modified value by appending a context delimiter and the context identifier to the value, and inserting the context-modified value into a free-text tree.
24 Claims
1. A method for indexing a semi-structured document, the method comprising:
arranging at least one structure entity of a semi-structured document into at least one node of a context structure tree;
associating a unique context identifier with any of said structure entities;
for any value of any of said structure entities, creating a context-modified value by appending a context delimiter and said context identifier to said value; and
inserting said context-modified value into a free-text tree.
(Dependent claims: 2, 3, 4, 5, 6)
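The indexing steps of claim 1 can be illustrated with a minimal sketch. It assumes the free-text tree is a character trie, that context identifiers are small integers written as decimal digits, and that the context delimiter is the ASCII unit separator. None of these choices comes from the claim text, and the names `Indexer`, `ContextNode`, `TextNode`, `context_id_for`, and `index_value` are hypothetical.

```python
import itertools

CONTEXT_DELIMITER = "\x1f"  # assumption: a character that never occurs in values


class ContextNode:
    """Node of the context structure tree, one per structure entity path."""

    def __init__(self, context_id):
        self.context_id = context_id
        self.children = {}


class TextNode:
    """Node of the free-text tree (modeled here as a character trie)."""

    def __init__(self):
        self.children = {}
        self.links = set()  # documents whose context-modified value ends here


class Indexer:
    def __init__(self):
        self._ids = itertools.count(1)  # source of unique context identifiers
        self.context_root = ContextNode(0)
        self.text_root = TextNode()

    def context_id_for(self, path):
        """Arrange the entities of `path` into context nodes; return the last id."""
        node = self.context_root
        for entity in path:
            if entity not in node.children:
                node.children[entity] = ContextNode(next(self._ids))
            node = node.children[entity]
        return node.context_id

    def index_value(self, path, value, doc_id):
        """Append delimiter and context id to `value`, then insert into the trie."""
        modified = f"{value}{CONTEXT_DELIMITER}{self.context_id_for(path)}"
        node = self.text_root
        for ch in modified:
            node = node.children.setdefault(ch, TextNode())
        node.links.add(doc_id)
        return modified


indexer = Indexer()
term = indexer.index_value(["book", "title"], "dune", "doc1")
```

In this sketch the same value indexed under two different paths yields two distinct trie entries, which is what lets a later query distinguish "dune" as a title from "dune" in any other context.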
7. A method for querying semi-structured document indices, the method comprising:
traversing, in a context structure index, one or more context nodes corresponding to a context path of a query until a context node corresponding to a terminus of said context path is reached;
retrieving a context identifier of said context node;
appending a context delimiter followed by said context identifier to a value of said query, thereby forming a context-modified value;
traversing, in a free-text index, one or more text nodes corresponding to said context-modified value until said traversed text nodes form said context-modified value; and
retrieving any links associated with any of said text nodes corresponding to either of said context delimiter and said context identifier node, thereby forming results of said query.
(Dependent claims: 8)
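The query steps of claim 7 can be sketched as follows. The sketch assumes nested dictionaries for both indices, a character trie for the free-text index, and a small hand-built sample index. The sample data and the names `find_context_id`, `insert`, and `query` are hypothetical and not taken from the patent.

```python
CONTEXT_DELIMITER = "\x1f"  # assumption: a character that never occurs in values


def new_text_node():
    return {"children": {}, "links": set()}


# Hypothetical sample indices: the context path book/title has identifier 2.
context_index = {
    "book": {"id": 1, "children": {"title": {"id": 2, "children": {}}}}
}
text_index = new_text_node()


def insert(root, term, doc_id):
    """Insert an already context-modified term into the free-text trie."""
    node = root
    for ch in term:
        node = node["children"].setdefault(ch, new_text_node())
    node["links"].add(doc_id)


def find_context_id(index, path):
    """Traverse context nodes until the terminus of `path`; return its id."""
    children, context_id = index, None
    for entity in path:
        node = children.get(entity)
        if node is None:
            return None
        context_id, children = node["id"], node["children"]
    return context_id


def query(index, root, path, value):
    """Return the links for `value` within the context named by `path`."""
    context_id = find_context_id(index, path)
    if context_id is None:
        return set()
    modified = f"{value}{CONTEXT_DELIMITER}{context_id}"
    node = root
    for ch in modified:
        node = node["children"].get(ch)
        if node is None:
            return set()
    return set(node["links"])


insert(text_index, "dune" + CONTEXT_DELIMITER + "2", "doc1")
insert(text_index, "dune" + CONTEXT_DELIMITER + "1", "doc2")
titles = query(context_index, text_index, ["book", "title"], "dune")
```

Here `titles` contains only `doc1`, because the entry for `doc2` carries a different context identifier even though the value is the same.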
9. A method for querying semi-structured document indices, the method comprising:
appending a context delimiter to a value of a query, thereby forming a context-modified value;
traversing, in a free-text index, one or more text nodes corresponding to said context-modified value until said traversed text nodes form said context-modified value; and
retrieving any links associated with any of said text nodes corresponding to said context delimiter, thereby forming results of said query.
(Dependent claims: 10)
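Claim 9 describes a query that ignores context: only the delimiter is appended to the value. One possible reading, sketched below, walks the trie to the delimiter node and gathers the links stored anywhere beneath it, that is, under every context identifier. This is an assumption; an implementation could equally store the links on the delimiter node itself. The names `make_node`, `insert`, `collect_links`, and `query_any_context` are hypothetical.

```python
CONTEXT_DELIMITER = "\x1f"  # assumption: a character that never occurs in values


def make_node():
    return {"children": {}, "links": set()}


def insert(root, term, doc_id):
    """Insert an already context-modified term into the free-text trie."""
    node = root
    for ch in term:
        node = node["children"].setdefault(ch, make_node())
    node["links"].add(doc_id)


def collect_links(node):
    """Gather the links of `node` and of all of its descendants."""
    links = set(node["links"])
    for child in node["children"].values():
        links |= collect_links(child)
    return links


def query_any_context(root, value):
    """Return links for `value` regardless of the context it was indexed in."""
    node = root
    for ch in value + CONTEXT_DELIMITER:
        node = node["children"].get(ch)
        if node is None:
            return set()
    return collect_links(node)


root = make_node()
insert(root, "dune" + CONTEXT_DELIMITER + "2", "doc1")   # e.g. a title
insert(root, "dune" + CONTEXT_DELIMITER + "5", "doc2")   # e.g. a keyword
insert(root, "dunes" + CONTEXT_DELIMITER + "2", "doc3")  # a different value
hits = query_any_context(root, "dune")
```

Because the delimiter follows the value directly, the longer value "dunes" does not match, while both contexts of "dune" do.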
11. A method for querying semi-structured document indices, the method comprising:
traversing, in a context structure index, one or more context nodes corresponding to a context path of a query until a context node corresponding to a terminus of said context path is reached;
retrieving a context identifier of said context node;
traversing, in a free-text index, one or more text nodes corresponding to a value of said query, wherein said value is of a context-specific wildcard query construct, until said traversed text nodes form said value; and
retrieving any links associated with any text nodes of said free-text index that descend from the terminus of said traversed value and that are at the desired context identifier, thereby forming results of said query.
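The wildcard query of claim 11 can be sketched as a prefix walk followed by a search of the subtree below the prefix. The sketch assumes a character trie, a trailing-wildcard query such as "dun*", and a context identifier that has already been retrieved from the context structure index. The names `make_node`, `insert`, and `wildcard_query` are hypothetical.

```python
CONTEXT_DELIMITER = "\x1f"  # assumption: a character that never occurs in values


def make_node():
    return {"children": {}, "links": set()}


def insert(root, term, doc_id):
    """Insert an already context-modified term into the free-text trie."""
    node = root
    for ch in term:
        node = node["children"].setdefault(ch, make_node())
    node["links"].add(doc_id)


def wildcard_query(root, prefix, context_id):
    """Return links of every value starting with `prefix` in `context_id`."""
    node = root
    for ch in prefix:  # traverse until the text nodes form the query value
        node = node["children"].get(ch)
        if node is None:
            return set()
    suffix = CONTEXT_DELIMITER + str(context_id)
    results = set()
    stack = [node]
    while stack:
        current = stack.pop()
        # Check whether this value continues with delimiter + desired context id.
        target = current
        for ch in suffix:
            target = target["children"].get(ch)
            if target is None:
                break
        else:
            results |= target["links"]
        # Descend into longer values only; context suffixes are handled above.
        stack.extend(
            child
            for ch, child in current["children"].items()
            if ch != CONTEXT_DELIMITER
        )
    return results


root = make_node()
insert(root, "dune" + CONTEXT_DELIMITER + "2", "doc1")
insert(root, "dunes" + CONTEXT_DELIMITER + "2", "doc3")
insert(root, "dune" + CONTEXT_DELIMITER + "5", "doc2")
insert(root, "dusk" + CONTEXT_DELIMITER + "2", "doc4")
matches = wildcard_query(root, "dun", 2)
```

In this example `matches` holds `doc1` and `doc3`: both values descend from "dun" and sit at context identifier 2, while `doc2` is in another context and `doc4` does not share the prefix.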
12. Apparatus for indexing a semi-structured document, comprising:
a context structure tree comprising at least one node corresponding to at least one structure entity of a semi-structured document and a unique context identifier associated with said structure entity;
a context-modified value comprising a value of said structure entity, a context delimiter, and said context identifier; and
a free-text tree into which said context-modified value is inserted.
13. A system for indexing a semi-structured document, the system comprising:
means for arranging at least one structure entity of a semi-structured document into at least one node of a context structure tree;
means for associating a unique context identifier with any of said structure entities;
means for creating a context-modified value for any value of any of said structure entities by appending a context delimiter and said context identifier to said value; and
means for inserting said context-modified value into a free-text tree.
(Dependent claims: 14, 15, 16, 17, 18)
19. A system for querying semi-structured document indices, the system comprising:
means for traversing, in a context structure index, one or more context nodes corresponding to a context path of a query until a context node corresponding to a terminus of said context path is reached;
means for retrieving a context identifier of said context node;
means for appending a context delimiter followed by said context identifier to a value of said query, thereby forming a context-modified value;
means for traversing, in a free-text index, one or more text nodes corresponding to said context-modified value until said traversed text nodes form said context-modified value; and
means for retrieving any links associated with any of said text nodes corresponding to either of said context delimiter and said context identifier node, thereby forming results of said query.
(Dependent claims: 20)
21. A system for querying semi-structured document indices, the system comprising:
means for appending a context delimiter to a value of a query, thereby forming a context-modified value;
means for traversing, in a free-text index, one or more text nodes corresponding to said context-modified value until said traversed text nodes form said context-modified value; and
means for retrieving any links associated with any of said text nodes corresponding to said context delimiter, thereby forming results of said query.
(Dependent claims: 22)
23. A system for querying semi-structured document indices, the system comprising:
means for traversing, in a context structure index, one or more context nodes corresponding to a context path of a query until a context node corresponding to a terminus of said context path is reached;
means for retrieving a context identifier of said context node;
means for traversing, in a free-text index, one or more text nodes corresponding to a value of said query, wherein said value is of a context-specific wildcard query construct, until said traversed text nodes form said value; and
means for retrieving any links associated with any text nodes of said free-text index that descend from the terminus of said traversed value and that are at the desired context identifier, thereby forming results of said query.
24. A computer program embodied on a computer-readable medium, the computer program comprising:
a first code segment operative to arrange at least one structure entity of a semi-structured document into at least one node of a context structure tree;
a second code segment operative to associate a unique context identifier with any of said structure entities;
a third code segment operative to create a context-modified value for any value of any of said structure entities by appending a context delimiter and said context identifier to said value; and
a fourth code segment operative to insert said context-modified value into a free-text tree.
Specification