Indexing Mechanism for Efficient Node-Aware Full-Text Search Over XML
Abstract
Techniques are provided for searching within a collection of XML documents. A relational table in an XML index stores an entry for each node of a set of nodes in the collection. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. An index on the atomized value provides a mechanism to perform a node-aware full-text search. Instead of storing the atomized value in the table, a virtual column may be created to represent, for each node, the atomized value of the node. Alternatively, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.
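
The two table layouts described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the Dewey-style order keys ("1.2.1"), the use of plain path strings in place of path identifiers, and the "/text()" path suffix for text-node entries are all assumptions made for illustration. The atomized value of an element is approximated here as the concatenation of its descendant text.

```python
import xml.etree.ElementTree as ET


def atomized_value(elem):
    """Approximate the atomized value as all descendant text, concatenated."""
    return "".join(elem.itertext())


def build_entries(xml_text, null_for_complex=False):
    """Return one (order_key, path, value) entry per node.

    null_for_complex=False: every element stores its full atomized value.
    null_for_complex=True: simple elements store their value, complex
    elements store None, and each non-blank text node directly under a
    complex element gets its own entry.
    """
    entries = []

    def visit(elem, order_key, path):
        children = list(elem)
        is_complex = bool(children)
        if null_for_complex and is_complex:
            entries.append((order_key, path, None))
        else:
            entries.append((order_key, path, atomized_value(elem)))

        pos = 0
        # Leading text of a complex node becomes a separate entry.
        if null_for_complex and is_complex and elem.text and elem.text.strip():
            pos += 1
            entries.append((f"{order_key}.{pos}", path + "/text()", elem.text.strip()))
        for child in children:
            pos += 1
            visit(child, f"{order_key}.{pos}", f"{path}/{child.tag}")
            # Text that follows a child element also belongs to the parent.
            if null_for_complex and child.tail and child.tail.strip():
                pos += 1
                entries.append((f"{order_key}.{pos}", path + "/text()", child.tail.strip()))

    root = ET.fromstring(xml_text)
    visit(root, "1", "/" + root.tag)
    return entries


DOC = (
    "<book><title>XML Indexing</title>"
    "<para>Full-text <b>search</b> over nodes</para></book>"
)

if __name__ == "__main__":
    for entry in build_entries(DOC):
        print(entry)
    for entry in build_entries(DOC, null_for_complex=True):
        print(entry)
```

In the first layout the value of `/book/para` repeats the text of its descendant `b`; in the second layout that repetition is avoided because the complex node stores a null and its text nodes carry the content in their own entries.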
136 Citations
11 Claims
1. A computer-implemented method comprising:
- storing a table that stores data for a plurality of nodes in a collection of XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
path data that specifies a path, through the structure of the XML document, to the node; and
an atomized value of the node;
- storing a full-text index of the atomized values stored in the entries of the table.
Dependent claims: 2, 3, 4, 5
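
As a rough illustration of why a full-text index over per-node atomized values makes a search node-aware, the following Python sketch builds a small inverted index over hypothetical table rows and restricts matches to a given path. The row contents, the tokenizer, and the function names are assumptions for illustration only, not details taken from the patent.

```python
import re
from collections import defaultdict

# Hypothetical table rows: (row_id, path, atomized_value).
ROWS = [
    (0, "/book", "XML Indexing Full-text search over nodes"),
    (1, "/book/title", "XML Indexing"),
    (2, "/book/para", "Full-text search over nodes"),
    (3, "/article/title", "Search engines"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_fulltext_index(rows):
    """Map each token to the set of row ids whose value contains it."""
    index = defaultdict(set)
    for row_id, _path, value in rows:
        for token in tokenize(value):
            index[token].add(row_id)
    return index


def search(rows, index, path, words):
    """Return ids of rows at `path` whose value contains every word."""
    tokens = tokenize(words)
    if not tokens:
        return []
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    path_of = {row_id: p for row_id, p, _value in rows}
    return sorted(r for r in candidates if path_of[r] == path)


INDEX = build_fulltext_index(ROWS)

if __name__ == "__main__":
    print(search(ROWS, INDEX, "/book/para", "search"))
    print(search(ROWS, INDEX, "/book/title", "search"))
```

The second call returns no rows even though the book as a whole contains the word, because the match is evaluated against the value of each node rather than against the whole document.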
6. A computer-implemented method comprising:
- receiving a query that requests data from a collection of XML documents;
- wherein said collection of XML documents comprises a plurality of nodes;
- wherein a certain index comprises a table that comprises a plurality of rows, each row of said plurality of rows corresponding to a node of said plurality of nodes;
- wherein each row of said plurality of rows includes:
path data that specifies a path, through the structure of the XML document, to the node, said path identifier being stored in a path data column of said table, and
an atomized value of the node, said atomized value being contained in a virtual column of said table;
- wherein said certain index of said collection of XML documents comprises a full-text index, said virtual column being indexed by said full-text index;
- rewriting said query to produce a rewritten query, wherein said rewritten query causes using said full-text index to compute said rewritten query.
Dependent claims: 7, 8
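
The combination of a virtual column and query rewriting can be sketched in Python as follows. This is only an illustration under stated assumptions: the table stores a path and a reference to the element but no value, the "virtual column" is a function evaluated on demand, and the query syntax accepted by `rewrite` is a toy XPath-like form invented here. A real system would more likely accept XQuery Full Text or a vendor-specific operator and emit SQL.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = ET.fromstring(
    "<book><title>XML Indexing</title>"
    "<para>Full-text <b>search</b> over nodes</para></book>"
)

# Stored columns only: (path, element reference). No value is stored.
TABLE = [
    ("/book", DOC),
    ("/book/title", DOC[0]),
    ("/book/para", DOC[1]),
    ("/book/para/b", DOC[1][0]),
]


def virtual_value(row):
    """Virtual column: the atomized value is computed when it is needed."""
    _path, elem = row
    return "".join(elem.itertext())


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


# Full-text index over the virtual column: token -> row positions.
FT_INDEX = defaultdict(set)
for pos, row in enumerate(TABLE):
    for token in tokenize(virtual_value(row)):
        FT_INDEX[token].add(pos)

# Toy query form (an assumption): /a/b[contains(., "word")]
QUERY_RE = re.compile(r'^(/[\w/]+)\[contains\(\.,\s*"(\w+)"\)\]$')


def rewrite(query):
    """Rewrite a path-plus-keyword query into a full-text index lookup plan."""
    match = QUERY_RE.match(query)
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    path, word = match.groups()
    return {"op": "fulltext_lookup", "path": path, "token": word.lower()}


def execute(plan):
    """Answer the rewritten query from the index, then filter by path."""
    hits = FT_INDEX.get(plan["token"], set())
    return [
        virtual_value(TABLE[p])
        for p in sorted(hits)
        if TABLE[p][0] == plan["path"]
    ]


if __name__ == "__main__":
    plan = rewrite('/book/para[contains(., "search")]')
    print(plan)
    print(execute(plan))
```

The design point the sketch tries to capture is that the original query never scans the documents: after rewriting, the keyword test is answered by the index over the virtual column and only the path test is applied to the candidate rows.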
9. A computer-implemented method comprising:
- receiving a query that requests data from a collection of XML documents;
- wherein said collection of XML documents comprises a plurality of nodes;
- wherein a certain index comprises a table that stores data for a plurality of nodes in a collection of XML documents, the table comprising a virtual column that contains, for each node, an atomized value of the node;
- wherein said virtual column is indexed by a full-text index;
- rewriting said query to produce a rewritten query, wherein said rewritten query causes using said full-text index to compute said rewritten query.
Dependent claims: 10, 11
Specification