Indexing Strategy With Improved DML Performance and Space Usage for Node-Aware Full-Text Search Over XML
First Claim
1. A computer-implemented method comprising:
- storing a table that stores data for a plurality of nodes in a collection of XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
an order key that specifies a hierarchical position of the node within the XML document;
path data that specifies a path, through the structure of the XML document, to the node;
for a simple node, an atomized value of the node;
for a complex node, a null value;
for a mixed content node, storing a separate entry for a descendant text node of the mixed content node, the entry comprising the text value of the descendant text node; and
storing a full-text index of the values stored in the entries of the table.
Abstract
Techniques are provided for searching within a collection of XML documents. A relational table stores an entry for each node of a set of nodes in a collection of XML documents. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. Instead of storing the atomized value in a full-text index, a virtual column can be created to represent, for each node, the atomized value of the node. Alternately, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.
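The second layout described above (null values for complex nodes, separate entries for descendant text nodes) can be illustrated with a minimal Python sketch. It is not the patented implementation. The dotted order-key format, the `text()` path suffix, the whitespace trimming, and the names `Entry` and `shred` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    order_key: str         # hierarchical position, e.g. "1.2.1" (assumed format)
    path: str              # path through the document structure to the node
    value: Optional[str]   # atomized value, text value, or None for null


def shred(elem: ET.Element, order_key: str = "1", parent_path: str = "") -> List[Entry]:
    """Produce one entry per node, in document order."""
    path = f"{parent_path}/{elem.tag}"
    children = list(elem)
    if not children:
        # Simple node: store its atomized value.
        return [Entry(order_key, path, (elem.text or "").strip())]

    # Complex or mixed content node: its own entry carries a null value.
    rows = [Entry(order_key, path, None)]
    position = 0

    def add_text(text: Optional[str]) -> None:
        # A non-empty text node under a mixed content node gets its own entry.
        nonlocal position
        if text and text.strip():
            position += 1
            rows.append(Entry(f"{order_key}.{position}", f"{path}/text()", text.strip()))

    add_text(elem.text)
    for child in children:
        position += 1
        rows.extend(shred(child, f"{order_key}.{position}", path))
        add_text(child.tail)
    return rows


doc = ET.fromstring(
    "<doc><title>XML search</title><para>Full-text <b>index</b> design</para></doc>"
)
for row in shred(doc):
    print(row)
```

In this sketch only the simple nodes and the text nodes carry a value, so each piece of text is stored once. A full-text index built over the value column therefore does not index the same words again for every ancestor.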
17 Claims
1. A computer-implemented method comprising:
storing a table that stores data for a plurality of nodes in a collection of XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
an order key that specifies a hierarchical position of the node within the XML document;
path data that specifies a path, through the structure of the XML document, to the node;
for a simple node, an atomized value of the node;
for a complex node, a null value;
for a mixed content node, storing a separate entry for a descendant text node of the mixed content node, the entry comprising the text value of the descendant text node; and
storing a full-text index of the values stored in the entries of the table.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 17.
9. A computer-implemented method comprising:
receiving a query that requests data from a collection of XML documents;
wherein said collection of XML documents comprises a plurality of nodes;
wherein a certain index comprises a table that comprises a plurality of rows, each row of said plurality of rows corresponding to a node of said plurality of nodes;
wherein each row of said plurality of rows includes:
an order key that indicates a hierarchical position of the node within the XML document, said order key being stored in an order key column of said table; and
path data that specifies a path, through the structure of the XML document, to the node, said path identifier being stored in a path identifier column of said table;
for a simple node, an atomized value of the node, said atomized value being contained in a value column of said table;
for a complex node, a null value, said null value being contained in the value column of said table;
for a mixed content node, storing a separate entry for a descendant text node of the mixed content node, the entry storing the text value of the descendant text node in the value column of said table; and
wherein said certain index of said XML collection comprises a full-text index, said value column being indexed by said full-text index;
rewriting said query to produce a rewritten query, wherein said rewritten query causes using said full-text index to compute said rewritten query.
Dependent claims: 10, 11, 12.
13. A computer-implemented method comprising:
receiving a query that requests data from a collection of XML documents;
wherein said collection of XML documents comprises a plurality of nodes;
wherein a certain index comprises a table that stores data for a plurality of nodes in a collection of XML documents, the table comprising a value column, the value column comprising:
for a simple node, an atomized value of the node;
for a complex node, a null value;
for a mixed content node, the table storing a separate entry for a descendant text node of the mixed content node, the separate entry storing a text value of the descendant text node in the value column; and
wherein said value column is indexed by a full-text index;
rewriting said query to produce a rewritten query, wherein said rewritten query causes using said full-text index to compute said rewritten query.
Dependent claims: 14, 15, 16.
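How a rewritten query might use the full-text index can be sketched in Python under stated assumptions. This is not the rewrite claimed in the patent. The sample rows, the dotted order keys, the toy inverted index standing in for a full-text index, and the names `build_index` and `contains_text` are all hypothetical. The sketch answers "which nodes at a given path contain a given word" by probing the index over the value column first, then using order keys to relate each hit to its ancestors.

```python
from typing import Dict, List, Optional, Set, Tuple

# (order_key, path, value); value is None for complex nodes (assumed sample data).
Row = Tuple[str, str, Optional[str]]

TABLE: List[Row] = [
    ("1", "/doc", None),
    ("1.1", "/doc/title", "XML search"),
    ("1.2", "/doc/para", None),
    ("1.2.1", "/doc/para/text()", "Full-text"),
    ("1.2.2", "/doc/para/b", "index"),
    ("1.2.3", "/doc/para/text()", "design"),
]


def build_index(table: List[Row]) -> Dict[str, Set[str]]:
    """Toy stand-in for a full-text index: token -> order keys of matching rows."""
    index: Dict[str, Set[str]] = {}
    for order_key, _path, value in table:
        if value is None:
            continue  # null values contribute nothing to the index
        for token in value.lower().split():
            index.setdefault(token, set()).add(order_key)
    return index


def is_self_or_descendant(key: str, ancestor: str) -> bool:
    # With dotted order keys, a descendant's key extends its ancestor's key.
    return key == ancestor or key.startswith(ancestor + ".")


def contains_text(table: List[Row], index: Dict[str, Set[str]],
                  path: str, word: str) -> List[str]:
    """Order keys of nodes at `path` whose own or descendant text contains `word`."""
    hits = index.get(word.lower(), set())  # index probe replaces a scan of all values
    return [
        order_key
        for order_key, row_path, _value in table
        if row_path == path and any(is_self_or_descendant(h, order_key) for h in hits)
    ]


INDEX = build_index(TABLE)
print(contains_text(TABLE, INDEX, "/doc/para", "index"))  # ['1.2']
```

The word "index" is stored only in the entry for the `b` element, yet the enclosing `para` node is still found, because the order key of the hit extends the order key of the `para` entry.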
Specification