Indexing strategy with improved DML performance and space usage for node-aware full-text search over XML
First Claim
1. A computer-implemented method comprising:
- storing a table that stores data for a plurality of nodes in one or more XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
an order key that specifies a hierarchical position of the node within the one or more XML documents;
an indication of a name of the node;
wherein the table comprises at least a first entry for a first node and a second entry for a second node, wherein the first node has a first node text value but does not have any descendant nodes, and wherein the second node has a second node text value and has one or more descendant nodes;
wherein the first entry for the first node comprises the first node text value;
wherein the second entry for the second node comprises a null node text value;
the table further comprising a third entry comprising the second node text value; and
storing an index of the node text values stored in the entries of the table;
wherein the method is performed by one or more computing devices.
Abstract
Techniques are provided for searching within a collection of XML documents. A relational table stores an entry for each node of a set of nodes in a collection of XML documents. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. Instead of storing the atomized value in a full-text index, a virtual column can be created to represent, for each node, the atomized value of the node. Alternately, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.
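The second storage layout described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `shred`, the dotted order-key format (`1.2.1`), and the use of a path string in place of a path identifier are all assumptions made for illustration. Simple nodes keep their text in their own entry, complex nodes get a null value, and each piece of text directly under a complex node gets a separate entry.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return (order_key, path, value) tuples, one per node.

    Hypothetical layout: order keys are dotted positions such as "1.2.1",
    and the path string stands in for a path identifier.
    """
    rows = []

    def visit(elem, order_key, path):
        children = list(elem)
        if not children:
            # Simple node: its text value is stored in its own entry.
            rows.append((order_key, path, (elem.text or "").strip() or None))
            return

        # Complex node: its own entry carries a null value.
        rows.append((order_key, path, None))
        position = 0

        def add_text(text):
            # Text directly under a complex node gets a separate entry.
            nonlocal position
            if text and text.strip():
                position += 1
                rows.append(
                    (f"{order_key}.{position}", path + "/text()", text.strip())
                )

        add_text(elem.text)
        for child in children:
            position += 1
            visit(child, f"{order_key}.{position}", f"{path}/{child.tag}")
            add_text(child.tail)

    root = ET.fromstring(xml_text)
    visit(root, "1", "/" + root.tag)
    return rows


SAMPLE = "<book><title>XML Search</title><para>See <b>indexes</b> now</para></book>"

if __name__ == "__main__":
    for row in shred(SAMPLE):
        print(row)
```

In this sketch the `para` element is a complex node, so its entry holds `None`, while its own text fragments appear as separate `text()` entries that a text index could cover without duplicating the text of descendant elements.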
20 Claims
1. A computer-implemented method comprising:
- storing a table that stores data for a plurality of nodes in one or more XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
an order key that specifies a hierarchical position of the node within the one or more XML documents;
an indication of a name of the node;
wherein the table comprises at least a first entry for a first node and a second entry for a second node, wherein the first node has a first node text value but does not have any descendant nodes, and wherein the second node has a second node text value and has one or more descendant nodes;
wherein the first entry for the first node comprises the first node text value;
wherein the second entry for the second node comprises a null node text value;
the table further comprising a third entry comprising the second node text value; and
storing an index of the node text values stored in the entries of the table;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method comprising:
- receiving a query for specified text in one or more XML documents;
wherein said one or more XML documents comprise a plurality of nodes;
in response to receiving the query:
determining that the one or more XML documents are stored in a table comprising a plurality of rows, each row of said plurality of rows corresponding to a node of said plurality of nodes;
wherein each row of said plurality of rows includes:
an order key that indicates a hierarchical position of the node within the one or more XML documents, said order key being stored in a first column of said table; and
an indication of a name of the node, said indication stored in a second column of said table;
wherein the table comprises at least a first row for a first node and a second row for a second node, wherein the first node has a first node text value but does not have any descendant nodes, and wherein the second node has a second node text value and has one or more descendant nodes;
wherein the first row for the first node comprises the first node text value;
wherein the second row for the second node comprises a null node text value;
the table further comprising a third row comprising the second node text value; and
searching the node text values stored in the rows of the table for the specified text to determine at least one of a particular order key or a particular name of a particular node that contains the specified text;
wherein the method is performed by one or more computing devices.
Dependent claims: 9, 10
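The query-side method of claim 8 can be sketched in Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the table name `node_table`, the column names, the sample rows, and the helpers `build_table` and `find_nodes` are hypothetical, and a plain case-insensitive substring match stands in for whatever full-text index a real system would use.

```python
import sqlite3

# Hypothetical sample rows: (order_key, name, value).
# The complex nodes "book" and "para" carry NULL values; the text that sits
# directly under "para" is stored in separate "text()" rows.
ROWS = [
    ("1", "book", None),
    ("1.1", "title", "XML Search"),
    ("1.2", "para", None),
    ("1.2.1", "text()", "See"),
    ("1.2.2", "b", "indexes"),
    ("1.2.3", "text()", "now"),
]


def build_table(rows):
    """Create an in-memory table with one row per node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_table ("
        " order_key TEXT PRIMARY KEY,"  # first column: hierarchical position
        " name TEXT NOT NULL,"          # second column: node name
        " value TEXT)"                  # node text value, NULL for complex nodes
    )
    conn.executemany("INSERT INTO node_table VALUES (?, ?, ?)", rows)
    return conn


def find_nodes(conn, text):
    """Return (order_key, name) for every row whose value contains text."""
    cursor = conn.execute(
        "SELECT order_key, name FROM node_table"
        " WHERE value IS NOT NULL"
        " AND instr(lower(value), lower(?)) > 0"
        # Lexicographic ordering is adequate only for this small sample.
        " ORDER BY order_key",
        (text,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_table(ROWS)
    print(find_nodes(conn, "indexes"))
```

Because complex nodes hold NULL, a match is reported against the row that actually stores the text, and the returned order key identifies where that node sits in the document hierarchy.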
11. A volatile or non-volatile computer-readable storage medium storing one or more sequences of instruction, wherein execution of the one or more sequences of instruction by one or more processors causes the one or more processors to perform:
- storing a table that stores data for a plurality of nodes in one or more XML documents, the table comprising an entry for each node of the plurality of nodes, the entry of each node comprising:
an order key that specifies a hierarchical position of the node within the one or more XML documents;
an indication of a name of the node;
wherein the table comprises at least a first entry for a first node and a second entry for a second node, wherein the first node has a first node text value but does not have any descendant nodes, and wherein the second node has a second node text value and has one or more descendant nodes;
wherein the first entry for the first node comprises the first node text value;
wherein the second entry for the second node comprises a null node text value;
the table further comprising a third entry comprising the second node text value; and
storing an index of the node text values stored in the entries of the table.
Dependent claims: 12, 13, 14, 15, 16, 17
18. A volatile or non-volatile computer-readable storage medium storing one or more sequences of instruction, wherein execution of the one or more sequences of instruction by one or more processors causes the one or more processors to perform:
- receiving a query for specified text in one or more XML documents;
wherein said one or more XML documents comprise a plurality of nodes;
in response to receiving the query:
determining that the one or more XML documents are stored in a table comprising a plurality of rows, each row of said plurality of rows corresponding to a node of said plurality of nodes;
wherein each row of said plurality of rows includes:
an order key that indicates a hierarchical position of the node within the one or more XML documents, said order key being stored in a first column of said table; and
an indication of a name of the node, said indication stored in a second column of said table;
wherein the table comprises at least a first row for a first node and a second row for a second node, wherein the first node has a first node text value but does not have any descendant nodes, and wherein the second node has a second node text value and has one or more descendant nodes;
wherein the first row for the first node comprises the first node text value;
wherein the second row for the second node comprises a null node text value;
the table further comprising a third row comprising the second node text value; and
searching the node text values stored in the rows of the table for the specified text to determine at least one of a particular order key or a particular name of a particular node that contains the specified text;
wherein the method is performed by one or more computing devices.
Dependent claims: 19, 20
Specification