Mechanism for improving performance on XML over XML data using path subsetting
First Claim
1. A computer-implemented method comprising:
- based on one or more path expressions that identify nodes that have corresponding index entries in an index, a DBMS maintaining;
the index that contains index entries for each of a set of nodes that are defined by index metadata, wherein the index metadata includes the one or more path expressions that identify nodes that have corresponding index entries in the index;
the index also including index entries for ancestor nodes of each node in the set of nodes, wherein the union of the set of nodes and all ancestor nodes comprises less than all nodes within a collection of XML documents; and
the DBMS receiving a query that includes a particular path expression;
based on the index metadata, the DBMS automatically determining at query compilation time whether the index may be used to evaluate the query;
wherein the DBMS automatically determining whether the index can be used comprises inspecting the index metadata to determine whether the nodes identified by the particular path expression are within the set of nodes indexed by the index;
in response to determining that the index may be used to evaluate the query, evaluating the query using the index; and
sending query results;
wherein the method is performed by one or more computing devices.
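The compile-time check recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes path expressions are simple absolute paths with no wildcards or predicates, and the names `PathSubsetIndex`, `covered_paths`, `can_use_index` and `evaluate` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def split_path(path: str) -> tuple[str, ...]:
    """Split a simple absolute path such as '/a/b' into ('a', 'b')."""
    return tuple(step for step in path.split("/") if step)


@dataclass
class PathSubsetIndex:
    # Index metadata: the path expressions that define the indexed nodes.
    include_paths: list[str]
    # Index entries: node path -> ids of the documents that contain it.
    entries: dict[tuple[str, ...], set[int]] = field(default_factory=dict)

    def covered_paths(self) -> set[tuple[str, ...]]:
        """Included paths plus every ancestor path of each included path."""
        covered = set()
        for path in self.include_paths:
            steps = split_path(path)
            for depth in range(1, len(steps) + 1):
                covered.add(steps[:depth])
        return covered

    def add_document(self, doc_id: int, node_paths: list[str]) -> None:
        """Create entries only for nodes whose path is covered."""
        covered = self.covered_paths()
        for path in node_paths:
            steps = split_path(path)
            if steps in covered:
                self.entries.setdefault(steps, set()).add(doc_id)

    def can_use_index(self, query_path: str) -> bool:
        """Compile-time check: consults the metadata only, not the entries."""
        return split_path(query_path) in self.covered_paths()


def evaluate(index: PathSubsetIndex,
             documents: dict[int, list[str]],
             query_path: str) -> list[int]:
    """Return ids of documents containing a node at query_path."""
    if index.can_use_index(query_path):
        return sorted(index.entries.get(split_path(query_path), set()))
    # Assumed fallback: scan every document when the index does not apply.
    return sorted(doc_id for doc_id, paths in documents.items()
                  if query_path in paths)


# Hypothetical sample data: each document is listed by its node paths.
documents = {
    1: ["/po", "/po/items", "/po/items/item", "/po/notes"],
    2: ["/po", "/po/notes"],
}
index = PathSubsetIndex(include_paths=["/po/items/item"])
for doc_id, node_paths in documents.items():
    index.add_document(doc_id, node_paths)
```

With this sample data, `/po/items/item` and its ancestors `/po/items` and `/po` are indexed, while `/po/notes` is not, so the index holds fewer entries than there are nodes and a query on `/po/notes` falls back to the scan.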
Abstract
Techniques are provided for indexing XML documents using path subsetting. According to one embodiment, a PATH table is created for storing one row for each indexed node of the XML documents, using user-defined criteria. The user-defined criteria are used to determine which nodes of the XML documents to include in the PATH table. The PATH table row for a node includes (1) information for locating the XML document that contains the node, (2) information that identifies the path of the node, and (3) information that identifies the position of the node within the hierarchical structure of the XML document that contains the node. Use of the user-defined criteria is transparent to queries and reduces the index maintenance overhead of DML operations.
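The three pieces of information in a PATH table row can be sketched as follows. This is only an illustration: the field names `doc_id`, `path` and `order_key`, the dotted order-key encoding, and the `include` predicate standing in for the user-defined criteria are all assumptions, not details taken from the abstract.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, NamedTuple


class PathRow(NamedTuple):
    doc_id: int      # (1) locates the XML document that contains the node
    path: str        # (2) identifies the path of the node
    order_key: str   # (3) position of the node within the document hierarchy


def build_path_rows(doc_id: int,
                    xml_text: str,
                    include: Callable[[str], bool]) -> List[PathRow]:
    """Emit one row per element whose path satisfies the include criteria."""
    rows: List[PathRow] = []

    def visit(elem: ET.Element, path: str, order_key: str) -> None:
        if include(path):
            rows.append(PathRow(doc_id, path, order_key))
        # Children are numbered from 1 in document order (assumed encoding).
        for position, child in enumerate(elem, start=1):
            visit(child, f"{path}/{child.tag}", f"{order_key}.{position}")

    root = ET.fromstring(xml_text)
    visit(root, f"/{root.tag}", "1")
    return rows


# Hypothetical document and criteria: index only nodes under /po/items.
sample = "<po><ref>A1</ref><items><item>x</item><item>y</item></items></po>"
rows = build_path_rows(7, sample, lambda p: p.startswith("/po/items"))
```

Here `/po` and `/po/ref` produce no rows, so inserts or updates that touch only those nodes would not require any PATH table maintenance in this sketch.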
6 Claims
1. A computer-implemented method comprising:
based on one or more path expressions that identify nodes that have corresponding index entries in an index, a DBMS maintaining;
the index that contains index entries for each of a set of nodes that are defined by index metadata, wherein the index metadata includes the one or more path expressions that identify nodes that have corresponding index entries in the index;
the index also including index entries for ancestor nodes of each node in the set of nodes, wherein the union of the set of nodes and all ancestor nodes comprises less than all nodes within a collection of XML documents; and
the DBMS receiving a query that includes a particular path expression;
based on the index metadata, the DBMS automatically determining at query compilation time whether the index may be used to evaluate the query;
wherein the DBMS automatically determining whether the index can be used comprises inspecting the index metadata to determine whether the nodes identified by the particular path expression are within the set of nodes indexed by the index;
in response to determining that the index may be used to evaluate the query, evaluating the query using the index; and
sending query results;
wherein the method is performed by one or more computing devices.
3. A computer-readable volatile or non-volatile storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform:
based on one or more path expressions that identify nodes that have corresponding index entries in an index, a DBMS maintaining;
the index that contains index entries for each of a set of nodes that are defined by index metadata, wherein the index metadata includes the one or more path expressions that identify nodes that are indexed;
the index also including index entries for ancestor nodes of each node in the set of nodes, wherein the union of the set of nodes and all ancestor nodes comprises less than all nodes within a collection of XML documents; and
the DBMS receiving a query that includes a particular path expression;
based on the index metadata, the DBMS automatically determining at query compilation time whether the index may be used to evaluate the query;
wherein the DBMS automatically determining whether the index can be used comprises inspecting the index metadata to determine whether the nodes identified by the particular path expression are within the set of nodes indexed by the index;
in response to determining that the index may be used to evaluate the query, evaluating the query using the index; and
sending query results.
5. A computer-implemented method comprising:
a DBMS maintaining;
an index that contains index entries for at least one node within a collection of XML documents, wherein the index includes entries for less than all nodes within the collection of XML documents;
wherein index metadata includes one or more path expressions that expressly identify a set of nodes within the collection of XML documents, wherein index entries corresponding to said set of nodes are excluded from the index;
the DBMS receiving a query that includes a particular path expression;
based on the index metadata, the DBMS automatically determining at query compilation time whether the index can be used to evaluate the query;
wherein the DBMS automatically determining whether the index can be used comprises inspecting the index metadata to determine whether nodes identified by the particular path expression are in said set of nodes for which corresponding index entries are excluded from the index;
in response to determining that the nodes identified by the particular path expression are in the set of nodes for which corresponding index entries are excluded from the index, evaluating the query without using the index; and
sending query results;
wherein the method is performed by one or more computing devices.
6. A computer-readable volatile or non-volatile storage medium comprising instructions, which when executed, cause one or more processors to perform:
a DBMS maintaining;
an index that contains index entries for at least one node within a collection of XML documents, wherein the index includes entries for less than all nodes within the collection of XML documents;
wherein index metadata includes one or more path expressions that expressly identify a set of nodes within the collection of XML documents, wherein index entries corresponding to said set of nodes are excluded from the index;
the DBMS receiving a query that includes a particular path expression;
based on the index metadata, the DBMS automatically determining at query compilation time whether the index can be used to evaluate the query;
wherein the DBMS automatically determining whether the index can be used comprises inspecting the index metadata to determine whether nodes identified by the particular path expression are in said set of nodes for which corresponding index entries are excluded from the index;
in response to determining that the nodes identified by the particular path expression are in the set of nodes for which corresponding index entries are excluded from the index, evaluating the query without using the index; and
sending query results.
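The exclusion-based variant in claims 5 and 6 can be sketched in the same spirit. This is a simplified illustration that assumes exact matching of simple absolute paths. The function name `choose_plan` and the plan labels are hypothetical, and choosing the index for non-excluded paths is an assumption, since these claims only state what happens when the path is excluded.

```python
from typing import List, Tuple


def split_path(path: str) -> Tuple[str, ...]:
    """Split a simple absolute path such as '/a/b' into ('a', 'b')."""
    return tuple(step for step in path.split("/") if step)


def choose_plan(excluded_paths: List[str], query_path: str) -> str:
    """Decide at compile time, from the index metadata alone, how to evaluate."""
    excluded = {split_path(path) for path in excluded_paths}
    if split_path(query_path) in excluded:
        # Nodes on this path have no index entries: evaluate without the index.
        return "scan"
    # Assumption: any path that is not expressly excluded is served by the index.
    return "index"


# Hypothetical metadata: nodes at /po/notes are excluded from the index.
plan_for_notes = choose_plan(["/po/notes"], "/po/notes")
plan_for_items = choose_plan(["/po/notes"], "/po/items/item")
```

The decision reads only the list of excluded path expressions, which is why it can be made at query compilation time before any index entry is touched.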
Specification