Hybrid binary XML storage model for efficient XML processing
First Claim
Patent Images
1. A method for storing an XML document of a plurality of documents, the method comprising steps of:
- storing in a persistent repository a persistent representation of the XML document that includes a navigable representation and a streamable representation that is separate from said navigable representation,wherein the XML document includes a tree of nodes in a hierarchical relationship, each node of the tree of nodes having an immediate hierarchical relationship with at least one other node in the tree of nodes,wherein the streamable representation contains nodes of the tree of nodes that are in document order,wherein the navigable representation contains a subset of nodes of the tree of nodes, the subset of nodes including less than all nodes of the tree of nodes, andwherein each particular node of the subset of nodes in the navigable representation includes at least one pointer to content, of said each particular node, that is contained in the streamable representation, and at least one pointer to another node of the subset of nodes in the navigable representation,said at least one pointer to the other node in the navigable representation being one of;
a pointer to a parent node of said each particular node,a pointer to a child node of said each particular node,a pointer to a sibling node of said each particular node, ora pointer to a previous sibling node of said each particular node;
wherein the steps are performed by one or more computing devices.
1 Assignment
0 Petitions
Accused Products
Abstract
A method for storing XML documents a hybrid navigation/streaming format is provided to allow efficient storage and processing of queries on the XML data that provides the benefits of both navigation and streaming and ameliorates the disadvantages of each. Each XML document to be stored is independently analyzed to determine a combination of navigable and streamable storage format that optimizes the processing of the data for anticipated access patterns.
-
Citations
38 Claims
-
1. A method for storing an XML document of a plurality of documents, the method comprising steps of:
-
storing in a persistent repository a persistent representation of the XML document that includes a navigable representation and a streamable representation that is separate from said navigable representation, wherein the XML document includes a tree of nodes in a hierarchical relationship, each node of the tree of nodes having an immediate hierarchical relationship with at least one other node in the tree of nodes, wherein the streamable representation contains nodes of the tree of nodes that are in document order, wherein the navigable representation contains a subset of nodes of the tree of nodes, the subset of nodes including less than all nodes of the tree of nodes, and wherein each particular node of the subset of nodes in the navigable representation includes at least one pointer to content, of said each particular node, that is contained in the streamable representation, and at least one pointer to another node of the subset of nodes in the navigable representation, said at least one pointer to the other node in the navigable representation being one of; a pointer to a parent node of said each particular node, a pointer to a child node of said each particular node, a pointer to a sibling node of said each particular node, or a pointer to a previous sibling node of said each particular node; wherein the steps are performed by one or more computing devices. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
-
-
20. One or more non-transitory storage media storing instructions which, when executed by one or more computing devices, cause a method for storing an XML document of a plurality of documents, the method comprising steps of:
-
storing in a persistent repository a persistent representation of the XML document that includes a navigable representation and a streamable representation that is separate from said navigable representation, wherein the XML document includes a tree of nodes in a hierarchical relationship, each node of the tree of nodes having an immediate hierarchical relationship with at least one other node in the tree of nodes, wherein the streamable representation contains nodes of the tree of nodes that are in document order, wherein the navigable representation contains a subset of nodes of the tree of nodes, the subset of nodes including less than all nodes of the tree of nodes, and wherein each particular node of the subset of nodes in the navigable representation includes at least one pointer to content, of said each particular node, that is contained in the streamable representation, and at least one pointer to another node of the subset of nodes in the navigable representation, said at least one pointer to the other node in the navigable representation being one of; a pointer to a parent node of said each particular node, a pointer to a child node of said each particular node, a pointer to a sibling node of said each particular node, or a pointer to a previous sibling node of said each particular node; wherein the steps are performed by one or more computing devices. - View Dependent Claims (21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38)
wherein, for each node of said one or more nodes, the navigable representation contains a pointer to said each node contained in the streamable representation.
-
-
29. The one or more non-transitory storage media of claim 20,
wherein one or more nodes are contained in the navigable representation and not contained in the streamable representation; - and
wherein, for each node of said one or more nodes, the navigable representation contains a pointer to a text segment stored separate from the navigable representation.
- and
-
30. The one or more non-transitory storage media of claim 20,
wherein the XML document is a first XML document that conforms to an XML schema and a second XML document conforms to said XML schema; - and
wherein for a particular XPath expression, at least one node that matches the particular XPath expression in the first XML document is contained within the navigable representation of the first XML document, and at least one node in the second XML document that matches the particular XPath expression is not contained within the navigable representation of the second XML document.
- and
-
31. The one or more non-transitory storage media of claim 20, wherein the method further comprises:
evaluating a path expression by; retrieving a pointer to a certain node in the streamable representation by traversing the navigable representation, and using the pointer to the certain node in the streamable representation to access the node in the streamable representation.
-
32. The one or more non-transitory storage media of claim 31,
wherein evaluating a path expression further comprises, after accessing the certain node in the streamable representation, retrieving a pointer from the streamable representation to another node in the navigable representation, and using said another node in the navigable representation to further traverse the navigable representation. -
33. The one or more non-transitory storage media of claim 20,
wherein the navigable representation contains, for one or more nodes, a pointer to a node in the streamable representation; - and
wherein the streamable representation contains, for one or more nodes, a pointer to a node in the navigable representation.
- and
-
34. The one or more non-transitory storage media of claim 20, wherein there is at least one node in the XML document that is contained in the streamable representation and is not contained in the navigable representation.
-
35. The one or more non-transitory storage media of claim 20, wherein a node contained in the navigable representation does not contain any pointers to nodes in the navigable representation.
-
36. The one or more non-transitory storage media of claim 35, wherein said node contains a pointer to a node in the streamable representation.
-
37. The one or more non-transitory storage media of claim 20, wherein said each particular node satisfies a criterion inclusion in the navigable representation.
-
38. The one or more non-transitory storage media of claim 37, wherein the criterion is based on one or more of the following:
-
a common query access pattern for the plurality of documents, characteristics of content of said each particular node, characteristics of a sub-tree rooted at said each particular node, or system resources of a storage system of the persistent repository on which the streamable representation is persistently stored.
-
Specification