Index Structure for Supporting Structural XML Queries
First Claim
1. A method of generating a virtual suffix tree (ViST) structure for searching XML documents, comprising:
- receiving one or more XML documents;
converting the one or more XML documents into one or more structure-encoded sequences; and
generating the ViST structure comprising;
generating a D-Ancestor index;
generating an S-Ancestor index; and
generating a doc-ID index.
0 Assignments
0 Petitions
Accused Products
Abstract
The present invention provides a ViST (or “virtual suffix tree”), which is a novel index structure for searching XML documents. By representing both XML documents and XML queries in structure-encoded sequences, it is shown that querying XML data is equivalent to finding (non-contiguous) subsequence matches. A variety of XML queries, including those with branches, or wild-cards (‘*’ and ‘//’), can be expressed by structure-encoded sequences. Unlike index methods that disassemble a query into multiple sub-queries, and then join the results of these sub-queries to provide the final answers, ViST uses tree structures as the basic unit of query to avoid expensive join operations. Furthermore, ViST provides a unified index on both content and structure of the XML documents, hence it has a performance advantage over methods indexing either just content or structure. ViST supports dynamic index update, and it relies solely on B+Trees without using any specialized data structures that are not well supported by common database management systems (hereinafter referred to as “DBMSs”).
-
Citations
13 Claims
-
1. A method of generating a virtual suffix tree (ViST) structure for searching XML documents, comprising:
-
receiving one or more XML documents;
converting the one or more XML documents into one or more structure-encoded sequences; and
generating the ViST structure comprising;
generating a D-Ancestor index;
generating an S-Ancestor index; and
generating a doc-ID index. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method of answering an XML query, comprising:
-
receiving an XML query;
transforming the XML query into a structure-encoded sequence;
searching a virtual suffix tree (ViST) structure using the structure-encoded sequence and returning one or more document IDs. - View Dependent Claims (7)
-
-
8. A method of dynamically updating a virtual suffix tree (ViST) structure, comprising
receiving a new XML document; -
transforming the XML document into a structure-encoded sequence;
inserting each element of the sequence into D-Ancestor B+Tree;
assigning a new label if the step of inserting creates a new node; and
inserting the new label into the S-Ancestor B+Tree. - View Dependent Claims (9, 10)
-
-
11. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method of generating a virtual suffix tree (ViST) structure for searching XML documents, comprising the steps of:
-
receiving one or more XML documents;
converting the one or more XML documents into one or more structure-encoded sequences;
generating the ViST structure comprising;
generating a D-Ancestor index;
generating an S-Ancestor index; and
generating a doc-ID index.
-
-
12. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method answering an XML query, comprising the steps of:
-
receiving an XML query;
transforming the XML query into a structure-encoded sequence;
searching a virtual suffix tree (ViST) structure using the structure-encoded sequence and returning one or more document IDs.
-
-
13. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method of dynamically updating the virtual suffix tree (ViST) structure, comprising the steps of:
-
receiving a new XML document transforming the XML document into a structure-encoded sequence inserting each element of the sequence into D-Ancestor B+Tree;
assigning a new label if the step of inserting creates a new node; and
inserting the new label into the S-Ancestor B+Tree.
-
Specification