Cost-based storage of extensible markup language (XML) data
Abstract
Extensible Markup Language (XML) data is mapped for storage in an alternative database management system (DBMS) by generating a plurality of alternative mappings in response to a supplied XML document and corresponding XML schema; evaluating at least a prescribed attribute of each of the plurality of mappings with respect to an expected workload for the storage system; and selecting the alternative mapping that, based on the prescribed attribute, is the most advantageous for the expected system workload. More specifically, applicants employ a process that utilizes a notion of physical XML Schemas, i.e., P-Schemas; a P-Schema costing procedure; a set of P-Schema rewritings; and a search strategy to heuristically determine the P-Schema with the least cost. P-Schemas extend XML Schemas to contain data statistics, and a P-Schema can be easily and uniquely mapped into a storage configuration for the target DBMS. The P-Schema costing procedure estimates the cost of evaluating the query workload on the corresponding storage configuration. The set of P-Schema rewritings, when successively applied to a P-Schema, yields a space of alternative P-Schemas. These alternative P-Schemas have the property that any XML document that is valid for the initial P-Schema is also valid for each of the alternative P-Schemas. The search strategy examines this space of alternative P-Schemas to heuristically determine the P-Schema with the least cost. The storage configuration derived from this least-cost P-Schema is the storage configuration used to store the XML data in the target DBMS.
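The search described in the abstract can be illustrated with a minimal sketch. It assumes a greedy strategy, which is only one possible heuristic, and is not taken from the specification. The `PSchema` representation (a set of inlined element names), the `WORKLOAD` numbers, and the functions `toy_cost`, `rewritings`, and `greedy_search` are all hypothetical names introduced for illustration.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical stand-in for a P-Schema: the set of element names stored
# inline in their parent's table (every other element gets its own table).
PSchema = FrozenSet[str]

# Hypothetical workload statistics: element -> (cost if outlined, cost if inlined).
WORKLOAD = {"title": (10, 1), "review": (1, 8)}


def toy_cost(schema: PSchema) -> int:
    """Assumed cost model: sum the per-element cost of the chosen layout."""
    return sum(inlined if name in schema else outlined
               for name, (outlined, inlined) in WORKLOAD.items())


def rewritings(schema: PSchema, elements: Iterable[str]) -> List[PSchema]:
    """Every schema one rewriting away: flip inline/outline for one element."""
    return [schema ^ {name} for name in elements]


def greedy_search(initial: PSchema,
                  elements: Iterable[str],
                  cost: Callable[[PSchema], int]) -> Tuple[PSchema, int]:
    """Apply the cheapest rewriting repeatedly until none lowers the cost."""
    elements = list(elements)
    current, current_cost = initial, cost(initial)
    while True:
        neighbors = rewritings(current, elements)
        if not neighbors:
            return current, current_cost
        best = min(neighbors, key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:
            return current, current_cost  # local minimum reached
        current, current_cost = best, best_cost


best_schema, best_cost = greedy_search(frozenset(), WORKLOAD, toy_cost)
```

Because a greedy search stops at the first local minimum, the result is a heuristic choice rather than a guaranteed global optimum, which matches the abstract's wording that the least-cost P-Schema is determined heuristically.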
19 Claims
1. A method of mapping extensible markup language (XML) data for storage in an alternative database management system (DBMS) comprising the steps of:
generating a plurality of alternative ones of said mappings in response to a supplied XML document and corresponding XML schema;
evaluating at least a prescribed attribute of each of said plurality of mappings with respect to an expected workload for the storage system; and
selecting one of said alternative mappings based on said prescribed attribute which is the most advantageous for the expected system workload.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of mapping extensible markup language (XML) data for storage in an alternative database management system (DBMS) comprising the steps of:
generating an initial physical-schema (P-Schema) from a supplied XML document and a corresponding XML schema;
transforming said initial P-Schema into alternative P-Schemas;
identifying each alternative storage configuration in said alternative DBMS with a unique one of said alternative P-Schemas;
translating each of the alternative P-Schemas into a storage configuration and related statistics for the alternative DBMS;
translating an XML query on the corresponding XML Schema into a query on the alternative DBMS based on the alternative DBMS storage configuration identified to the current alternative P-Schema;
selecting a most efficient alternative P-Schema corresponding to the most efficient alternative storage configuration for said alternative DBMS; and
utilizing said most efficient alternative P-Schema and its corresponding most efficient alternative storage configuration for said alternative DBMS to store XML document data in said alternative DBMS.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
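The step of translating a P-Schema into a storage configuration and related statistics might look like the following sketch. It assumes a relational target in which each P-Schema type becomes one table and each child type carries a reference column to its parent; the claims do not fix this layout. The `PType` class, its fields, and `to_storage_config` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PType:
    """Hypothetical P-Schema type: scalar columns, child types, one statistic."""
    name: str
    columns: List[str]
    children: List[str] = field(default_factory=list)  # names of child types
    cardinality: int = 0  # assumed statistic: expected number of instances


def to_storage_config(types: List[PType]) -> Tuple[List[str], Dict[str, int]]:
    """Map each type to one table (an assumption) and collect row-count statistics."""
    parent_of = {child: t.name for t in types for child in t.children}
    ddl = []
    for t in types:
        cols = [f"{t.name}_id INTEGER PRIMARY KEY"]
        if t.name in parent_of:
            # A child type references the table of its parent type.
            cols.append(f"parent_{parent_of[t.name]}_id INTEGER")
        cols += [f"{c} TEXT" for c in t.columns]
        ddl.append(f"CREATE TABLE {t.name} ({', '.join(cols)});")
    stats = {t.name: t.cardinality for t in types}
    return ddl, stats


ddl, stats = to_storage_config([
    PType("Show", ["title", "year"], children=["Review"], cardinality=1000),
    PType("Review", ["text"], cardinality=5000),
])
```

In this sketch the statistics dictionary plays the role of the "related statistics" that a cost estimator for the alternative DBMS would consume alongside the table definitions.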
Specification