Mapping of data from XML to SQL
Abstract
A method, an apparatus and a computer program product for converting an XML encoded dataset into a minimal set of SQL tables are provided. In the method, a hierarchical structure in the XML encoded dataset is identified. A node element set for the XML encoded dataset is determined, wherein each node element in the node element set is a discrete level of the hierarchical structure of the dataset. One or more nodes of the XML encoded dataset are determined, each node being an instance of a node element. A unique node identifier is allocated to each node. Then, an SQL node table containing one or more records is generated, each record corresponding to a respective one of the allocated node identifiers. An SQL ancestry table is optionally generated to define the inter-relationships among nodes of the identified hierarchical structure of the XML encoded dataset.
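The steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the sample XML, the tag names chosen as node elements, the table names `node` and `ancestry`, and the column names are all assumptions made for illustration. The sketch walks the tree in document order, allocates an identifier to each instance of a node element, writes one node record per identifier, and records every ancestor of each node in an ancestry table.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample dataset; the tag names are illustrative only.
SAMPLE = "<act><chapter><section/><section/></chapter><chapter/></act>"


def map_xml(xml_text, node_elements, conn):
    """Allocate an id to each node-element instance and fill two tables."""
    conn.execute("CREATE TABLE node (node_id INTEGER PRIMARY KEY, element TEXT)")
    conn.execute(
        "CREATE TABLE ancestry (descendant_id INTEGER, ancestor_id INTEGER)"
    )
    next_id = 0

    def visit(elem, ancestors):
        nonlocal next_id
        if elem.tag in node_elements:
            # Allocate a unique identifier and write one node record for it.
            next_id += 1
            node_id = next_id
            conn.execute("INSERT INTO node VALUES (?, ?)", (node_id, elem.tag))
            # One ancestry row per (descendant, ancestor) pair.
            conn.executemany(
                "INSERT INTO ancestry VALUES (?, ?)",
                [(node_id, a) for a in ancestors],
            )
            ancestors = ancestors + [node_id]
        for child in elem:
            visit(child, ancestors)

    visit(ET.fromstring(xml_text), [])
    return next_id


conn = sqlite3.connect(":memory:")
count = map_xml(SAMPLE, {"act", "chapter", "section"}, conn)

# All descendants of the first chapter (node 2), found without recursion.
descendants = [
    row[0]
    for row in conn.execute(
        "SELECT descendant_id FROM ancestry WHERE ancestor_id = 2 "
        "ORDER BY descendant_id"
    )
]
```

Storing every ancestor pair, rather than only the parent, is one plausible reading of a table that "defines the inter-relationships among nodes"; it lets a single query return a whole subtree, at the cost of more rows.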
22 Claims
1. A method for mapping a dataset at least partially encoded with a markup language into one or more relational database tables, the method comprising:
identifying at least one hierarchical structure associated with said encoded dataset;
determining a node element set for said hierarchical structure, wherein said node element set comprises one or more nodes that are one or more levels of said hierarchical structure;
identifying at least one node of said encoded dataset, which node is part of said node element set;
allocating to said node of said encoded dataset a unique node identifier; and
generating a relational database table containing one or more records, said one or more records corresponding to a respective one or more of said allocated node identifiers;
wherein said encoded dataset includes one or more predefined portions of text-based data, said one or more predefined portions being at least partially encoded using a markup language, and being associated with one or more of a plurality of attributes for organizing said one or more predefined portions;
wherein said one or more predefined portions include at least one modified and stored predefined portion, said at least one modified predefined portion being associated with one or more attributes for organizing said one or more predefined portions and said modified predefined portion.
2. The method according to claim 1, wherein said encoded dataset further comprises at least one link of a markup language, at least one of said predefined portions of said text-based data and said at least one modified predefined portion being encoded with at least one link.
3. The method according to claim 1, wherein said record of said node table contains said node identifier.
4. The method according to claim 3, wherein said record of said node table further comprises a field for a property of the corresponding node.
5. The method according to claim 1, wherein said node includes at least one of a property, sub-node, initial content, and block content.
6. The method according to claim 1, further comprising:
converting the encoded dataset into one or more pre-processed content fields.
7. The method according to claim 6, further comprising:
storing said one or more pre-processed content fields in one or more content tables, said record in said content table having a node identifier field containing the corresponding node identifier, a block identifier field and a content field.
8. The method according to claim 1, further comprising:
generating an ancestry table to define the inter-relationships among said nodes of said identified hierarchical structure of said encoded dataset.
9. The method according to claim 8, wherein said ancestry table includes a descendant node identifier field and an ancestor node identifier field.
10. The method according to claim 1, further comprising:
generating a complex properties table having a node identifier field and a value field.
11. The method according to claim 10, wherein said complex properties table further includes a sub-property field.
12. A computer program product having a non-transitory computer readable medium, said non-transitory computer readable medium having a computer program recorded therein for mapping a dataset at least partially encoded with a markup language into one or more relational database tables, said computer program product comprising:
computer program code for identifying at least one hierarchical structure associated with said encoded dataset;
computer program code for determining a node element set for said hierarchical structure, wherein said node element set comprises one or more nodes that are one or more levels of said hierarchical structure;
computer program code for identifying at least one node of said encoded dataset, which node is part of said node element set;
computer program code for allocating to said node of said encoded dataset a unique node identifier; and
computer program code for generating a relational database table containing one or more records, said one or more records corresponding to a respective one or more of said allocated node identifiers;
wherein said encoded dataset includes one or more predefined portions of text-based data, said one or more predefined portions being at least partially encoded using a markup language, and being associated with one or more of a plurality of attributes for organizing said one or more predefined portions;
wherein said one or more predefined portions include at least one modified and stored predefined portion, said at least one modified predefined portion being associated with one or more attributes for organizing said one or more predefined portions and said modified predefined portion.
13. The computer program product according to claim 12, wherein said encoded dataset further comprises at least one link of a markup language, at least one of said predefined portions of said text-based data and said at least one modified predefined portion being encoded with at least one link.
14. The computer program product according to claim 12, wherein said record of said node table contains said node identifier.
15. The computer program product according to claim 14, wherein said record of said node table further comprises a field for a property of the corresponding node.
16. The computer program product according to claim 12, wherein said node includes at least one of a property, sub-node, initial content, and block content.
17. The computer program product according to claim 12, further comprising:
computer program code for converting the encoded dataset into one or more pre-processed content fields.
18. The computer program product according to claim 17, further comprising:
computer program code for storing said one or more pre-processed content fields in one or more content tables, said record in said content table having a node identifier field containing the corresponding node identifier, a block identifier field and a content field.
19. The computer program product according to claim 12, further comprising:
computer program code for generating an ancestry table to define the inter-relationships among said nodes of said identified hierarchical structure of said encoded dataset.
20. The computer program product according to claim 19, wherein said ancestry table includes a descendant node identifier field and an ancestor node identifier field.
21. The computer program product according to claim 12, further comprising:
computer program code for generating a complex properties table having a node identifier field and a value field.
22. The computer program product according to claim 21, wherein said complex properties table further includes a sub-property field.
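The content table and complex properties table recited in the dependent claims can be pictured with a small schema sketch. This is an illustration under assumptions, not the claimed implementation: the table and column names, the sample values, and the rule for splitting content into blocks (blank lines) are hypothetical, and the claims above do not say how the pre-processed content fields are produced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (node_id INTEGER PRIMARY KEY, element TEXT);

    -- One row per content block: node id, block id, content.
    CREATE TABLE content (
        node_id  INTEGER REFERENCES node(node_id),
        block_id INTEGER,
        content  TEXT,
        PRIMARY KEY (node_id, block_id)
    );

    -- One row per property value: node id, sub-property, value.
    CREATE TABLE complex_property (
        node_id      INTEGER REFERENCES node(node_id),
        sub_property TEXT,
        value        TEXT
    );
    """
)


def store_content(conn, node_id, text):
    """Store one content row per block; splitting on blank lines is a placeholder."""
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    conn.executemany(
        "INSERT INTO content VALUES (?, ?, ?)",
        [(node_id, i, b) for i, b in enumerate(blocks, start=1)],
    )
    return len(blocks)


# Hypothetical node with two content blocks and two sub-property values.
conn.execute("INSERT INTO node VALUES (1, 'section')")
stored = store_content(conn, 1, "First block.\n\nSecond block.")
conn.executemany(
    "INSERT INTO complex_property VALUES (?, ?, ?)",
    [(1, "amended_by", "Act A"), (1, "amended_on", "2001-01-01")],
)

# Reassemble a node's content in block order.
blocks = [
    row[0]
    for row in conn.execute(
        "SELECT content FROM content WHERE node_id = 1 ORDER BY block_id"
    )
]
```

Keeping multi-valued properties in their own table, keyed by node identifier, avoids adding a column to the node table for every property that can repeat.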
Specification