Efficient storage and retrieval of XML data
First Claim
1. A method comprising:
receiving, by a computing server comprising at least one programmable processor and from a computing device connected to the computing server, an extensible markup language document and a location path pointing to a unique path identifier uniquely identifying the extensible markup language document, the extensible markup language document and the location path being received separately, the extensible markup language document characterizing a tree having a root node and a plurality of child nodes;
generating, by the computing server, an in-memory first table that maps data within the extensible markup language document to one or more location paths, at least one location path of the one or more location paths being formed by combining the root node and a corresponding child node of the plurality of child nodes;
generating, by the computing server and by rearranging data within the in-memory first table, a second table comprising data arranged according to one or more location paths that are indicated by the extensible markup language document, the at least one location path of the one or more location paths pointing to a storage location for data listed under the location path in the table, the second table being specific to the unique path identifier, wherein generating the second table comprises:
retrieving the one or more location paths in the extensible markup language document and extensible markup language values associated with the possible location paths,
determining, by the computing server, the unique path identifier stored within the first table, and
serially processing, by the computing server, all entries in the first table, wherein the serially processing comprises:
receiving an entry from the first table, the entry comprising a location path and the unique path identifier,
determining that the entry does not exist in the second table,
creating, in response to the determining and from the first table, a location path and value pair for the unique path identifier, the location path and value pair comprising a location path and a corresponding value specific to the unique path identifier,
obtaining, in response to the determining indicating that the entry does exist in the second table, values associated with the location path and the unique path identifier,
determining, by the computer server, whether an entry exists in the first table having any location path of the one or more location paths and the unique path identifier,
creating, in response to determining the entry does not exist in the first table, a location path and value pair for the unique path identifier,
obtaining, in response to the determining indicating the entry does exist in the first table, a list of values associated with any location path of the one or more location paths and the unique path identifier,
determining, by the computer server, whether data stored in the in-memory data store exists in the obtained list of values,
adding, in response to the determining indicating the data does not exist in the obtained list of values, data indicated in a search query to the obtained list of values and
updating, in response to the adding, the obtained list of values associated with any location path of the one or more like location paths inside the second table, and
determining, by the computer server, whether all entries of the first table have been processed; and
storing, by the computing server and in response to determining that all entries of the first table have been processed, the second table in a data store connected to the computing server, the stored second table being searchable in a time that is independent of a total number of extensible markup language documents stored in the data store;
receiving, by the computing server and from the computing device, the search query, the search query including a specific location path of the one or more location paths and the unique path identifier;
searching, by the computing server, the second table in the data store for a data associated with the specific location path and the unique path identifier specified in the search query, the time taken to retrieve the searched data being independent of the total number of extensible markup language documents stored in the data store; and
sending, by the computing server, the searched data associated with the specific location path and the unique path identifier specified in the search query to the computing device.
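The table-building steps recited above can be illustrated with a minimal sketch. It is not the patented implementation; it assumes Python's standard xml.etree.ElementTree parser, ignores XML attributes, and models both tables as in-memory Python structures. The function names (build_first_table, find_identifier, build_second_table) and the sample order document are hypothetical and do not appear in the claim.

```python
import xml.etree.ElementTree as ET


def build_first_table(xml_text):
    """Flatten the XML tree into (location_path, value) entries.

    Each location path is formed by joining the root tag with the tags of
    the child nodes leading to a value. Attributes are ignored (assumption).
    """
    root = ET.fromstring(xml_text)
    entries = []

    def walk(node, path):
        text = (node.text or "").strip()
        if text:
            entries.append((path, text))
        for child in node:
            walk(child, path + "/" + child.tag)

    walk(root, "/" + root.tag)
    return entries


def find_identifier(first_table, id_path):
    """Return the value stored under the separately supplied location path."""
    for path, value in first_table:
        if path == id_path:
            return value
    raise KeyError(id_path)


def build_second_table(first_table, doc_id):
    """Serially process the first table into {(doc_id, path): [values]}."""
    second = {}
    for path, value in first_table:
        key = (doc_id, path)
        if key not in second:
            # No entry yet: create the location path and value pair.
            second[key] = [value]
        elif value not in second[key]:
            # Entry exists: extend its value list only with new data.
            second[key].append(value)
    return second


# Hypothetical sample document; "/order/id" is the separately received path.
SAMPLE = "<order><id>A17</id><item>pen</item><item>ink</item><item>pen</item></order>"
first = build_first_table(SAMPLE)
doc_id = find_identifier(first, "/order/id")
second = build_second_table(first, doc_id)
print(second[(doc_id, "/order/item")])  # ['pen', 'ink']
```

Keying the second table on the pair of identifier and location path is a design choice of this sketch; it keeps every value list specific to one document, as the claim requires, while letting repeated paths such as /order/item collapse into a single entry.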
Abstract
A computing server can receive, from a computing device, an extensible markup language document and a location path pointing to an identifier uniquely identifying the extensible markup language document. The computing server can rearrange data within the extensible markup language document to generate a table including data arranged according to one or more location paths indicated by the extensible markup language document. Each location path of the one or more location paths can point to a storage location for data listed under the location path. The table can be specific to the identifier uniquely identifying the extensible markup language document. The computing server can store the table in a data store connected to the computing server. The computing server can retrieve, when required, the stored data from the data store within a time independent of a total number of XML documents in the data store.
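One way to picture retrieval in a time independent of the number of stored documents is a hash-keyed store. The sketch below is an assumption for illustration only: the abstract does not say which data store or index is used, and the PathStore class, its method names, and the sample identifiers are hypothetical. It relies on the fact that a Python dict lookup takes constant time on average regardless of how many keys it holds.

```python
class PathStore:
    """Toy data store keyed by (document identifier, location path)."""

    def __init__(self):
        self._rows = {}

    def store(self, doc_id, paths_to_values):
        # Persist one document's table; each row is addressed by the
        # identifier plus the location path.
        for path, values in paths_to_values.items():
            self._rows[(doc_id, path)] = list(values)

    def search(self, doc_id, path):
        # A single hash lookup: no scan over other stored documents.
        return list(self._rows.get((doc_id, path), []))


store = PathStore()
store.store("A17", {"/order/item": ["pen", "ink"]})
store.store("B02", {"/order/item": ["paper"]})
print(store.search("A17", "/order/item"))  # ['pen', 'ink']
```

A query that names both the identifier and the location path touches exactly one row, so adding more documents to the store does not lengthen the lookup in this model.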
36 Citations
20 Claims
10. A system comprising:
a computing device having an extensible markup language document and a location path pointing to a unique path identifier uniquely identifying the extensible markup language document;
a computing server operatively coupled to the computing device, the computing server comprising at least one programmable processor configured to:
separately receive the extensible markup language document and the location path from the computing device,
generate, by rearranging data within the extensible markup language document, a second table comprising data arranged according to one or more location paths that are indicated by the extensible markup language document, each location path of the one or more location paths pointing to a storage location for data listed under the location path, the second table being specific to the unique path identifier, wherein generating the second table comprises:
retrieving the one or more location paths in the extensible markup language document and all extensible markup language values associated with the possible location paths,
determining, by the computing server, the unique path identifier stored within the first table, and
serially processing, by the computing server, all entries in the first table, wherein the serially processing comprises:
receiving an entry from the first table, the entry comprising a location path and the unique path identifier,
creating, in response to the receiving and from the first table, a location path and value pair for the unique path identifier, the location path and value pair comprising a location path and a corresponding value specific to the unique path identifier,
obtaining, in response to the creating, values associated with the location path and the unique path identifier,
creating, in response to determining the entry does not exist in the first table, a location path and value pair for the unique path identifier,
obtaining, in response to an indication the entry does exist in the first table, a list of values associated with any location path of the one or more location paths and the unique path identifier,
adding, in response to an indication the entry does not exist in the obtained list of values, data indicated in a search query to the obtained list of values and
updating, in response to the adding, the obtained list of values associated with any location path of the one or more like location paths inside the second table, and
determining, by the computer server, whether all entries of the first table have been processed; and
store, by the computing server and in response to determining that all entries of the first table have been processed, the second table in a data store connected to the computing server, the stored second table being searchable in a time that is independent of a total number of extensible markup language documents stored in the data store;
receiving, by the computing server and from the computing device, the search query, the search query including a specific location path of the one or more location paths and the unique path identifier;
searching, by the computing server, the second table in the data store for a data associated with the specific location path and the unique path identifier specified in the search query, the time taken to retrieve the searched data being independent of the total number of extensible markup language documents stored in the data store; and
sending, by the computing server, the searched data associated with the specific location path and the unique path identifier specified in the search query to the computing device.
19. A non-transitory computer program product storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising:
receiving an extensible markup language document and a location path pointing to a unique path identifier uniquely identifying the extensible markup language document, the location path being external to the extensible markup language document during the receiving of the extensible markup language document and the location path, the extensible markup language document characterizing a tree having a root node and a plurality of child nodes;
generating an in-memory first table that maps data within the extensible markup language document to one or more location paths, each location path of the one or more location paths being formed by combining the root node and a corresponding child node of the plurality of child nodes, each location path of the one or more location paths pointing to a storage location for data listed under the location path;
generating, by modifying at least one of order and position of data within the in-memory first table, a second table comprising data arranged according to one or more location paths that are indicated by the extensible markup language document, the second table being specific to the unique path identifier, wherein generating the second table comprises:
retrieving the one or more location paths in the extensible markup language document and all extensible markup language values associated with the possible location paths,
determining, by the computing server, the unique path identifier stored within the first table, and
serially processing, by the computing server, all entries in the first table, wherein the serially processing comprises:
receiving an entry from the first table, the entry comprising a location path and the unique path identifier,
determining that the entry does not exist in the second table,
creating, in response to the determining and from the first table, a location path and value pair for the unique path identifier, the location path and value pair comprising a location path and a corresponding value specific to the unique path identifier,
obtaining, in response to the determining indicating that the entry does exist in the second table, values associated with the location path and the unique path identifier,
determining, by the computer server, whether an entry exists in the first table having any location path of the one or more location paths and the unique path identifier,
creating, in response to determining the entry does not exist in the first table, a location path and value pair for the unique path identifier,
obtaining, in response to the determining indicating the entry does exist in the first table, a list of values associated with any location path of the one or more location paths and the unique path identifier,
determining, by the computer server, whether data stored in the in-memory data store exists in the obtained list of values,
adding, in response to the determining indicating the data does not exist in the obtained list of values, data indicated in a search query to the obtained list of values and
updating, in response to the adding, the obtained list of values associated with any location path of the one or more like location paths inside the second table, and
determining, by the computer server, whether all entries of the first table have been processed; and
storing, by the computing server and in response to determining that all entries of the first table have been processed, the second table in a data store connected to the computing server, the stored second table being searchable in a time that is independent of a total number of extensible markup language documents stored in the data store;
receiving, by the computing server and from the computing device, the search query, the search query including a specific location path of the one or more location paths and the unique path identifier;
searching, by the computing server, the second table in the data store for a data associated with the specific location path and the unique path identifier specified in the search query, the time taken to retrieve the searched data being independent of the total number of extensible markup language documents stored in the data store; and
sending, by the computing server, the searched data associated with the specific location path and the unique path identifier specified in the search query to the computing device.
Specification