EVENT-LEVEL PARALLEL METHODS AND APPARATUS FOR XML PARSING
Abstract
Embodiments of techniques and systems for parallel XML parsing are described. An event-level XML parser may include a lightweight events partitioning stage, parallel events parsing stages, and a post-processing stage. The events partitioning stage may pick out event boundaries by using single-instruction, multiple-data (SIMD) instructions to find occurrences of the “<” character, which mark event boundaries. Subsequent checking may be performed to help identify other event boundaries, as well as non-boundary instances of the “<” character. During events parsing, unresolved items, such as namespace resolution or matching of start and end elements, may be recorded in structure metadata. This structure metadata may be used during the subsequent post-processing to check whether the XML data is well-formed. If the XML data is well-formed, the individual sub-event streams formed by the events parsing processes may be assembled into a flat result event stream structure. Other embodiments may be described and claimed.
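The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `bytes.find` stands in for the SIMD scan, a thread pool stands in for per-core parsing stages, and the function names (`find_boundaries`, `partition`, `parse_chunk`, `parse_parallel`) and the `(kind, value)` event tuples are hypothetical choices made for illustration. Attributes and namespaces are ignored.

```python
import re
from concurrent.futures import ThreadPoolExecutor

def find_boundaries(data: bytes) -> list[int]:
    """Return offsets of '<' characters that start a markup event.

    bytes.find is a stand-in for the SIMD scan. The follow-up check skips
    comments and CDATA sections, whose interior '<' are not boundaries.
    """
    boundaries = []
    pos = data.find(b"<")
    while pos != -1:
        boundaries.append(pos)
        if data.startswith(b"<!--", pos):
            end = data.find(b"-->", pos + 4)
            resume = end + 3 if end != -1 else len(data)
        elif data.startswith(b"<![CDATA[", pos):
            end = data.find(b"]]>", pos + 9)
            resume = end + 3 if end != -1 else len(data)
        else:
            resume = pos + 1
        pos = data.find(b"<", resume)
    return boundaries

def partition(data: bytes, n_chunks: int) -> list[bytes]:
    """Cut data into at most n_chunks pieces, each starting at a boundary."""
    boundaries = find_boundaries(data)
    if not boundaries:
        return [data]
    step = -(-len(boundaries) // n_chunks)  # ceiling division
    cuts = boundaries[::step]
    cuts[0] = 0                             # keep any text before the first '<'
    cuts.append(len(data))
    return [data[a:b] for a, b in zip(cuts, cuts[1:])]

# Comment | CDATA section | any other tag | character data
TOKEN = re.compile(rb"<!--.*?-->|<!\[CDATA\[.*?\]\]>|<[^>]*>|[^<]+", re.S)

def parse_chunk(chunk: bytes) -> list[tuple[str, str]]:
    """Turn one chunk into a sub-event stream of (kind, value) tuples."""
    events = []
    for match in TOKEN.finditer(chunk):
        tok = match.group().decode()
        if tok.startswith("<!--"):
            events.append(("comment", tok[4:-3]))
        elif tok.startswith("<![CDATA["):
            events.append(("text", tok[9:-3]))
        elif tok.startswith("<?") or tok.startswith("<!"):
            events.append(("other", tok))   # declarations, processing instructions
        elif tok.startswith("</"):
            events.append(("end", tok[2:-1].strip()))
        elif tok.startswith("<"):
            name = tok[1:-1].rstrip("/").split()[0]
            events.append(("start", name))
            if tok.endswith("/>"):          # empty element: emit both events
                events.append(("end", name))
        elif tok.strip():                   # drop whitespace-only text
            events.append(("text", tok))
    return events

def parse_parallel(data: bytes, n_chunks: int = 4) -> list[tuple[str, str]]:
    """Partition, parse the chunks concurrently, then flatten in order."""
    chunks = partition(data, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        sub_streams = list(pool.map(parse_chunk, chunks))
    return [event for stream in sub_streams for event in stream]

doc = b"<a><b>hi</b><!-- x < y --><c/></a>"
events = parse_parallel(doc, 3)
```

Because every cut falls on an event boundary, each chunk can be tokenized without looking at its neighbors, and concatenating the sub-event streams in chunk order reproduces the stream a sequential parse would give. The "<" inside the comment is skipped during partitioning, so no chunk begins in the middle of it.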
20 Claims
1. A computer-implemented method for parsing XML data, the method comprising:
partitioning, by a computing device, the XML data into a plurality of XML chunks;
parsing, by the computing device, respective ones of the plurality of chunks in parallel into sub-event streams; and
generating, by the computing device, a result event stream for the XML data from the sub-event streams.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system comprising:
storage memory configured to store an XML document;
a computer processor having a plurality of processing cores, coupled to the storage memory;
an events partitioning module, executable by a first processor core and configured, upon execution by the first processor core, to partition the XML document into a plurality of XML chunks;
one or more events parsing modules, executable in parallel by one or more respective processor cores and configured, upon execution by the one or more respective processor cores, to perform, in parallel, events parsing on respective XML chunks to produce respective sub-event streams and structure metadata; and
a post-processing module, executable by a second processor core, to perform post-processing on the sub-event streams based on the structure metadata to produce a result event stream.
Dependent claims: 15, 16
17. One or more computer-readable storage media containing instructions which, upon execution by a computer processor comprising a plurality of cores, cause the computer processor to perform a method comprising:
partitioning an XML document into a plurality of XML chunks;
on respective cores of the computer processor, performing events parsing on respective XML chunks to produce respective sub-event streams and structure metadata; and
performing post-processing on the sub-event streams based on the structure metadata to produce a result event stream.
Dependent claims: 18, 19, 20
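The post-processing step recited in claims 14 and 17 can be sketched in Python for one kind of unresolved item, the matching of start and end elements across chunks. This is an illustrative assumption rather than the claimed implementation: the `(kind, name)` event tuples and the functions `chunk_metadata` and `post_process` are hypothetical, and the metadata is derived here from finished sub-event streams, whereas a real parser would more likely record it while parsing each chunk.

```python
def chunk_metadata(events):
    """Return (unmatched_ends, unmatched_starts) for one sub-event stream.

    An end tag seen while the local stack is empty closes an element opened
    in an earlier chunk; start tags still on the stack at the end of the
    chunk must be closed by a later chunk.
    """
    open_stack, unmatched_ends = [], []
    for kind, name in events:
        if kind == "start":
            open_stack.append(name)
        elif kind == "end":
            if not open_stack:
                unmatched_ends.append(name)
            elif open_stack.pop() != name:
                raise ValueError(f"mismatched end tag </{name}>")
    return unmatched_ends, open_stack

def post_process(sub_streams):
    """Check element nesting across chunks, then flatten the sub-streams."""
    pending = []                       # elements opened but not yet closed
    for stream in sub_streams:
        ends, starts = chunk_metadata(stream)
        for name in ends:
            if not pending or pending.pop() != name:
                raise ValueError(f"unexpected end tag </{name}>")
        pending.extend(starts)
    if pending:
        raise ValueError(f"unclosed element <{pending[-1]}>")
    # Well-formed: assemble the flat result event stream in chunk order.
    return [event for stream in sub_streams for event in stream]

sub_streams = [
    [("start", "a"), ("start", "b"), ("text", "hi")],
    [("end", "b"), ("comment", " note ")],
    [("start", "c"), ("end", "c"), ("end", "a")],
]
result = post_process(sub_streams)
```

Only the per-chunk metadata is examined during the cross-chunk check, so the sequential part of the work grows with the number of unresolved tags rather than with the size of the document.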
Specification