
Method and system for disintegrating an XML document for high degree of parallelism

  • US 9,658,992 B2
  • Filed: 05/18/2011
  • Issued: 05/23/2017
  • Est. Priority Date: 05/24/2010
  • Status: Active Grant
First Claim

1. A method for processing an Extensible Markup Language (XML) file, the method comprising:

  • receiving, by a multicore processor, an Extensible Markup Language (XML) file comprising a plurality of Extensible Markup Language (XML) notations, wherein the plurality of XML notations are arranged in a plurality of lines, wherein each XML notation of the plurality of XML notations indicates an element of a record comprising information, and wherein each XML notation comprises tags at start and end of a line in the plurality of lines;

    preprocessing, by the multicore processor, the XML file only once to create an intermediate XML file by reading the plurality of XML notations line by line in order to reorganize the XML file in a manner that each tag ends in a line, wherein related tags are organized in a line in the intermediate XML file;

    converting, by the multicore processor, the intermediate XML file into a Simple Dependency Markup Language (SDML) file by reading start tags till end tags of the plurality of XML notations in the plurality of lines, wherein each line in SDML file starts with a start tag and ends with an end tag resulting in an aggregation of record in a single line;

    iteratively checking, by the multicore processor, if more lines are available in the intermediate XML file for conversion into the SDML file, wherein upon completion of the conversion of the intermediate XML file into the SDML file, the intermediate XML file is deleted;

    splitting, by the multicore processor, the SDML file into a plurality of SDML fragments such that the plurality of SDML fragments are distributed across cores of the multicore processor in order to parse the plurality of SDML fragments parallelly; and

    combining, by the multicore processor, resultants of the parsed plurality of SDML fragments to create a Previously presented Simple Dependency Markup Language (SDML) file.
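
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`to_record_lines`, `split_fragments`, `parse_fragment`, `process`), the `record_tag` parameter, and the dictionary output are all assumptions made for illustration. The sketch collapses the preprocessing and conversion steps into a single pass that places each record on one line, assumes records of the same tag are not nested, and uses a thread pool as a stand-in for distributing fragments across cores (a process pool would be the closer analogue for CPU-bound parsing).

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def to_record_lines(xml_text, record_tag):
    """Aggregate each <record_tag>...</record_tag> element onto a single line.

    Assumption: records with this tag are not nested inside one another.
    """
    pattern = re.compile(
        r"<{0}\b.*?</{0}>".format(re.escape(record_tag)), re.DOTALL
    )
    lines = []
    for match in pattern.finditer(xml_text):
        # Drop whitespace between tags so the whole record fits on one line.
        lines.append(re.sub(r">\s+<", "><", match.group(0)).strip())
    return lines


def split_fragments(lines, n):
    """Split the record lines into at most n contiguous fragments."""
    size = max(1, -(-len(lines) // n))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def parse_fragment(fragment):
    """Parse each one-line record into a dict of child tag -> text."""
    return [
        {child.tag: child.text for child in ET.fromstring(line)}
        for line in fragment
    ]


def process(xml_text, record_tag, workers=4):
    """Run the pipeline: one record per line, split, parse in parallel, combine."""
    lines = to_record_lines(xml_text, record_tag)
    fragments = split_fragments(lines, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves fragment order, so the combined result
        # keeps the original record order.
        parsed = list(pool.map(parse_fragment, fragments))
    return [record for fragment in parsed for record in fragment]
```

Because every line holds one complete record, a fragment boundary never falls inside an element, which is what lets each fragment be parsed independently of the others.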
