Parallel Processing of ETL Jobs Involving Extensible Markup Language Documents

  • US 20110072319A1
  • Filed: 09/24/2009
  • Published: 03/24/2011
  • Est. Priority Date: 09/24/2009
  • Status: Active Grant
First Claim

1. A method for running an Extract Transform Load (ETL) job in parallel on one or more processors wherein the ETL job comprises use of an extensible markup language (XML) document, wherein the method comprises:

  • receiving an XML document input;

    identifying a node in the XML document at which partitioning of the XML document is to begin;

    sending partition information to each respective processor;

    performing a shallow parsing of the XML document in parallel on the one or more processors, wherein each processor performs shallow parsing using the identified partition node until it reaches its identified partition;

    using the shallow parsing to generate the partition of the input XML document, wherein each processor generates a different partition of the same XML document; and

    sending each partition in streaming format to an ETL job instance.
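The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `plan_partitions`, `iter_partition`, `run_parallel`, and `etl_job` are hypothetical, partition information is assumed to be a half-open range of occurrences of the partition node, and `xml.etree.ElementTree.iterparse` only approximates shallow parsing, since it still tokenizes every element and merely discards the ones outside the worker's range. Threads stand in for processors here to keep the example self-contained.

```python
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def plan_partitions(total, workers):
    """Split `total` partition-node occurrences into contiguous (start, stop) ranges."""
    base, extra = divmod(total, workers)
    ranges, start = [], 0
    for i in range(workers):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def iter_partition(xml_bytes, partition_tag, start, stop):
    """Stream outermost `partition_tag` elements whose occurrence index is in [start, stop).

    Elements before `start` are scanned and discarded, which approximates
    "shallow parsing until the processor reaches its partition".
    """
    index = 0   # occurrence index of outermost partition nodes seen so far
    depth = 0   # nesting depth inside partition nodes
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            if elem.tag == partition_tag:
                depth += 1
            continue
        if elem.tag != partition_tag:
            continue
        depth -= 1
        if depth > 0:
            continue  # nested occurrence: it belongs to the enclosing node
        if start <= index < stop:
            elem.tail = None  # drop trailing whitespace before serializing
            yield ET.tostring(elem, encoding="unicode")
        index += 1
        elem.clear()  # release memory for nodes already handled or skipped
        if index >= stop:
            break     # nothing past this worker's partition is needed


def run_parallel(xml_bytes, partition_tag, workers, etl_job, total=None):
    """Give each worker a different partition of the same document.

    `etl_job` is a hypothetical callable that consumes a stream of serialized
    nodes and returns a result; one instance runs per partition.
    """
    if total is None:
        # One counting pass; a real system might obtain this differently.
        total = sum(1 for _ in iter_partition(xml_bytes, partition_tag, 0, float("inf")))
    ranges = plan_partitions(total, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda r: etl_job(iter_partition(xml_bytes, partition_tag, r[0], r[1])),
            ranges,
        ))


if __name__ == "__main__":
    doc = b"<orders>" + b"".join(
        b'<order id="%d"/>' % i for i in range(1, 6)
    ) + b"</orders>"
    # Each ETL job instance here simply extracts the id attribute of each node.
    extract_ids = lambda stream: [ET.fromstring(s).get("id") for s in stream]
    print(run_parallel(doc, "order", 2, extract_ids))
```

Contiguous ranges are used so that each worker can stop reading as soon as its partition ends. A round-robin assignment would also give each worker a different partition, but every worker would then have to scan the whole document.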

  • 6 Assignments