
Efficient data structures for parsing and analyzing a document

  • US 8,438,472 B2
  • Filed: 06/07/2009
  • Issued: 05/07/2013
  • Est. Priority Date: 01/02/2009
  • Status: Active Grant
First Claim

1. A non-transitory machine readable medium storing a program for execution by at least one processing unit, the program comprising sets of instructions for:

    parsing an unstructured document comprising a plurality of primitive elements, wherein the plurality of primitive elements comprises a plurality of glyphs;

    storing the primitive elements in a random order in a first array;

    storing, in a second array, references to the stored primitive elements, the references ordered in the second array based on locations of the primitive elements in the document, wherein each of the references refers to a single primitive element in the first array;

    receiving instructions to perform a document reconstruction operation that associates a portion of the primitive elements into a structural element in order to generate a structured document from the unstructured document; and

    performing the received instructions to define the structural element using the references stored in the second array without storing any new references to the primitive elements.
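
The two-array arrangement recited in the claim can be illustrated with a short sketch. This is not taken from the patent specification; it is a minimal Python example under stated assumptions: the names `Glyph`, `StructuralElement`, `build_reference_array`, `group_lines`, and `text_of` are hypothetical, glyph positions are simplified to an (x, y) pair, and "location in the document" is assumed to mean top-to-bottom, then left-to-right. The point it shows is that a structural element can be described as a range of positions in the ordered reference array, so defining it creates no new references to the primitives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Hypothetical primitive element: a character and its page position."""
    char: str
    x: float
    y: float


@dataclass(frozen=True)
class StructuralElement:
    """Hypothetical structural element: a half-open range [start, end)
    of positions in the ordered reference array."""
    kind: str
    start: int
    end: int


def build_reference_array(primitives):
    # Second array: indices into `primitives`, ordered by assumed reading
    # order (y first, then x). The first array itself is left untouched.
    return sorted(range(len(primitives)),
                  key=lambda i: (primitives[i].y, primitives[i].x))


def group_lines(primitives, refs):
    # Define one "line" element per run of references sharing the same y.
    # Only (start, end) positions are recorded; no references are copied.
    lines = []
    start = 0
    for pos in range(1, len(refs) + 1):
        at_end = pos == len(refs)
        if at_end or primitives[refs[pos]].y != primitives[refs[start]].y:
            lines.append(StructuralElement("line", start, pos))
            start = pos
    return lines


def text_of(element, primitives, refs):
    # Resolve an element back to its glyphs through the reference array.
    return "".join(primitives[i].char for i in refs[element.start:element.end])


# First array: primitives stored in arbitrary (parse) order.
primitives = [
    Glyph("o", 1.0, 10.0),
    Glyph("H", 0.0, 0.0),
    Glyph("N", 0.0, 10.0),
    Glyph("i", 1.0, 0.0),
]
refs = build_reference_array(primitives)
lines = group_lines(primitives, refs)
```

In this sketch, each entry of `refs` refers to exactly one entry of `primitives`, and every `StructuralElement` holds only two integers, so grouping glyphs into lines leaves the length of `refs` unchanged.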
