
Reordering text from unstructured sources to intended reading flow

  • US 9,658,990 B2
  • Filed: 09/18/2014
  • Issued: 05/23/2017
  • Est. Priority Date: 09/18/2014
  • Status: Active Grant
First Claim

1. An information handling system comprising:

    one or more processors;

    a memory coupled to at least one of the processors; and

    a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of:

    identifying a plurality of sections from a sequence of characters included in a Portable Document Format (PDF) source file, wherein each section includes a unique set of coordinate positions;

    building a plurality of directional links between the plurality of sections based on a relative position of each sections' coordinate positions in relation to other sections' coordinate positions along an axis; and

    repeatedly merging two or more sections to form increasingly larger sections, wherein the merged two or more sections are selected based on the directional links built between the two or more sections, wherein the repeatedly merging further comprises building one or more new directional links between the increasingly larger sections and one or more remaining sections selected from the plurality of sections, and wherein the repeatedly merging continues until the plurality of sections are exhausted and consolidated into a final larger section, wherein the final larger section is arranged in an intended reading order.
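
    The three claimed actions (identifying sections, building directional links, and repeatedly merging) can be illustrated with a minimal Python sketch. This is not the patented implementation: the Section type, the bounding-box fields, the choice of the vertical axis for links, the nearest-gap merge rule, and the left-to-right fallback are all assumptions made for illustration, and extraction of sections from an actual PDF file is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """Hypothetical section: text plus a bounding box (y grows downward)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps_horizontally(a: Section, b: Section) -> bool:
    return a.x0 < b.x1 and b.x0 < a.x1


def build_links(sections):
    """Link each section to the nearest section directly below it.

    Assumption: links run along the vertical axis only.
    Returns a dict mapping source index to target index.
    """
    links = {}
    for i, a in enumerate(sections):
        below = [
            (b.y0 - a.y1, j)
            for j, b in enumerate(sections)
            if j != i and b.y0 >= a.y1 and overlaps_horizontally(a, b)
        ]
        if below:
            links[i] = min(below)[1]
    return links


def merge(first: Section, second: Section) -> Section:
    """Concatenate text in link order and take the union of both boxes."""
    return Section(
        first.text + "\n" + second.text,
        min(first.x0, second.x0),
        min(first.y0, second.y0),
        max(first.x1, second.x1),
        max(first.y1, second.y1),
    )


def reading_order(sections) -> str:
    """Merge linked sections until a single section remains."""
    sections = list(sections)
    while len(sections) > 1:
        links = build_links(sections)  # links are rebuilt after every merge
        if links:
            # Merge the linked pair separated by the smallest vertical gap.
            i, j = min(
                links.items(),
                key=lambda ij: sections[ij[1]].y0 - sections[ij[0]].y1,
            )
        else:
            # Assumed fallback: no vertical links remain, so join the two
            # leftmost sections (for example, adjacent columns).
            order = sorted(
                range(len(sections)),
                key=lambda k: (sections[k].x0, sections[k].y0),
            )
            i, j = order[0], order[1]
        merged = merge(sections[i], sections[j])
        sections = [s for k, s in enumerate(sections) if k not in (i, j)]
        sections.append(merged)
    return sections[0].text
```

    Under these assumptions, a two-column page whose four blocks arrive in scrambled order is first merged within each column and then across columns, so the left column is read before the right one.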
