Apparatus and method for extracting and manipulating the reading order of text to prepare a display document for analysis

  • US 9,658,989 B2
  • Filed: 08/25/2008
  • Issued: 05/23/2017
  • Est. Priority Date: 09/03/2007
  • Status: Active Grant
First Claim

1. A method for preparing a display document for analysis comprising:

    extracting character data from said display document, wherein a language of said character data in said display document is unknown when said character data is extracted;

    determining a first order associated with processing of said character data and a second order associated with a logical order of said character data, including comparing said character data against a set of dictionaries to determine said second order based on a match between said character data and a word listed in a dictionary of said set of dictionaries, each dictionary corresponding to a particular language and listing words of that language, wherein comparing said character data against a set of dictionaries further comprises, if a first comparison of said character data to said dictionaries does not determine a language of said character data, reversing an order of said character data and making a second comparison of said reversed character data against said set of dictionaries;

    determining whether said first order is different from said second order; and

    reversing at least a portion of said character data in response to said determination that said first order is different from said second order.
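
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The names `DICTIONARIES`, `match_language`, and `prepare_for_analysis` and the toy word lists are hypothetical, chosen only for illustration. It also assumes that whitespace-separated tokens are an adequate stand-in for "character data" and that a whole-string reversal is sufficient, which ignores combining marks and mixed-direction text.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy dictionaries, one per language. A real system would
# load full word lists; these entries exist only for the example.
DICTIONARIES: Dict[str, Set[str]] = {
    "english": {"hello", "world", "order"},
    "hebrew": {"שלום", "עולם"},
}


def match_language(text: str, dictionaries: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first language whose dictionary lists a word of `text`."""
    words = text.lower().split()
    for language, vocabulary in dictionaries.items():
        if any(word in vocabulary for word in words):
            return language
    return None


def prepare_for_analysis(
    extracted: str,
    dictionaries: Dict[str, Set[str]] = DICTIONARIES,
) -> Tuple[str, Optional[str]]:
    """Return (text in logical order, detected language or None).

    `extracted` is in the first (processing) order. The second (logical)
    order is inferred from which direction produces a dictionary match.
    """
    # First comparison: the character data exactly as extracted.
    language = match_language(extracted, dictionaries)
    if language is not None:
        # Processing order already matches logical order: no reversal.
        return extracted, language

    # Second comparison: reverse the character data and try again.
    reversed_text = extracted[::-1]
    language = match_language(reversed_text, dictionaries)
    if language is not None:
        # The orders differ, so the reversed data is the logical order.
        return reversed_text, language

    # Neither direction matched: leave the data untouched.
    return extracted, None
```

For example, `prepare_for_analysis("dlrow olleh")` finds no match on the first comparison, matches the English dictionary after reversal, and returns the text as `"hello world"`. Reversing the entire string is the simplest reading of "at least a portion"; a fuller treatment would reverse only the runs whose order differs.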
