Document division method and system

  • US 9,390,077 B2
  • Filed: 02/10/2012
  • Issued: 07/12/2016
  • Est. Priority Date: 09/30/2005
  • Status: Active Grant
First Claim
1. One or more tangible non-transitory computer-readable media storing instructions that, when executed by a processor, perform operations comprising:

  • receiving a first electronic document;

    determining an entropy value for the first electronic document;

    determining a first information gain value associated with a first line that divides the first electronic document into a first portion and a second portion, comprising:

      a) determining an entropy value for the first portion of the first electronic document and an entropy value for the second portion of the first electronic document,
      b) based on the entropy value for the first portion of the first electronic document and the entropy value for the second portion of the first electronic document, determining an entropy value associated with the first line, and
      c) determining the first information gain value by determining a difference between i) the entropy value for the first electronic document and ii) the entropy value associated with the first line;

    determining a second information gain value associated with a second line that divides the first electronic document into a third portion and a fourth portion, comprising:

      a) determining an entropy value for the third portion of the first electronic document and an entropy value for the fourth portion of the first electronic document,
      b) based on the entropy value for the third portion of the first electronic document and the entropy value for the fourth portion of the first electronic document, determining an entropy value associated with the second line, and
      c) determining the second information gain value by determining a difference between i) the entropy value for the first electronic document and ii) the entropy value associated with the second line;

    determining which of the first information gain value and second information gain value is greater;

    in response to determining that the first information gain value is greater, generating a second electronic document that includes at least a portion defined by the first line and using the first information gain value to recursively divide the portions defined by the first line;

    in response to determining that the second information gain value is greater, generating a third electronic document that includes at least a portion defined by the second line and using the second information gain value to recursively divide the portions defined by the second line,

    wherein the entropy value for the first portion of the first electronic document and the entropy value for the second portion of the first electronic document are based at least on a variation in pixel intensity for pixels that the first line intersects in the first electronic document, and the entropy value for the third portion of the first electronic document and the entropy value for the fourth portion of the first electronic document are based at least on a variation in pixel intensity for pixels that the second line intersects in the first electronic document.
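
The information-gain comparison recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: the document is a grayscale image held as a list of pixel rows, each entropy value is the Shannon entropy of a portion's intensity histogram, candidate lines are horizontal, the entropy associated with a line is the size-weighted average of the two portion entropies, and recursion stops at a hypothetical `min_gain` threshold. All function names are illustrative. The claim's limitation that the portion entropies depend on intensity variation among the pixels the line intersects is not modeled here.

```python
import math
from collections import Counter


def entropy(pixels):
    """Shannon entropy, in bits, of a flat list of pixel intensities."""
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flatten(rows):
    """Turn a list of pixel rows into one flat list of intensities."""
    return [p for row in rows for p in row]


def information_gain(image, row_index):
    """Gain for a horizontal line drawn just above row `row_index`."""
    top_px = flatten(image[:row_index])
    bottom_px = flatten(image[row_index:])
    n = len(top_px) + len(bottom_px)
    # Assumption: the entropy associated with the line is the
    # size-weighted average of the two portion entropies.
    line_entropy = (len(top_px) / n) * entropy(top_px) \
        + (len(bottom_px) / n) * entropy(bottom_px)
    # Difference between the document entropy and the line entropy.
    return entropy(flatten(image)) - line_entropy


def best_split(image, candidates):
    """Return the candidate row index with the greatest information gain."""
    return max(candidates, key=lambda r: information_gain(image, r))


def divide(image, min_gain=0.1):
    """Recursively split the image into row blocks.

    Assumption: recursion stops when the best line's gain falls below
    `min_gain`; the claim does not specify a stopping rule.
    """
    candidates = range(1, len(image))
    if not candidates:
        return [image]
    r = best_split(image, candidates)
    if information_gain(image, r) < min_gain:
        return [image]
    return divide(image[:r], min_gain) + divide(image[r:], min_gain)
```

For a four-row image whose top two rows are uniformly dark and whose bottom two rows are uniformly light, a line between rows 1 and 2 separates the intensities completely, so it yields a larger gain than a line between rows 0 and 1, and `divide` returns the two uniform blocks.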
