Method and system for classifying documents that have different scales

  • US 8,832,108 B1
  • Filed: 04/18/2013
  • Issued: 09/09/2014
  • Est. Priority Date: 03/28/2012
  • Status: Active Grant
First Claim

1. A system for classifying documents that have different scales, the system comprising:

  • one or more processors; and

    a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:

    count instances for each character size in a first document and instances for each character size in a second document;

    select a first plurality of character sizes for the first document and a second plurality of character sizes for the second document, based on a corresponding count of instances associated with each corresponding character size;

    calculate a plurality of scales, wherein each scale of the plurality of scales is based on a corresponding ratio of a corresponding one of the first plurality of character sizes relative to a corresponding one of the second plurality of character sizes;

    calculate a plurality of scale products based on each corresponding count of instances for each character size range associated with the first plurality of character sizes multiplied by each corresponding count of instances for each corresponding character size range associated with the second plurality of character sizes, wherein the corresponding character size range is based on a corresponding one of the plurality of scales;

    calculate a plurality of scale scores based on summing each of the plurality of scale products associated with each corresponding one of the plurality of scales;

    select a scale of the plurality of scales based on a highest one of the plurality of scale scores associated with a corresponding one of the plurality of scales;

    determine whether the second document is in a class associated with the first document based on a comparison of location information associated with the first document and location information associated with the second document, wherein the location information associated with the second document is based on the scale; and

    classify the second document in the class associated with the first document in response to a determination that the second document is in the class associated with the first document.
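
The scale-selection steps recited in the claim (counting character sizes, selecting frequent sizes, forming candidate ratios, and scoring each ratio by summed count products) can be illustrated with a minimal Python sketch. This is only one possible reading of the claim language, not the patented implementation. The function names (`top_sizes`, `count_in_range`, `estimate_scale`), the choice of the `k` most frequent sizes, and the relative `tolerance` that defines a "character size range" are all assumptions introduced for illustration; the claim does not specify them. The final location-based comparison and classification steps are not sketched here.

```python
from collections import Counter


def top_sizes(counts, k):
    """Return the k most frequent character sizes (k is an assumed parameter)."""
    return [size for size, _ in counts.most_common(k)]


def count_in_range(counts, center, tolerance):
    """Count instances whose size lies within +/- tolerance (relative) of center.

    The relative-tolerance window is an assumption; the claim only says the
    range is based on the scale.
    """
    lo, hi = center * (1 - tolerance), center * (1 + tolerance)
    return sum(n for size, n in counts.items() if lo <= size <= hi)


def estimate_scale(first_sizes, second_sizes, k=3, tolerance=0.1):
    """Pick the candidate scale (first size / second size) with the highest score."""
    # Count instances of each character size in both documents.
    first_counts = Counter(first_sizes)
    second_counts = Counter(second_sizes)

    # Select a plurality of sizes per document based on their counts.
    first_top = top_sizes(first_counts, k)
    second_top = top_sizes(second_counts, k)

    # Each candidate scale is a ratio of a first-document size
    # to a second-document size.
    scales = {a / b for a in first_top for b in second_top}

    best_scale, best_score = None, -1
    for scale in sorted(scales):
        score = 0
        for size in first_top:
            # Scale product: count near `size` in the first document times
            # the count near the scale-adjusted size in the second document.
            n_first = count_in_range(first_counts, size, tolerance)
            n_second = count_in_range(second_counts, size / scale, tolerance)
            score += n_first * n_second
        # The scale score is the sum of the scale products for this scale.
        if score > best_score:
            best_scale, best_score = scale, score
    return best_scale, best_score


# Hypothetical input: the second document is the first rendered at twice the size,
# plus a few stray characters.
first = [10] * 50 + [14] * 20 + [20] * 5
second = [20] * 48 + [28] * 22 + [40] * 4 + [12] * 3
scale, score = estimate_scale(first, second)
print(scale, score)
```

Under these assumptions, a correct candidate scale aligns several frequent sizes at once, so its summed products dominate candidates that align only one pair of sizes by coincidence.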

  • 12 Assignments