Shape clustering and cluster-level manual identification in post optical character recognition processing

  • US 7,697,758 B2
  • Filed: 09/11/2006
  • Issued: 04/13/2010
  • Est. Priority Date: 09/11/2006
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for processing output from an optical character recognition (OCR) process, comprising:

  • classifying separated images in an output of the OCR process generated from processing an original image of a document into a plurality of clusters of separated images, each cluster comprising separated images of similar image sizes and shapes that are assigned the same one or more particular characters by the OCR process;

  • using a cluster image to represent separated images in a respective cluster;

  • selecting a cluster which has a low level of confidence to obtain a manual assignment of one or more characters with the cluster image of the selected cluster; and

  • using the one or more characters obtained by the manual assignment to verify or replace respective one or more particular characters previously assigned by the OCR process in the output of the OCR process, wherein the method is performed by one or more computer processors.
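
The four steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Glyph`, `Cluster`, `classify`, `correct`, and `ask`, the per-glyph confidence score, and both thresholds are hypothetical. As simplifying assumptions, "similar size" is taken to mean identical bitmap dimensions, shape similarity is measured as the fraction of differing pixels, the cluster image is a pixel-wise majority vote, and cluster confidence is the mean of the member confidences.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]  # binary image of one separated glyph


@dataclass
class Glyph:
    bitmap: Bitmap
    text: str          # character(s) assigned by the OCR process
    confidence: float  # assumed per-glyph OCR confidence in [0, 1]


@dataclass
class Cluster:
    text: str
    members: List[Glyph] = field(default_factory=list)

    def image(self) -> Bitmap:
        """Cluster image: pixel-wise majority vote over all members."""
        n = len(self.members)
        rows, cols = len(self.members[0].bitmap), len(self.members[0].bitmap[0])
        return tuple(
            tuple(int(2 * sum(g.bitmap[r][c] for g in self.members) >= n)
                  for c in range(cols))
            for r in range(rows)
        )

    def confidence(self) -> float:
        """Assumed cluster confidence: mean of member confidences."""
        return sum(g.confidence for g in self.members) / len(self.members)


def distance(a: Bitmap, b: Bitmap) -> float:
    """Fraction of differing pixels; infinite when the sizes differ."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return float("inf")
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / (len(a) * len(a[0]))


def classify(glyphs: List[Glyph], max_distance: float = 0.2) -> List[Cluster]:
    """Group glyphs that share OCR text and have a similar size and shape."""
    clusters: List[Cluster] = []
    for g in glyphs:
        for c in clusters:
            if c.text == g.text and distance(c.image(), g.bitmap) <= max_distance:
                c.members.append(g)
                break
        else:  # no existing cluster matched, so start a new one
            clusters.append(Cluster(text=g.text, members=[g]))
    return clusters


def correct(clusters: List[Cluster],
            ask: Callable[[Bitmap, str], str],
            min_confidence: float = 0.6) -> None:
    """Ask a human about low-confidence clusters and apply the answer to every member."""
    for c in clusters:
        if c.confidence() < min_confidence:
            answer = ask(c.image(), c.text)  # manual assignment (verify or replace)
            c.text = answer
            for g in c.members:
                g.text = answer


# Usage: two identical low-confidence glyphs read as "m", one confident "e".
glyphs = [
    Glyph(((1, 0), (1, 1)), "m", 0.4),
    Glyph(((1, 0), (1, 1)), "m", 0.5),
    Glyph(((0, 1), (1, 0)), "e", 0.95),
]
clusters = classify(glyphs)
correct(clusters, ask=lambda image, text: "rn")  # stand-in for a human operator
print([g.text for g in glyphs])  # ['rn', 'rn', 'e']
```

Because the operator answers once per cluster rather than once per glyph, a single manual assignment propagates to every separated image in that cluster, which is the labor saving the claim is directed at.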

  • 2 Assignments