SYSTEMS AND METHODS FOR HANDLING AND DISTINGUISHING BINARIZED, BACKGROUND ARTIFACTS IN THE VICINITY OF DOCUMENT TEXT AND IMAGE FEATURES INDICATIVE OF A DOCUMENT CATEGORY

  • US 20090119296A1
  • Filed: 11/06/2008
  • Published: 05/07/2009
  • Est. Priority Date: 11/06/2007
  • Status: Active Grant
First Claim

1. In a document analysis system that receives and processes jobs from a plurality of users and that automatically recognizes and classifies job documents into document categories, so that a job may be organized according to the document categories it contains, and in which each received document is a binarized, one-bit-per-document-pixel image version of an original grayscale or color image source document, a method of enhancing the received electronic documents to improve automatic recognition and classification of the received documents, the method comprising:

  • for each page of a received document, filtering the page to infer binarized-background artifacts resulting from the binarization of the original grayscale or color image source document and which reside in the vicinity of binarized text and binarized image features in the page, so that the binarized text and binarized images may be distinguished from the binarized-background artifacts and extracted from the document;

  • using the extracted features from the filtered document to automatically recognize and classify a document into a document category.
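
The filtering step in the claim can be illustrated with a minimal sketch. The claim does not specify a particular filter, so the example below swaps in a common despeckling technique: it treats any small, isolated connected component of foreground pixels as a binarized-background artifact and clears it. The function name `remove_small_components`, the `min_pixels` threshold, and the list-of-lists page representation are all assumptions made for illustration and do not come from the patent.

```python
from collections import deque


def remove_small_components(page, min_pixels=3):
    """Return a copy of a 1-bit page with small foreground blobs cleared.

    page: list of rows, each a list of 0 (background) or 1 (foreground).
    min_pixels: hypothetical threshold; blobs with fewer pixels than this
    are treated as background artifacts and removed.
    """
    height = len(page)
    width = len(page[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    cleaned = [row[:] for row in page]  # leave the input untouched

    for y in range(height):
        for x in range(width):
            if page[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over 8-connected foreground pixels.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and page[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Assumption: small blobs are artifacts, larger ones are text or image features.
            if len(component) < min_pixels:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned


page = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],  # the lone pixel at the right edge stands in for a speckle
    [0, 0, 0, 0, 0],
]
cleaned = remove_small_components(page, min_pixels=3)
```

A size threshold alone would also erase legitimate small marks such as periods or the dots on letters, so a practical filter would likely also weigh how close a component sits to larger text or image features, as the claim's "in the vicinity of" language suggests.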
