SYSTEMS AND METHODS FOR CLASSIFYING ELECTRONIC DOCUMENTS BY EXTRACTING AND RECOGNIZING TEXT AND IMAGE FEATURES INDICATIVE OF DOCUMENT CATEGORIES

  • US 20090116757A1
  • Filed: 11/06/2008
  • Published: 05/07/2009
  • Est. Priority Date: 11/06/2007
  • Status: Abandoned Application
First Claim

1. In a document analysis system that receives and processes jobs, a method of automatically recognizing and classifying each document in a job into a corresponding document category by automatically recognizing image and text features in the document so that each job may be automatically organized according to the categories of documents it contains, the method comprising:

    automatically extracting from each received document image and text features, in which the image features are indicative of how the document is laid out or textually-organized and therefore indicative of a corresponding document category, and the text features are distinctive words that are indicative of a corresponding document category;

    comparing the extracted image and text features with feature sets associated with each category of document, in which each feature set includes a subset of text features and corresponding weights and a subset of image features and corresponding weights;

    classifying each document to a document category, the feature set of which best matches the extracted features of said document; and

    organizing each job according to the categories of documents it contains.
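
The comparing, classifying, and organizing steps recited above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claim does not specify a scoring rule, so a simple sum of the weights of matching features is assumed here, and feature extraction is assumed to have already produced a set of text features and a set of image features per document. The category names, feature names, weights, and the functions `score`, `classify`, and `organize_job` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical feature sets, one per document category. Each holds a subset of
# text features and a subset of image features, with a weight for each feature.
# All names and weights are invented for illustration.
FEATURE_SETS = {
    "invoice": {
        "text": {"invoice": 2.0, "total": 1.0, "due": 1.0},
        "image": {"has_table": 1.5, "has_logo": 0.5},
    },
    "letter": {
        "text": {"dear": 2.0, "sincerely": 2.0},
        "image": {"single_column": 1.0},
    },
}


def score(extracted, feature_set):
    """Sum the weights of the category's features found in the document.

    Assumed scoring rule; the claim only requires a "best match".
    """
    total = 0.0
    for kind in ("text", "image"):
        present = extracted.get(kind, set())
        for feature, weight in feature_set.get(kind, {}).items():
            if feature in present:
                total += weight
    return total


def classify(extracted, feature_sets=FEATURE_SETS):
    """Return the category whose feature set scores highest for the document."""
    return max(feature_sets, key=lambda cat: score(extracted, feature_sets[cat]))


def organize_job(job, feature_sets=FEATURE_SETS):
    """Group the document ids of a job by their assigned category."""
    grouped = defaultdict(list)
    for doc_id, extracted in job.items():
        grouped[classify(extracted, feature_sets)].append(doc_id)
    return dict(grouped)
```

A job is represented here as a mapping from a document id to its already-extracted features, for example `{"doc1": {"text": {"invoice", "total"}, "image": {"has_table"}}}`. Calling `organize_job` on such a mapping returns the document ids grouped under the category each one matched best.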
