
SYSTEMS AND METHODS FOR PARALLEL PROCESSING OF DOCUMENT RECOGNITION AND CLASSIFICATION USING EXTRACTED IMAGE AND TEXT FEATURES

  • US 20090116746A1
  • Filed: 11/06/2008
  • Published: 05/07/2009
  • Est. Priority Date: 11/06/2007
  • Status: Abandoned Application
First Claim

1. In a document analysis system that receives jobs from a plurality of users and which automatically classifies documents to organize each job according to the categories of documents the job contains, a method of parallel processing each job comprising:

  • for each job, automatically separating the job into its constituent electronic documents;

  • for each received electronic document, automatically separating the document into subsets of electronic pages;

  • for each page of each subset, automatically extracting image features that are indicative of how the document is laid out or textually-organized and therefore indicative of a corresponding document category and automatically extracting text features it contains, in which feature extraction for each subset is done independently and in parallel of such automatic extraction for the other subsets of the document;

  • for each subset, automatically comparing the extracted features with feature sets associated with each category of document to determine a comparison score for the subset;

  • using the comparison score for each of the subsets to automatically classify the electronic document as being one of the categories of documents; and

  • organizing the job according to the categories of documents the job contains.
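
The steps of the claim can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claim does not say how features are extracted or scored, so everything here is an assumption made for illustration. Pages are modeled as plain strings, `CATEGORY_FEATURES` is a hypothetical hand-written feature set per category, the "image" feature is a toy layout check, the comparison score is a simple overlap fraction, and a thread pool stands in for whatever parallel hardware the system would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical feature sets per category (stand-ins for trained models).
CATEGORY_FEATURES = {
    "invoice": {"invoice", "total", "amount", "due", "layout:has_columns"},
    "resume": {"experience", "education", "skills"},
}

def split_into_subsets(pages, size):
    """Separate a document (list of pages) into subsets of pages."""
    return [pages[i:i + size] for i in range(0, len(pages), size)]

def extract_features(subset):
    """Extract toy text and layout features from one subset of pages."""
    features = set()
    for page in subset:
        features.update(page.lower().split())      # text features
        if "\t" in page:                            # toy layout feature
            features.add("layout:has_columns")
    return features

def score_subset(features):
    """Comparison score per category: fraction of category features found."""
    return {
        category: len(features & wanted) / len(wanted)
        for category, wanted in CATEGORY_FEATURES.items()
    }

def classify_document(pages, subset_size=2):
    """Classify one document from the scores of its subsets."""
    subsets = split_into_subsets(pages, subset_size)
    # Each subset is processed independently of the others.
    with ThreadPoolExecutor() as pool:
        feature_sets = list(pool.map(extract_features, subsets))
    totals = defaultdict(float)
    for features in feature_sets:
        for category, score in score_subset(features).items():
            totals[category] += score
    # Ties resolve to the first category listed in CATEGORY_FEATURES.
    return max(totals, key=totals.get)

def organize_job(job, subset_size=2):
    """Group the documents of a job (name -> pages) by predicted category."""
    organized = defaultdict(list)
    for name, pages in job.items():
        organized[classify_document(pages, subset_size)].append(name)
    return dict(organized)

if __name__ == "__main__":
    job = {
        "a.pdf": ["Invoice number 7", "Total amount due 40"],
        "b.pdf": ["Work experience", "Education and skills", "References"],
    }
    print(organize_job(job))
```

Summing the per-subset scores is only one way to combine them; the claim leaves the combination rule open, so voting or taking the maximum per category would fit the same outline.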

  • 3 Assignments