Automatic document separation
Abstract
A method and system for delineating document boundaries and identifying document types by analyzing digital images of one or more documents, automatically categorizing one or more pages or subdocuments within the one or more documents and automatically generating delineation identifiers, such as computer-generated images of separation pages inserted between digital images belonging to different categories, a description of the categorization sequence of the digital images, or a computer-generated electronic label affixed or associated with said digital images.
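The delineation identifiers named in the abstract can be illustrated with a minimal sketch. It assumes the pages have already been categorized; the `SEPARATOR` marker, the function names, and the category labels in the usage note are hypothetical and do not come from the patent. The first function places a stand-in for a separation page between adjacent pages whose categories differ, and the second produces a description of the categorization sequence.

```python
from itertools import groupby

# Hypothetical stand-in for a computer-generated separation page image.
SEPARATOR = "<separator>"


def insert_separators(pages, categories):
    """Return the pages with a separator between adjacent pages of different categories."""
    out = []
    prev = None
    for page, cat in zip(pages, categories):
        if prev is not None and cat != prev:
            out.append(SEPARATOR)
        out.append(page)
        prev = cat
    return out


def describe_sequence(categories):
    """Summarize the categorization sequence as (category, page_count) runs."""
    return [(cat, len(list(run))) for cat, run in groupby(categories)]
```

For example, pages `["p1", "p2", "p3", "p4"]` labeled `["invoice", "invoice", "letter", "invoice"]` would yield `["p1", "p2", "<separator>", "p3", "<separator>", "p4"]`. Note that this sketch cannot separate two consecutive documents of the same category; handling that case would need categories that distinguish a document's first page from its later pages.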
38 Claims
1. In a computer-based system, a method of delineating document boundaries and identifying document types, comprising:
- automatically categorizing a plurality of document images into a plurality of predetermined categories in accordance with classification rules for said categories; and
- automatically generating at least one identifier for identifying which of said plurality of document images belongs to which of said at least two categories.
Dependent claims: 2-20
21. In a computer-based system, a method of delineating document boundaries and identifying document types, comprising:
- automatically categorizing a plurality of document images into a plurality of predetermined categories in accordance with classification rules for said categories, comprising:
  - producing an output score for each document image; and
  - using a graph search algorithm to determine an optimum categorization sequence from a plurality of possible categorization sequences for said plurality of document images based on said output scores, comprising:
    - using a graph structure to calculate a total output score, based on said output scores for each of said plurality of document images, for each said possible categorization sequence; and
    - determining which categorization sequence yields the highest total output score; and
- automatically generating at least one identifier for identifying which of said plurality of document images belongs to which of said at least two categories,
- wherein said graph structure is implemented using a finite state transducer wherein said plurality of document images comprise inputs and said plurality of categorization sequences comprise outputs.
Dependent claims: 22
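The sequence search recited in claim 21 can be sketched as a best-path search over a trellis of (page, category) nodes. This is only an illustration under stated assumptions: the claim specifies a finite state transducer, whereas the sketch below swaps in a plain Viterbi-style dynamic program; the scores are assumed to be additive; and the `transition` function, which scores moving from one category to the next, is a hypothetical addition that is not recited in the claim.

```python
def best_sequence(scores, transition):
    """Find the categorization sequence with the highest total score.

    scores: one dict per page image mapping category -> output score.
    transition: hypothetical function (prev_category, category) -> score,
        e.g. a penalty for switching category, or float("-inf") to forbid it.
    Returns (sequence, total_score).
    """
    if not scores:
        return [], 0.0
    # best[c] holds (total score, path) for the best sequence ending in category c.
    best = {c: (s, [c]) for c, s in scores[0].items()}
    for page in scores[1:]:
        nxt = {}
        for c, s in page.items():
            total, path = max(
                ((t + transition(p, c) + s, pth) for p, (t, pth) in best.items()),
                key=lambda item: item[0],
            )
            nxt[c] = (total, path + [c])
        best = nxt
    total, path = max(best.values(), key=lambda item: item[0])
    return path, total


# Usage with made-up scores: the middle page slightly prefers "letter",
# but a switching penalty makes an all-"invoice" sequence score higher overall.
example_scores = [
    {"invoice": 2.0, "letter": 1.0},
    {"invoice": 0.75, "letter": 1.0},
    {"invoice": 2.0, "letter": 0.5},
]
switch_penalty = lambda prev, cur: 0.0 if prev == cur else -0.5
sequence, total = best_sequence(example_scores, switch_penalty)
```

The point of searching over whole sequences rather than picking the top category for each page independently is visible in the usage example: the page-by-page choice would be invoice, letter, invoice, but once switches carry a cost the best overall sequence differs.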
23. A non-transitory computer-readable medium for storing computer executable instructions that when executed by a computer perform a method of delineating document boundaries and identifying document types, said method comprising:
- automatically categorizing a plurality of document images into a plurality of predetermined categories in accordance with classification rules for said categories; and
- automatically generating at least one identifier for identifying which of said plurality of document images belongs to which of said at least two categories.
Dependent claims: 24-36
37. A non-transitory computer-readable medium for storing computer executable instructions that when executed by a computer perform a method of delineating document boundaries and identifying document types, said method comprising:
- automatically categorizing a plurality of document images into a plurality of predetermined categories in accordance with classification rules for said categories, comprising:
  - producing an output score for each document image; and
  - using a graph search algorithm to determine an optimum categorization sequence from a plurality of possible categorization sequences for said plurality of document images based on said output scores, comprising:
    - using a graph structure to calculate a total output score, based on said output scores for each of said plurality of document images, for each said possible categorization sequence; and
    - determining which categorization sequence yields the highest total output score; and
- automatically generating at least one identifier for identifying which of said plurality of document images belongs to which of said at least two categories,
- wherein said graph structure is implemented using a finite state transducer wherein said plurality of document images comprise inputs and said plurality of categorization sequences comprise outputs.
Dependent claims: 38
Specification