System and methods for extracting document images from images featuring multiple documents
Abstract
A system and method for extracting document images from images featuring multiple documents are presented. The method includes receiving a multiple-document image including a plurality of document images, wherein each document image is associated with a document; extracting a plurality of visual identifiers from the multiple-document image, wherein each visual identifier is associated with one of the plurality of document images; analyzing the plurality of visual identifiers to identify each document image; determining, based on the analysis, an image area of each document image; and extracting each document image based on its image area.
17 Claims
1. A method for extracting document images from images featuring multiple documents, comprising:
receiving a multiple-document image including a plurality of document images, wherein each document image is associated with a document;
extracting a plurality of visual identifiers from the multiple-document image, wherein each visual identifier is text indicating information related to one of the plurality of document images;
analyzing the plurality of visual identifiers to identify each document image, wherein each document image is identified based on at least one threshold visual identifier requirement representing a portion of the plurality of visual identifiers that need to be included in each of the identified document image;
identifying, for each identified document image that meets the at least one threshold visual identifier requirement, a boundary based on the analysis, the boundary occupying a textless border around the respective identified document image and enclosing all of the plurality of visual identifiers that need to be included within the document image as represented by the at least one threshold visual identifier requirement;
determining, based on the analysis, an image area of each document image, wherein the image area of the document image is defined by the boundary; and
extracting each document image based on its image area, wherein extracting each document image further comprises generating a file including the document image.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
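The analyzing and boundary-identifying steps of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that an OCR stage has already produced text boxes (the hypothetical `Word` records), that documents are separated by textless gaps wider than `gap` pixels, and that the threshold visual identifier requirement is modeled as "at least `min_hits` of the `required` keywords appear". All names and parameters here are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One OCR text box (hypothetical record; coordinates in pixels)."""
    text: str
    x0: int
    y0: int
    x1: int
    y1: int


def near(a, b, gap):
    """True if boxes a and b lie within `gap` pixels of each other on both axes."""
    return (a.x0 - gap <= b.x1 and b.x0 - gap <= a.x1 and
            a.y0 - gap <= b.y1 and b.y0 - gap <= a.y1)


def cluster_words(words, gap):
    """Group words into connected components under the `near` relation."""
    unvisited = list(words)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        group = []
        while stack:
            w = stack.pop()
            group.append(w)
            close = [u for u in unvisited if near(w, u, gap)]
            for u in close:
                unvisited.remove(u)
            stack.extend(close)
        clusters.append(group)
    return clusters


def meets_threshold(group, required, min_hits):
    """Assumed threshold rule: enough required identifiers occur in the group."""
    found = {w.text.lower() for w in group}
    return len(required & found) >= min_hits


def boundary(group, margin, width, height):
    """Bounding box of the group's text, pushed `margin` pixels into the
    surrounding textless border and clamped to the image size."""
    return (max(0, min(w.x0 for w in group) - margin),
            max(0, min(w.y0 for w in group) - margin),
            min(width, max(w.x1 for w in group) + margin),
            min(height, max(w.y1 for w in group) + margin))


def find_documents(words, required, min_hits, gap, margin, width, height):
    """Return sorted boundaries of every text cluster that meets the threshold."""
    return sorted(boundary(g, margin, width, height)
                  for g in cluster_words(words, gap)
                  if meets_threshold(g, required, min_hits))


if __name__ == "__main__":
    sample_words = [
        Word("Invoice", 10, 10, 60, 20), Word("Date", 10, 30, 40, 40),
        Word("Total", 10, 50, 45, 60),
        Word("Receipt", 200, 10, 250, 20), Word("Total", 200, 30, 235, 40),
        Word("smudge", 400, 300, 440, 310),  # stray text, not a document
    ]
    print(find_documents(sample_words, {"invoice", "receipt", "date", "total"},
                         min_hits=2, gap=25, margin=5, width=500, height=400))
```

In this sketch the isolated word fails the threshold and yields no boundary, while each of the two text clusters produces one rectangle. Keeping `margin` at or below half of `gap` is one way to keep each boundary inside the textless border between neighboring documents.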
10. A system for extracting document images from images featuring multiple documents, comprising:
a processing circuitry; and
a memory, the memory containing instructions that, when executed by the processing circuitry, configure the system to:
receive a multiple-document image including a plurality of document images, wherein each document image is associated with a document;
extract a plurality of visual identifiers from the multiple-document image, wherein each visual identifier is text indicating information related to one of the plurality of document images;
analyze the plurality of visual identifiers to identify each document image, wherein each document image is identified based on at least one threshold visual identifier requirement representing a portion of the plurality of visual identifiers that need to be included in each of the identified document image;
identify, for each identified document image that meets the at least one threshold visual identifier requirement, a boundary based on the analysis, the boundary occupying a textless border around the respective identified document image and enclosing all visual identifiers that need to be included within the document image as represented by the at least one threshold visual identifier requirement;
determine, based on the analysis, an image area of each document image, wherein the image area of the document image is defined by the boundary; and
extract each document image based on its image area, wherein extracting each document image further comprises generating a file including the document image.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
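The final step, extracting each document image from its image area and generating a file that includes it, can be sketched as follows. This is an illustrative assumption rather than the claimed implementation. The multiple-document image is modeled as a list of rows of 8-bit grayscale values, each image area as a half-open rectangle `(x0, y0, x1, y1)`, and the output as a binary PGM file. The function names and the `document_N.pgm` naming scheme are hypothetical.

```python
import os
import tempfile


def crop(pixels, box):
    """Return the sub-image inside box = (x0, y0, x1, y1); x1 and y1 are exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in pixels[y0:y1]]


def write_pgm(path, pixels):
    """Write a non-empty grid of 0..255 values as a binary PGM (P5) file."""
    height, width = len(pixels), len(pixels[0])
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in pixels:
            f.write(bytes(row))


def extract_documents(pixels, boxes, out_dir):
    """Crop each image area and generate one file per document image."""
    paths = []
    for i, box in enumerate(boxes, start=1):
        path = os.path.join(out_dir, f"document_{i}.pgm")
        write_pgm(path, crop(pixels, box))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # A tiny 6x4 synthetic image standing in for a scan of two documents.
    image = [[col + 10 * row for col in range(6)] for row in range(4)]
    areas = [(0, 0, 3, 2), (3, 1, 6, 4)]
    with tempfile.TemporaryDirectory() as out_dir:
        for p in extract_documents(image, areas, out_dir):
            print(os.path.basename(p), os.path.getsize(p), "bytes")
```

A production system would more likely decode and encode common image formats through an imaging library. The plain PGM writer is used here only to keep the sketch self-contained.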
Specification