Pure adversarial approach for identifying text content in images
First Claim
1. A computer-implemented method of identifying text content in images, the method comprising:
receiving an input image;
splitting the image into a plurality of blocks, each block in the plurality of blocks containing pixel information that may represent one or more characters;
forming a candidate sequence of blocks from the plurality of blocks, the candidate sequence of blocks representing a candidate match for a search term; and
determining if the search term is present in the candidate sequence of blocks.
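The splitting step can be illustrated with a small sketch. The claim does not say how blocks are found, so this example assumes a binarized image (1 = ink, 0 = background) and uses a simple column-projection heuristic: every run of adjacent columns that contain ink becomes one block. The name `split_into_blocks` and the list-of-rows image format are hypothetical choices made only for illustration.

```python
from typing import List, Tuple

# Assumed input format: rows of 0/1 pixels, where 1 marks ink.
Image = List[List[int]]


def split_into_blocks(image: Image) -> List[Tuple[int, int]]:
    """Return (start_col, end_col) spans, end exclusive, for each run of
    adjacent columns that contain at least one ink pixel."""
    if not image:
        return []
    width = len(image[0])
    # A column "has ink" if any row has a 1 in that column.
    has_ink = [any(row[x] for row in image) for x in range(width)]

    blocks: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a new block begins
        elif not ink and start is not None:
            blocks.append((start, x))    # a blank column closes the block
            start = None
    if start is not None:
        blocks.append((start, width))    # block runs to the right edge
    return blocks


image = [
    [1, 1, 0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1, 1],
]
print(split_into_blocks(image))  # [(0, 2), (4, 5), (6, 9)]
```

A block found this way may hold one character, several touching characters, or only part of one, which is consistent with the claim language that a block "may represent one or more characters".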
Abstract
A pure adversarial optical character recognition (OCR) approach to identifying text content in images. An image and a search term are input to a pure adversarial OCR module, which searches the image for the presence of the search term. The image may be extracted from an email by an email processing engine. The OCR module may split the image into several character blocks, each of which has a reasonable probability of containing a character (e.g., an ASCII character). The OCR module may form a sequence of blocks that represents a candidate match to the search term and calculate the similarity of the candidate sequence to the search term. The OCR module may be configured to output whether or not the search term is found in the image and, if applicable, the location of the search term in the image.
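The candidate-sequence and similarity steps described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the similarity measure, so the sketch uses the fraction of agreeing pixels between each block and a reference glyph, averaged over the sequence. The 3x3 glyph bitmaps in `TEMPLATES`, the function names, and the 0.8 threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (ink) and '.' (background), all 3x3 here

# Hypothetical reference bitmaps for a few characters.
TEMPLATES: Dict[str, Glyph] = {
    "A": (".#.", "###", "#.#"),
    "C": ("###", "#..", "###"),
    "L": ("#..", "#..", "###"),
    "O": ("###", "#.#", "###"),
    "T": ("###", ".#.", ".#."),
}


def block_similarity(block: Glyph, template: Glyph) -> float:
    """Fraction of pixel positions where the block and the template agree."""
    total = sum(len(row) for row in template)
    same = sum(b == t
               for brow, trow in zip(block, template)
               for b, t in zip(brow, trow))
    return same / total


def sequence_similarity(candidate: List[Glyph], term: str) -> float:
    """Average per-character similarity of a candidate sequence to the term."""
    scores = [block_similarity(b, TEMPLATES[ch]) for b, ch in zip(candidate, term)]
    return sum(scores) / len(scores)


def find_term(blocks: List[Glyph], term: str,
              threshold: float = 0.8) -> Optional[Tuple[int, float]]:
    """Slide a window of len(term) blocks over the image's blocks and return
    (index of first block, score) for the best window at or above threshold,
    or None if no window qualifies."""
    if not term or any(ch not in TEMPLATES for ch in term):
        return None  # this sketch only knows the characters in TEMPLATES
    best: Optional[Tuple[int, float]] = None
    for i in range(len(blocks) - len(term) + 1):
        score = sequence_similarity(blocks[i:i + len(term)], term)
        if score >= threshold and (best is None or score > best[1]):
            best = (i, score)
    return best


noisy_c: Glyph = ("###", "#..", "##.")  # a "C" with one corrupted pixel
blocks = [TEMPLATES["L"], TEMPLATES["O"], noisy_c, TEMPLATES["A"], TEMPLATES["T"]]

print(find_term(blocks, "CAT"))  # best window starts at block index 2
print(find_term(blocks, "TAC"))  # None: no window reaches the threshold
```

The search never tries to read the image into text. It only asks how well some run of blocks matches the given term, and the returned index gives the location of the match within the block sequence. The noisy "C" still matches because the score tolerates a few disagreeing pixels.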
20 Claims
1. A computer-implemented method of identifying text content in images, the method comprising:
receiving an input image;
splitting the image into a plurality of blocks, each block in the plurality of blocks containing pixel information that may represent one or more characters;
forming a candidate sequence of blocks from the plurality of blocks, the candidate sequence of blocks representing a candidate match for a search term; and
determining if the search term is present in the candidate sequence of blocks.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer having a memory and a processor configured to execute computer-readable program code in the memory, the memory comprising:
an email processing engine configured to receive an email and extract an image from the email; and
a pure adversarial optical character recognition (OCR) module configured to receive a search term and the image and to search the image for the search term.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-implemented method of identifying text content in images, the method comprising:
extracting an image from an email; and
searching the image for presence of a search term.
Dependent claims: 16, 17, 18, 19, 20
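The email-extraction step recited in claims 8 and 15 can be illustrated with Python's standard `email` package. This is a minimal sketch, not the patent's email processing engine: it assumes the email arrives as raw MIME bytes and treats every part whose main content type is `image` as an extracted image. The function name `extract_images` and the sample message are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Tuple


def extract_images(raw_email: bytes) -> List[Tuple[str, bytes]]:
    """Return (content_type, decoded_bytes) for every image part in the email."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    images: List[Tuple[str, bytes]] = []
    for part in msg.walk():  # visits the message and all nested MIME parts
        if part.get_content_maintype() == "image":
            # decode=True undoes the transfer encoding (e.g. base64)
            images.append((part.get_content_type(), part.get_payload(decode=True)))
    return images


# Build a small sample message with one image attachment (placeholder bytes,
# not a real PNG) so the sketch can be run end to end.
sample = EmailMessage()
sample["Subject"] = "example"
sample.set_content("see attached")
sample.add_attachment(b"\x89PNG fake bytes", maintype="image",
                      subtype="png", filename="a.png")

images = extract_images(sample.as_bytes())
for content_type, data in images:
    print(content_type, len(data))
```

Each extracted image would then be decoded into pixels and handed, together with the search term, to the search step. That hand-off is left out here because this section does not describe it.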
Specification