Adversarial approach for identifying inappropriate text content in images
First Claim
1. A method of identifying inappropriate text content in images, the method to be performed using a computer and comprising:
selecting an expression from a listing of expressions, the selected expression comprising a word or phrase indicative of spam;
extracting an image from a message;
performing optical character recognition (OCR) on the image to extract text from the image;
using the selected expression as a reference, finding in the extracted text an occurrence that is suitably similar to a beginning and an end of the selected expression in terms of shape;
scoring how well the selected expression matches the occurrence in the extracted text; and
determining if the selected expression matches text content in the image based on the scoring.
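The steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the look-alike table `SHAPE_CLASSES`, the `slack` tolerance, the `threshold` value, and the use of `difflib.SequenceMatcher` for scoring are all assumptions chosen for illustration, and the OCR step is assumed to have already produced the `text` string (ASCII only).

```python
from difflib import SequenceMatcher

# Hypothetical groups of characters with a similar shape (an assumption,
# not taken from the source). The first character of each group acts as
# the representative of the group.
SHAPE_CLASSES = ["il1!|", "o0", "a@4", "s5$", "e3", "g9"]
CANON = {ch: group[0] for group in SHAPE_CLASSES for ch in group}


def canon(text):
    """Lower-case ASCII text and map look-alike characters to one representative."""
    return "".join(CANON.get(ch, ch) for ch in text.lower())


def find_candidates(expression, text, slack=2):
    """Yield (start, end) spans of text whose first and last characters are
    shape-similar to the first and last characters of the expression."""
    expr, body = canon(expression), canon(text)
    n = len(expr)
    for start, ch in enumerate(body):
        if ch != expr[0]:
            continue
        # Allow the span to be a little shorter or longer than the expression.
        for length in range(max(1, n - slack), n + slack + 1):
            end = start + length
            if end <= len(body) and body[end - 1] == expr[-1]:
                yield start, end


def score(expression, candidate):
    """Score the candidate against the expression on a 0.0 to 1.0 scale."""
    return SequenceMatcher(None, canon(expression), canon(candidate)).ratio()


def best_match(expression, text, threshold=0.8):
    """Return (matched, best_score, best_span_text) for one expression."""
    best_score, best_span = 0.0, None
    for start, end in find_candidates(expression, text):
        s = score(expression, text[start:end])
        if s > best_score:
            best_score, best_span = s, text[start:end]
    return best_score >= threshold, best_score, best_span
```

In this sketch the expression drives the search: only spans whose ends resemble the ends of the expression are scored, so obfuscated spellings such as `V1AGRA` can still be compared against the reference `viagra`.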
Abstract
An adversarial approach to detecting inappropriate text content in images. An expression from a listing of expressions may be selected. The listing of expressions may include words, phrases, or other textual content indicative of a particular type of message. Using the selected expression as a reference, the image is searched for a section that could be similar to the selected expression. The similarity between the selected expression and the section of the image may be in terms of shape. The section may be scored against the selected expression to determine how well the selected expression matches the section. The score may be used to determine whether or not the selected expression is present in the image.
20 Claims
1. A method of identifying inappropriate text content in images, the method to be performed using a computer and comprising:
selecting an expression from a listing of expressions, the selected expression comprising a word or phrase indicative of spam;
extracting an image from a message;
performing optical character recognition (OCR) on the image to extract text from the image;
using the selected expression as a reference, finding in the extracted text an occurrence that is suitably similar to a beginning and an end of the selected expression in terms of shape;
scoring how well the selected expression matches the occurrence in the extracted text; and
determining if the selected expression matches text content in the image based on the scoring.
Dependent claims: 2, 3.
4. A method of identifying inappropriate text content in images, the method to be performed using a computer and comprising:
selecting an expression from a listing of expressions;
extracting an image from a message;
using the selected expression as a reference, finding a section of the image that corresponds to a start point and an end point of the selected expression;
comparing the section of the image to the selected expression to determine how well the section of the image matches the selected expression; and
determining if the selected expression is present in the section of the image based on the comparison of the image to the selected expression.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13.
14. A computer having a memory and a processor configured to execute computer-readable program code in the memory, the memory comprising:
an OCR module comprising computer-readable program code configured to generate text from an image included in an email;
an antispam engine comprising computer-readable program code configured to select an expression from a plurality of expressions and to determine if the selected expression is present in the text using the selected expression as a reference to search the text.
Dependent claims: 15, 16, 17, 18, 19, 20.
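The two-module arrangement of claim 14 might be organized as in the following Python sketch. All names here (`OcrModule`, `AntispamEngine`, `recognize`, `first_match`, `is_spam`) are hypothetical, the OCR backend is a stand-in callable rather than a real OCR library, and the matching step is a plain case-insensitive substring search used only as a placeholder for whatever matching the engine actually performs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class OcrModule:
    # Stand-in for a real OCR backend (hypothetical): image bytes in, text out.
    recognize: Callable[[bytes], str]

    def text_from_image(self, image: bytes) -> str:
        """Generate text from an image taken from an email."""
        return self.recognize(image)


@dataclass
class AntispamEngine:
    # The plurality of expressions the engine selects from.
    expressions: List[str] = field(default_factory=list)

    def first_match(self, text: str) -> Optional[str]:
        """Select each expression in turn and use it as the reference for
        searching the text; return the first one found, else None."""
        haystack = text.casefold()
        for expression in self.expressions:
            if expression.casefold() in haystack:
                return expression
        return None


def is_spam(image: bytes, ocr: OcrModule, engine: AntispamEngine) -> bool:
    """Run the OCR module, then hand its text to the antispam engine."""
    return engine.first_match(ocr.text_from_image(image)) is not None


# Usage with a fake OCR function that simply decodes the bytes as ASCII.
ocr = OcrModule(recognize=lambda image: image.decode("ascii"))
engine = AntispamEngine(expressions=["free money", "act now"])
flagged = is_spam(b"ACT NOW and save", ocr, engine)
```

Keeping the OCR backend behind a callable means the engine can be exercised with canned text, independently of any particular OCR library.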
Specification