Method and apparatus for analysis of electronic communications containing imagery
Abstract
A method and apparatus are provided for analyzing an electronic communication containing imagery, e.g., to determine whether or not the electronic communication is a spam communication. In one embodiment, an inventive method includes detecting one or more regions of imagery in a received electronic communication and applying pre-processing techniques to locate regions (e.g., blocks or lines) of text in the imagery that may be distorted. The method then analyzes the regions of text to determine whether the content of the text indicates that the electronic communication is spam. In one embodiment, specialized extraction and rectification of embedded text followed by optical character recognition processing is applied to the regions of text to extract their content therefrom. In another embodiment, keyword recognition or shape-matching processing is applied to detect the presence or absence of spam-indicative words from the regions of text. In another embodiment, other attributes of extracted text regions, such as size, location, color and complexity are used to build evidence for or against the presence of spam.
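The keyword-recognition embodiment described in the abstract can be illustrated with a minimal sketch. It assumes that text has already been extracted from each located region by some OCR step, which is not shown. The keyword list, the look-alike character table, the `TextRegion` type, the area weighting, and the threshold are all hypothetical choices made for illustration; none of them comes from the patent.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword list; the source does not enumerate spam-indicative words.
SPAM_KEYWORDS = {"viagra", "lottery", "winner", "mortgage"}

# Hypothetical look-alike substitutions (OCR confusions or deliberate obfuscation).
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


@dataclass
class TextRegion:
    text: str             # content assumed to come from an OCR step (not shown)
    area_fraction: float  # share of the image covered by the region, 0..1


def normalize(text):
    """Lower-case the text, undo look-alike characters, and split into words."""
    cleaned = text.lower().translate(LOOKALIKES)
    return re.findall(r"[a-z]+", cleaned)


def keyword_hits(region):
    """Count words in the region that match the hypothetical keyword list."""
    return sum(1 for word in normalize(region.text) if word in SPAM_KEYWORDS)


def spam_score(regions, area_weight=1.0):
    """Accumulate evidence: each hit counts more when its region is larger."""
    score = 0.0
    for region in regions:
        score += keyword_hits(region) * (1.0 + area_weight * region.area_fraction)
    return score


def is_likely_spam(regions, threshold=1.5):
    """Apply an illustrative threshold to the accumulated score."""
    return spam_score(regions) >= threshold
```

For example, `is_likely_spam([TextRegion("Buy V1AGRA now", 0.5)])` normalizes the obfuscated word to a keyword and crosses the illustrative threshold, while a region reading "Meeting agenda for Monday" contributes no evidence. Weighting hits by region size is one simple way to combine text content with a region attribute; a real system would likely tune or learn these weights.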
160 Citations
44 Claims
1. A method for categorizing an electronic communication containing imagery, the method comprising the steps of:
- locating portions of said imagery having text regions therein; and
- analyzing said text regions to determine whether content of said text regions indicates that said electronic communication is likely to be unsolicited or unauthorized.
Dependent claims: 2-14.
15. A computer readable medium containing an executable program for categorizing an electronic communication containing imagery, where the program performs the steps of:
- locating portions of said imagery having text regions therein; and
- analyzing said text regions to determine whether content of said text regions indicates that said electronic communication is likely to be unsolicited or unauthorized.
Dependent claims: 16-28.
29. Apparatus for categorizing an electronic communication containing imagery, the apparatus comprising:
- means for locating portions of said imagery having text regions therein; and
- means for analyzing said text regions to determine whether content of said text regions indicates that said electronic communication is unsolicited and/or unauthorized.
30. A method for categorizing an electronic communication containing imagery, the method comprising the steps of:
- applying pre-processing techniques to said imagery in order to locate regions of text in said imagery;
- measuring one or more characteristics of sets of image pixels within said regions of text; and
- determining if one or more measured characteristics indicates that said electronic communication is likely to be unsolicited or unauthorized.
Dependent claims: 31-36.
37. A computer readable medium containing an executable program for categorizing an electronic communication containing imagery, where the program performs the steps of:
- applying pre-processing techniques to said imagery in order to locate regions of text in said imagery;
- measuring one or more characteristics of sets of image pixels within said regions of text; and
- determining if one or more measured characteristics indicates that said electronic communication is likely to be unsolicited or unauthorized.
Dependent claims: 38-43.
44. Apparatus for categorizing an electronic communication containing imagery, the apparatus comprising:
- means for applying pre-processing techniques to said imagery in order to locate regions of text in said imagery;
- means for measuring one or more characteristics of sets of image pixels within said regions of text; and
- means for determining if one or more measured characteristics indicates that said electronic communication is likely to be unsolicited or unauthorized.
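The measuring and determining steps recited in claims 30, 37 and 44 can be illustrated with a small sketch. It assumes a grayscale image held as a list of rows of 0..255 intensities and a text region that has already been located as a bounding box. The three measured characteristics (relative size, contrast, and a crude complexity estimate based on dark/light transitions) and every threshold are hypothetical stand-ins chosen for illustration, not values taken from the patent.

```python
def region_pixels(image, box):
    """Return the rows of pixels inside box = (top, left, bottom, right), half-open."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def measure(image, box, dark_level=128):
    """Measure illustrative characteristics of the pixels in one text region."""
    pixels = region_pixels(image, box)
    flat = [value for row in pixels for value in row]
    total = len(image) * len(image[0])

    # Complexity proxy: share of horizontally adjacent pixel pairs that
    # switch between dark and light (dense text produces many switches).
    transitions = sum(
        1
        for row in pixels
        for a, b in zip(row, row[1:])
        if (a < dark_level) != (b < dark_level)
    )
    pairs = sum(max(len(row) - 1, 0) for row in pixels)

    return {
        "area_fraction": len(flat) / total,
        "contrast": max(flat) - min(flat),
        "complexity": transitions / pairs if pairs else 0.0,
    }


def indicates_spam(measured, min_area=0.25, min_contrast=100, min_complexity=0.3):
    """Hypothetical rule: large, high-contrast, busy text regions count as evidence."""
    return (
        measured["area_fraction"] >= min_area
        and measured["contrast"] >= min_contrast
        and measured["complexity"] >= min_complexity
    )
```

A region covering half of a small test image with alternating dark and light pixels would satisfy this rule, while the same box over a uniformly white image would not, because its contrast and complexity are both zero. In practice such measurements would more plausibly feed a trained classifier than fixed thresholds.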
Specification