Email analysis using fuzzy matching of text
Abstract
Translation of text or messages provides a message that is more reliably or efficiently analyzed for purposes such as detecting spam in email messages. One translation process takes into account statistics of erroneous and intentional misspellings. Another process identifies and removes characters or character codes that do not generate visible symbols in a message displayed to a user. Another process detects symbols such as periods, commas, and dashes that are interspersed in text in a way that does not unduly interfere with, or prevent, a user from perceiving a spam message. Another process can detect use of foreign language symbols and terms. Still other processes and techniques are presented to counter obfuscating spammer tactics and to provide for efficient and accurate analysis of message content. Groups of similar content items (e.g., words, phrases, images, ASCII text) are correlated, and analysis can proceed after substituting items in a group with other items in the same group so that a more accurate detection of “sameness” of content can be achieved. Dictionaries are used for spam or ham words or phrases. Other features are described.
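Two of the translation processes named in the abstract, removing character codes that generate no visible symbol and removing symbols interspersed in a word, can be sketched as follows. This is an illustrative sketch only, not the implementation described in the specification: the function names, the `SEPARATORS` set, and the rule that a run needs at least two separators between single letters are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical set of symbols a spammer might intersperse between letters.
SEPARATORS = ".,-*"
_SEP_CLASS = "[" + re.escape(SEPARATORS) + "]"

# A single letter followed by two or more (separator, single letter) pairs,
# not embedded in a longer run of letters. Assumption: shorter runs such as
# "a-b" are left alone to limit false positives.
_INTERSPERSED = re.compile(
    r"(?<![A-Za-z])[A-Za-z](?:" + _SEP_CLASS + r"[A-Za-z]){2,}(?![A-Za-z])"
)


def strip_invisible(text: str) -> str:
    """Drop format (Cf) and control (Cc) characters, keeping newlines and tabs."""
    return "".join(
        ch
        for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cf", "Cc")
    )


def collapse_interspersed(text: str) -> str:
    """Remove separators from runs such as 'f.r.e.e', giving 'free'."""
    return _INTERSPERSED.sub(
        lambda m: re.sub(_SEP_CLASS, "", m.group(0)), text
    )


def translate(text: str) -> str:
    """Strip invisible codes first so they cannot break up a separator run."""
    return collapse_interspersed(strip_invisible(text))


if __name__ == "__main__":
    # U+200B (zero-width space) is invisible when displayed.
    print(translate("Get f.r\u200b.e.e offers"))
```

The invisible characters are stripped before the separator pass because a zero-width character placed between a letter and a period would otherwise hide the run from the pattern. A real system would also need to handle legitimate abbreviations, which this sketch would collapse.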
20 Claims
1. A method for analyzing character codes in text, the method comprising:
parsing the character codes;
determining that a character code would create an undesirable message image when the character codes are displayed; and
processing the character code to produce translated text.
18. An apparatus for analyzing character codes in text, the apparatus comprising:
a processor;
a machine-readable medium including instructions executable by the processor for parsing the character codes;
determining that a character code would create an undesirable message image when the character codes are displayed; and
processing the character code to produce translated text.
19. A computer-readable medium including instructions executable by a processor for analyzing character codes in text, the computer-readable medium comprising:
one or more instructions for parsing the character codes;
one or more instructions for determining that a character code would create an undesirable message image when the character codes are displayed; and
one or more instructions for processing the character code to produce translated text.
20. An apparatus for analyzing character codes in text, the apparatus comprising:
a processor;
a machine-readable medium including instructions executable by the processor for parsing the character codes;
determining that a character code would create an undesirable message image when the character codes are displayed; and
processing the character code to produce translated text.
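The three steps recited in the independent claims (parsing the character codes, determining that a code is undesirable, and processing it to produce translated text) might be illustrated with the sketch below. It assumes one possible reading of "undesirable": a code that displays like a plain letter but is a different code, such as a digit or a foreign-alphabet look-alike. The `LOOKALIKES` table, the `SPAM_WORDS` dictionary, and all function names are hypothetical and are not taken from the claims.

```python
# Hypothetical table of characters that render like a plain Latin letter.
LOOKALIKES = {
    "0": "o",
    "1": "l",
    "3": "e",
    "@": "a",
    "$": "s",
    "\u0430": "a",  # Cyrillic small letter a
    "\u0435": "e",  # Cyrillic small letter ie
}

# Hypothetical dictionary of spam words to match after translation.
SPAM_WORDS = {"free", "loan"}


def parse(text: str) -> list[int]:
    """Step 1: parse the text into its character codes."""
    return [ord(ch) for ch in text]


def is_undesirable(code: int) -> bool:
    """Step 2: decide whether a code is a look-alike for a plain letter."""
    return chr(code) in LOOKALIKES


def process(codes: list[int]) -> str:
    """Step 3: substitute each flagged code to produce translated text."""
    chars = (
        LOOKALIKES[chr(code)] if is_undesirable(code) else chr(code)
        for code in codes
    )
    return "".join(chars).lower()


def spam_hits(text: str) -> list[str]:
    """Return the dictionary words found in the translated text."""
    translated = process(parse(text))
    return [word for word in translated.split() if word in SPAM_WORDS]


if __name__ == "__main__":
    print(spam_hits("Fr33 l0\u0430n today"))
```

Substituting every digit unconditionally would mangle genuine numbers, so a practical version would apply the table only where a digit sits among letters; the sketch omits that check for brevity.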
Specification