AUTOMATED DETECTION OF DECEPTION IN SHORT AND MULTILINGUAL ELECTRONIC MESSAGES
Abstract
A method and apparatus for automatically identifying harmful electronic messages, such as those presented in emails, on Craigslist or on Twitter, Facebook and other social media websites, features methodology for discriminating unwanted garbage communications (spam) and unwanted deceptive messages (scam) from wanted, truthful communications based upon patterns discernable from samples of each type of electronic communication. Methods are proposed that enable discrimination of wanted from unwanted communications in short electronic messages, such as on Twitter and for multilingual application.
14 Claims
1. A method of detecting deception in electronic messages, comprising:
(a) obtaining a first set of electronic messages;
(b) subjecting the first set to model-based clustering analysis to identify training data;
(c) building a first suffix tree using the training data for deceptive messages;
(d) building a second suffix tree using the training data for non-deceptive messages;
(e) assessing an electronic message to be evaluated via comparison of the message to the first and second suffix trees and scoring the degree of matching to both to classify the message as deceptive or non-deceptive based upon the respective scores.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
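Steps (c) through (e) of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the training data has already been labeled (the model-based clustering of step (b) is omitted), substitutes a naive depth-limited suffix trie for a true suffix tree, and uses average match depth as one possible "degree of matching" score. The names `SuffixTrie`, `build_trie`, `classify`, and the `max_depth` parameter are hypothetical.

```python
class SuffixTrie:
    """Naive character-level suffix trie (an illustrative stand-in for a suffix tree)."""

    def __init__(self, max_depth=8):
        self.root = {}
        self.max_depth = max_depth  # assumption: cap depth to keep the trie small

    def add(self, text):
        # Insert every suffix of the text, truncated to max_depth characters.
        for i in range(len(text)):
            node = self.root
            for ch in text[i:i + self.max_depth]:
                node = node.setdefault(ch, {})

    def match_length(self, text, start):
        # Length of the longest prefix of text[start:] that is present in the trie.
        node, depth = self.root, 0
        for ch in text[start:start + self.max_depth]:
            if ch not in node:
                break
            node = node[ch]
            depth += 1
        return depth

    def score(self, text):
        # Assumed scoring rule: average match depth over all start positions.
        if not text:
            return 0.0
        return sum(self.match_length(text, i) for i in range(len(text))) / len(text)


def build_trie(messages, max_depth=8):
    # Steps (c)/(d): one trie per class of training messages.
    trie = SuffixTrie(max_depth)
    for message in messages:
        trie.add(message.lower())
    return trie


def classify(message, deceptive_trie, truthful_trie):
    # Step (e): score the message against both tries and compare the scores.
    d = deceptive_trie.score(message.lower())
    t = truthful_trie.score(message.lower())
    label = "deceptive" if d > t else "non-deceptive"
    return label, d, t
```

Ties fall to "non-deceptive" in this sketch; that tie-breaking choice is an assumption, since the claim does not address equal scores.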
9. A method of detecting deception in an electronic message M, comprising the steps of:
(a) building training files D of deceptive messages and T of truthful messages;
(b) building suffix trees SD and ST for files D and T, respectively;
(c) traversing suffix trees SD and ST and determining different combinations and adaptive context;
(d) determining the cross-entropy ED and ET between the electronic message M and each of the suffix trees SD and ST, respectively;
then if ED > ET, classify message M as deceptive;
or if ET > ED, classify message M as truthful.
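The cross-entropy computation in step (d) of claim 9 can be sketched as follows. This is an illustration under stated assumptions, not the method of the specification: substring counts stand in for the counts a suffix tree would hold, "adaptive context" is read as backing off to the longest context seen in training, and the smoothing constants are arbitrary. The names `substring_counts`, `char_probability`, `cross_entropy`, `ALPHABET_SIZE`, and `SMOOTHING` are hypothetical. Under the textbook definition a lower cross-entropy indicates a closer match, so how ED and ET map onto the claim's comparison depends on the definitions in the specification; the sketch only computes the two quantities.

```python
import math
from collections import Counter

ALPHABET_SIZE = 128  # assumption: rough size of the character set
SMOOTHING = 0.1      # assumption: additive smoothing constant


def substring_counts(messages, max_order=3):
    """Count every substring of length 0..max_order+1, as a counted suffix tree could."""
    counts = Counter()
    for message in messages:
        for start in range(len(message)):
            counts[""] += 1  # empty context: total number of characters seen
            for length in range(1, max_order + 2):
                if start + length <= len(message):
                    counts[message[start:start + length]] += 1
    return counts


def char_probability(counts, context, ch):
    # Back off to the longest suffix of the context that occurred in training.
    while context and counts[context] == 0:
        context = context[1:]
    return (counts[context + ch] + SMOOTHING) / (counts[context] + SMOOTHING * ALPHABET_SIZE)


def cross_entropy(message, counts, max_order=3):
    """Average negative log2-probability per character of the message under the model."""
    if not message:
        raise ValueError("message must be non-empty")
    total_bits = 0.0
    for i, ch in enumerate(message):
        context = message[max(0, i - max_order):i]
        total_bits -= math.log2(char_probability(counts, context, ch))
    return total_bits / len(message)
```

In use, one would build `substring_counts(D)` and `substring_counts(T)` from the two training files and evaluate `cross_entropy(M, ...)` against each.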
10. A method for automatically categorizing an electronic message in a foreign language as wanted or unwanted, comprising the steps of:
(a) collecting a sample corpus of a plurality of wanted and unwanted messages in a domestic language with known categorization as wanted or unwanted;
(b) testing the corpus in the domestic language by an automated testing method to discern wanted and unwanted messages and scoring detection effectiveness associated with the automated testing method by comparing the automatic testing categorization results to the known categorization;
(c) translating the corpus into a foreign language with a translation tool;
(d) testing the corpus in the foreign language by the automated testing method and scoring detection effectiveness associated with the automated testing method;
(e) if the detection effectiveness score in the foreign language indicates acceptable detection accuracy, then using the testing method and the translation tool to categorize electronic messages as wanted or unwanted.
Dependent claims: 11, 12, 13.
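The validation loop of claim 10 can be sketched in a few lines. Everything here is a placeholder chosen for illustration: the toy classifier, the word-for-word "translation tool", the sample corpus, and the 0.9 acceptance threshold are all assumptions, and accuracy is used as one possible measure of "detection effectiveness". The function names are hypothetical.

```python
def detection_accuracy(classify, corpus):
    # Steps (b)/(d): fraction of messages whose predicted label matches the known label.
    correct = sum(1 for text, label in corpus if classify(text) == label)
    return correct / len(corpus)


def validate_translated_detection(corpus, classify, translate, threshold=0.9):
    # Step (b): score the testing method on the domestic-language corpus.
    domestic = detection_accuracy(classify, corpus)
    # Step (c): translate the corpus, keeping the known labels.
    translated = [(translate(text), label) for text, label in corpus]
    # Step (d): score the same testing method on the translated corpus.
    foreign = detection_accuracy(classify, translated)
    # Step (e): accept the method/translator pair only if the foreign score is high enough.
    return {"domestic": domestic, "foreign": foreign, "acceptable": foreign >= threshold}


# Toy stand-ins (assumptions, not real tools).
TOY_DICTIONARY = {"win": "gana", "cash": "dinero", "lunch": "almuerzo", "tomorrow": "mañana"}


def toy_translate(text):
    # Word-for-word lookup; unknown tokens pass through unchanged.
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.split())


def toy_classify(text):
    # Language-independent toy rule: repeated exclamation marks mark a message as unwanted.
    return "unwanted" if "!!!" in text else "wanted"


SAMPLE_CORPUS = [
    ("win cash now!!!", "unwanted"),
    ("lunch tomorrow", "wanted"),
    ("free cash today!!!", "unwanted"),
    ("meeting notes attached", "wanted"),
]
```

Calling `validate_translated_detection(SAMPLE_CORPUS, toy_classify, toy_translate)` returns the two scores and the acceptance flag; a real deployment would substitute an actual detector and translation tool.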
14. A system for detecting deception in communications, comprising:
a computer programmed with software that automatically analyzes a text message in digital form for deceptiveness by at least one of statistical analysis of text content to ascertain and evaluate psycho-linguistic cues that are present in the text message, authorship similarity analysis, and analysis to detect coded/camouflaged messages, and a computer having means to obtain the text message in digital form and store the text message within a memory of said computer, and the computer having means to access truth data against which the veracity of the text message can be compared and a graphical user interface through which a user of said system can control said system and receive results concerning the deceptiveness of the text message analyzed by said system.
Specification