Rating and controlling access to emails
Abstract
Computer-implemented methods are described for, first, characterizing a specific category of information content (pornography, for example) and then accurately identifying instances of that category of content within a real-time media stream, such as a web page, e-mail or other digital dataset. This content-recognition technology enables a new class of highly scalable applications to manage such content, including filtering, classifying, prioritizing, tracking, etc. An illustrative application of the invention is a software product for use in conjunction with web-browser client software for screening access to web pages that contain pornography or other potentially harmful or offensive content. A target attribute set of regular expressions, such as natural language words and/or phrases, is formed by statistical analysis of a number of samples of datasets characterized as “containing,” and another set of samples characterized as “not containing,” the selected category of information content. This list of expressions is refined by applying correlation analysis to the samples or “training data.” Neural-network feed-forward techniques are then applied, again using a substantial training dataset, for adaptively assigning relative weights to each of the expressions in the target attribute set, thereby forming a weighted list that is highly predictive of the information content category of interest.
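The first stage described in the abstract, forming a target attribute set from "containing" and "not containing" samples, can be illustrated with a minimal Python sketch. The abstract does not name the statistic used, so the document-frequency gap below, the `min_gap` parameter, the function names, and the sample texts are all assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def document_frequency(samples):
    """Fraction of samples in which each word appears at least once."""
    counts = Counter()
    for text in samples:
        counts.update(set(tokenize(text)))
    return {word: n / len(samples) for word, n in counts.items()}


def select_target_attributes(containing, not_containing, min_gap=0.5):
    """Keep words far more common in 'containing' samples than in the rest.

    The frequency-gap criterion is an assumed stand-in for the
    statistical analysis mentioned in the abstract.
    """
    df_pos = document_frequency(containing)
    df_neg = document_frequency(not_containing)
    return sorted(
        word for word, freq in df_pos.items()
        if freq - df_neg.get(word, 0.0) >= min_gap
    )


# Hypothetical training samples.
containing = ["win free cash now", "free cash prizes inside", "claim your free prize"]
not_containing = ["meeting notes attached", "lunch is free on friday", "see you at the meeting"]
print(select_target_attributes(containing, not_containing))
```

In a fuller pipeline the surviving expressions would then be refined by correlation analysis and given weights, as the abstract goes on to describe.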
24 Claims
1. A method of controlling access to offensive or harmful emails comprising:
in conjunction with a program executing on a digital computer, examining a downloaded email before the email is displayed to the user;
said examining operation including analyzing the email natural language content relative to a predetermined database of regular expressions to form a rating, the database including regular expressions previously associated with offensive or harmful emails; and
the database further including a relative weighting associated with each regular expression in the database for use in forming the rating;
comparing the rating of the downloaded email to a predetermined threshold rating;
if the rating indicating that the downloaded email is more offensive or harmful than an email having the threshold rating, preventing the downloaded email from being displayed to the user; and
incrementally adjusting the weighting associated with each regular expression in the database based on error data accumulated from analyzing content of emails.
Dependent claims: 2, 3
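As a rough illustration of the rating and threshold steps recited in claim 1, the following Python sketch sums the weights of matching regular expressions and compares the total with a threshold. The patterns, weights, threshold value, and function names are hypothetical, and the simple sum is an assumption; the claim does not say how the weights are combined.

```python
import re

# Hypothetical database: regular expression -> relative weight.
EXPRESSION_WEIGHTS = {
    r"\bfree\s+money\b": 3.0,
    r"\bact\s+now\b": 2.0,
    r"\bmeeting\b": -1.0,
}


def rate_email(body, weights=EXPRESSION_WEIGHTS):
    """Sum the weight of every expression that matches the email body."""
    return sum(
        weight
        for pattern, weight in weights.items()
        if re.search(pattern, body, flags=re.IGNORECASE)
    )


def should_block(body, threshold=4.0):
    """Return True when the rating exceeds the (assumed) threshold."""
    return rate_email(body) > threshold


print(should_block("Act now to get FREE money!"))
print(should_block("Free money talk at the meeting"))
```

A mail client would call `should_block` after download and before display, and withhold the message when it returns `True`.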
4. A computer-readable medium storing a computer program for use in conjunction with a program to rate an email relative to unwanted commercial solicitations, the program comprising instructions to:
identify natural language textual portions of the email and form a list of words that appear in the identified natural language textual portions of the email;
access a database of predetermined words that are associated with the unwanted commercial solicitations;
acquire a corresponding weight from the database for each such word having a match in the database so as to form a weighted set of terms;
calculate a rating for the email responsive to the weighted set of terms, the instructions to calculate including instructions to determine and take into account a total number of natural language words that appear in the identified natural language textual portions of the email; and
incrementally adjusting the weighting associated with each regular expression in the database based on error data accumulated from analyzing content of emails.
Dependent claims: 5, 6, 7, 8
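Claim 4 adds that the rating takes the total number of natural language words into account. One plausible reading, sketched below in Python, divides the summed weights of matched words by the total word count so that long emails are not penalized merely for length. The division, the word weights, and the choice to count every occurrence of a word are assumptions; the claim only requires that the total be taken into account.

```python
import re

# Hypothetical database of predetermined words and their weights.
WORD_WEIGHTS = {"free": 2.0, "offer": 1.5, "winner": 3.0}


def word_list(text):
    """List every natural language word in the text, in order."""
    return re.findall(r"[a-z]+", text.lower())


def rate_by_words(text, weights=WORD_WEIGHTS):
    """Average weight per word: matched weights divided by total words."""
    words = word_list(text)
    if not words:
        return 0.0
    weighted_terms = [weights[w] for w in words if w in weights]
    return sum(weighted_terms) / len(words)


print(rate_by_words("Free offer for you"))  # (2.0 + 1.5) / 4 = 0.875
```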
9. A method of analyzing content of an email, the method comprising:
identifying natural language textual portions of the email;
forming a word listing including all natural language words that appear in the textual portion of the email;
for each word in the word list, querying a preexisting database of selected words to determine whether or not a match exists in the database;
for each word having a match in the database, reading a corresponding weight from the database so as to form a weighted set of terms;
calculating a rating for the email responsive to the weighted set of terms; and
incrementally adjusting the weighting associated with each regular expression in the database based on error data accumulated from analyzing content of emails.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
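The final step of each independent claim, incrementally adjusting weights from error data, is not tied to a particular update rule in the claim text. The Python sketch below uses a simple delta-rule update with a logistic output as one possible interpretation; the learning rate, the labels, the per-email (rather than batched) update, and all names are assumptions.

```python
import math


def predict(weights, matched_words):
    """Squash the summed weights of matched words into a 0..1 rating."""
    total = sum(weights.get(w, 0.0) for w in set(matched_words))
    return 1.0 / (1.0 + math.exp(-total))


def adjust_weights(weights, matched_words, label, learning_rate=0.5):
    """Nudge each matched word's weight in proportion to the rating error.

    label is 1.0 for an unwanted email and 0.0 for a wanted one.
    """
    error = label - predict(weights, matched_words)
    for word in set(matched_words):
        if word in weights:
            weights[word] += learning_rate * error
    return error


# Hypothetical starting weights and labeled feedback.
weights = {"free": 0.0, "meeting": 0.0}
for _ in range(20):
    adjust_weights(weights, ["free"], label=1.0)     # unwanted email
    adjust_weights(weights, ["meeting"], label=0.0)  # wanted email
print(weights)
```

Each pass moves the weights by a small step, so words seen in misrated unwanted emails drift upward and words seen in misrated wanted emails drift downward.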
19. A method of controlling access to emails including an unwanted commercial solicitation comprising:
in conjunction with a program executing on a digital computer, examining a downloaded email before the email is displayed to the user;
said examining operation including analyzing the email natural language content relative to a predetermined database of regular expressions to form a rating, the database including regular expressions relating to unwanted commercial solicitations; and
the database further including a relative weighting associated with each regular expression in the database for use in forming the rating;
comparing the rating of the downloaded email to a predetermined threshold rating;
if the rating indicated that the downloaded email is more likely to include an unwanted commercial solicitation than an email having the threshold rating, preventing the downloaded email from being displayed to the user; and
incrementally adjusting the weighting associated with each regular expression in the database based on error data accumulated from analyzing content of emails.
Dependent claims: 20, 21, 22, 23, 24
Specification