Identifying Malicious Text In Advertisement Content
First Claim
1. A method comprising:
retrieving text included in advertisement content of an advertisement (“ad”) request for presentation to a user of an online system;
identifying one or more words included in the advertisement content;
identifying one or more characters included in each of the identified one or more words;
determining a type associated with each of the identified one or more characters;
determining a score associated with each word of the identified one or more words based at least in part on types associated with one or more characters identified in a word;
determining whether the advertisement content is malicious based at least in part on the determined scores associated with each word of the identified one or more words; and
determining whether the advertisement content is eligible for presentation to the user of the online system based at least in part on the determination of whether the advertisement content is malicious.
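The steps of claim 1 can be illustrated with a minimal sketch. The claim does not say what a character "type" is or how the score is computed, so everything concrete here is an assumption made for illustration: the type is taken to be the script prefix of the character's Unicode name, the score is the share of a word's letters that fall outside its most common type, and the names `char_type`, `word_score`, `is_malicious`, `is_eligible` and the threshold value are hypothetical.

```python
import re
import unicodedata
from collections import Counter


def char_type(ch: str) -> str:
    """Assumed notion of 'type': the script prefix of the Unicode name,
    e.g. 'LATIN SMALL LETTER A' -> 'LATIN'."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def word_score(word: str) -> float:
    """Assumed score: fraction of characters NOT of the most common type.
    0.0 means every character in the word shares one type."""
    types = Counter(char_type(ch) for ch in word)
    most_common_count = types.most_common(1)[0][1]
    return 1.0 - most_common_count / len(word)


def is_malicious(text: str, threshold: float = 0.1) -> bool:
    """Assumed decision rule: flag the text if any word mixes types
    beyond the (hypothetical) threshold."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    return any(word_score(w) > threshold for w in words)


def is_eligible(text: str) -> bool:
    """Content is eligible for presentation only if it is not flagged."""
    return not is_malicious(text)


# "p\u0430ypal" contains a Cyrillic 'а' in place of the Latin 'a'.
print(is_eligible("Free prizes today!"))
print(is_eligible("Log in to p\u0430ypal now"))
```

A rule this simple would also flag legitimate mixed-script words, which is one reason a score derived from observed probabilities may be preferable to a fixed mixing ratio.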
Abstract
An online system receives advertisement requests from one or more advertisers and determines whether an advertisement request includes malicious content before presenting content from the advertisement request to a user. To determine whether the advertisement request includes malicious content, the online system identifies text in the advertisement request, identifies words in the text, and identifies characters in each word. The online system identifies a most common type of character in each word and generates a score for each word based on its constituent characters. For example, a word's score is based on the combination of characters in the word, such as a conditional probability of a word including a type of character given that the word includes a given number of the most common type of character. The scores are analyzed to determine if text in the advertisement request includes malicious content.
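One way to read the conditional-probability example in the abstract is sketched below. It is an illustration under stated assumptions, not the patented method: the probabilities are estimated by counting over a small reference word list, add-one smoothing is applied, a word's score is the lowest such probability among its minority character types, and the names `char_type`, `profile` and `TypeModel` are hypothetical.

```python
import unicodedata
from collections import Counter, defaultdict


def char_type(ch: str) -> str:
    # Assumed "type": DIGIT for digits, else the script prefix of the Unicode name.
    if ch.isdigit():
        return "DIGIT"
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def profile(word: str):
    """Return (most common type, its count, set of the other types present)."""
    counts = Counter(char_type(ch) for ch in word)
    top_type, top_count = counts.most_common(1)[0]
    return top_type, top_count, set(counts) - {top_type}


class TypeModel:
    """Estimates P(word includes type t | word has n characters of its
    most common type) from a reference list of words."""

    def __init__(self, reference_words):
        self.totals = Counter()                # words seen per (top type, n)
        self.with_type = defaultdict(Counter)  # of those, how many include t
        for word in reference_words:
            top, n, others = profile(word)
            self.totals[(top, n)] += 1
            for t in others:
                self.with_type[(top, n)][t] += 1

    def prob(self, t: str, top: str, n: int) -> float:
        # Add-one smoothing (an assumption) so unseen combinations are not 0.
        seen = self.with_type[(top, n)][t]
        return (seen + 1) / (self.totals[(top, n)] + 2)

    def score(self, word: str) -> float:
        """Lowest conditional probability among the word's minority types;
        1.0 when the word contains a single type only."""
        top, n, others = profile(word)
        return min((self.prob(t, top, n) for t in others), default=1.0)


model = TypeModel(["sale", "deal", "shop", "best1"])
print(model.score("deal"))         # single type
print(model.score("shop2"))        # Latin plus a digit, seen in the reference list
print(model.score("s\u0430les"))   # Latin plus one Cyrillic letter, never seen
```

Under this reading a low score marks an unusual combination of character types, so a downstream check would compare the scores against a threshold chosen from labeled data.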
20 Claims
1. A method comprising:
retrieving text included in advertisement content of an advertisement (“ad”) request for presentation to a user of an online system;
identifying one or more words included in the advertisement content;
identifying one or more characters included in each of the identified one or more words;
determining a type associated with each of the identified one or more characters;
determining a score associated with each word of the identified one or more words based at least in part on types associated with one or more characters identified in a word;
determining whether the advertisement content is malicious based at least in part on the determined scores associated with each word of the identified one or more words; and
determining whether the advertisement content is eligible for presentation to the user of the online system based at least in part on the determination of whether the advertisement content is malicious.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method comprising:
retrieving text included in advertisement content of an advertisement (“ad”) request for presentation to a user of an online system;
identifying one or more words included in the advertisement content;
identifying one or more types associated with one or more characters in each of the identified one or more words;
scoring each word from the identified one or more words based at least in part on a probability associated with each of the identified one or more words, the probability associated with a word based at least in part on the one or more types associated with one or more characters in the word;
determining whether the advertisement content includes malicious content based at least in part on the determined scores; and
determining whether the advertisement content is eligible for presentation to the user of the online system based at least in part on the determination of whether the advertisement content includes malicious content.
Dependent claims: 13, 14, 15, 16, 17, 18
19. A computer program product comprising a computer-readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to:
retrieve text included in advertisement content of an advertisement (“ad”) request for presentation to a user of an online system;
identify one or more words included in the advertisement content;
identify one or more types associated with one or more characters in each of the identified one or more words;
score each word from the identified one or more words based at least in part on a probability associated with each of the identified one or more words, the probability associated with a word based at least in part on the one or more types associated with one or more characters in the word;
determine whether the advertisement content includes malicious content based at least in part on the determined scores; and
determine whether the advertisement content is eligible for presentation to the user of the online system based at least in part on the determination of whether the advertisement content includes malicious content.
Dependent claims: 20
Specification