SPAM DETECTION FOR ONLINE SLIDE DECK PRESENTATIONS
First Claim
1. A computer-implemented method comprising:
receiving an electronic presentation, the electronic presentation comprising a plurality of slides, wherein at least one slide contains content for viewing by a user;
extracting content from a slide selected from the plurality of slides based on a determination that the selected slide contains content;
determining a plurality of features for each slide of the plurality of slides based on the content extracted from a corresponding slide;
assigning a classification to each slide based on the features determined for the corresponding slide, the assigned classification identifying the type of content contained within the corresponding slide;
applying a filter to each slide based on the features determined for the slide, the applied filter identifying whether the slide contains a predetermined plurality of alphanumeric characters;
determining whether each slide of the plurality of slides contains spam based on the applied filter and assigned classification to the slide;
adjusting the spam determination of each slide of the plurality of slides based on a location of a corresponding slide relative to the plurality of slides of the electronic presentation; and
determining whether the electronic presentation is spam based on the adjusted spam determination for each slide of the plurality of slides.
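The steps of claim 1 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent text: the phrase list, the URL-based classifier, the rule that a slide is spam only when both the filter and the classification agree, the double weight given to the first and last slides, and the 0.5 threshold.

```python
# Hypothetical phrase list standing in for the "predetermined plurality of
# alphanumeric characters" recited in the claim.
SPAM_PHRASES = ("call now", "free download", "watch online")
# Hypothetical set of content types treated as spam-prone.
SPAM_PRONE_TYPES = {"promo"}


def extract_features(text):
    """Derive simple features from the text extracted from one slide."""
    lowered = text.lower()
    has_url = any(marker in lowered for marker in ("http://", "https://", "www."))
    return {"text": lowered, "has_url": has_url}


def classify(features):
    """Toy rule standing in for a trained content-type classifier."""
    return "promo" if features["has_url"] else "prose"


def matches_filter(features):
    """Filter step: does the slide contain any predetermined phrase?"""
    return any(phrase in features["text"] for phrase in SPAM_PHRASES)


def slide_is_spam(classification, filter_hit):
    """Assumed combination rule: both signals must agree."""
    return filter_hit and classification in SPAM_PRONE_TYPES


def location_weight(index, total):
    """Assumed adjustment: opening and closing slides count double."""
    return 2.0 if index in (0, total - 1) else 1.0


def presentation_is_spam(slides, threshold=0.5):
    """Weighted share of spam slides compared against an assumed threshold."""
    total = len(slides)
    spam_weight = 0.0
    all_weight = 0.0
    for index, text in enumerate(slides):
        weight = location_weight(index, total)
        all_weight += weight
        if not text.strip():
            continue  # content is extracted only from slides that contain content
        features = extract_features(text)
        if slide_is_spam(classify(features), matches_filter(features)):
            spam_weight += weight
    return all_weight > 0 and spam_weight / all_weight >= threshold


spam_deck = ["Free download at www.example.com", "Agenda", "Call now: https://example.com"]
clean_deck = ["Intro", "See www.example.com for the data", "Thanks"]
print(presentation_is_spam(spam_deck), presentation_is_spam(clean_deck))
```

The location adjustment is applied as a weight on each per-slide determination, which is only one of several ways the claimed adjusting step could be realized.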
Abstract
The disclosed systems and methods are directed to detecting spam in an electronic presentation and determining whether the electronic presentation should be moderated. The example systems and methods may employ one or more classifiers for classifying an electronic presentation and, should the electronic presentation fall within a predetermined classification, the electronic presentation may be analyzed further for the presence of spam. Further analysis of the electronic presentation may include invoking one or more filters to determine whether the electronic presentation includes words and/or phrases known to be associated with spam. Where the electronic presentation is determined to contain spam, the electronic presentation may be removed from a database of electronic presentations, excluded from search results, or flagged for moderation by a moderator.
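The flow described in the abstract (classify first, run phrase filters only for suspect classifications, then pick a moderation outcome) might be sketched as follows. The class names, the phrase list, and in particular the mapping from the number of phrase hits to an action are assumptions for illustration; the abstract lists the three possible outcomes but does not say how one is chosen.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    FLAG_FOR_MODERATION = "flag"
    EXCLUDE_FROM_SEARCH = "exclude"
    REMOVE = "remove"


# Hypothetical classifications that trigger further analysis.
SUSPECT_CLASSES = {"advertisement", "contact_info"}
# Hypothetical phrases "known to be associated with spam".
SPAM_PHRASES = ("free download", "call now", "watch online")


def count_phrase_hits(text):
    """Count how many of the known phrases appear in the text."""
    lowered = text.lower()
    return sum(1 for phrase in SPAM_PHRASES if phrase in lowered)


def choose_action(classification, text):
    """Gate on the classification, then escalate with the number of hits.

    The escalation ladder (1 hit: flag, 2 hits: exclude, 3 or more: remove)
    is an assumption made for this sketch.
    """
    if classification not in SUSPECT_CLASSES:
        return Action.NONE  # no further analysis outside suspect classes
    hits = count_phrase_hits(text)
    if hits == 0:
        return Action.NONE
    if hits == 1:
        return Action.FLAG_FOR_MODERATION
    if hits == 2:
        return Action.EXCLUDE_FROM_SEARCH
    return Action.REMOVE


print(choose_action("advertisement", "Free download, call now"))
```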
20 Claims
1. A computer-implemented method comprising:
receiving an electronic presentation, the electronic presentation comprising a plurality of slides, wherein at least one slide contains content for viewing by a user;
extracting content from a slide selected from the plurality of slides based on a determination that the selected slide contains content;
determining a plurality of features for each slide of the plurality of slides based on the content extracted from a corresponding slide;
assigning a classification to each slide based on the features determined for the corresponding slide, the assigned classification identifying the type of content contained within the corresponding slide;
applying a filter to each slide based on the features determined for the slide, the applied filter identifying whether the slide contains a predetermined plurality of alphanumeric characters;
determining whether each slide of the plurality of slides contains spam based on the applied filter and assigned classification to the slide;
adjusting the spam determination of each slide of the plurality of slides based on a location of a corresponding slide relative to the plurality of slides of the electronic presentation; and
determining whether the electronic presentation is spam based on the adjusted spam determination for each slide of the plurality of slides.
Dependent claims: 2, 3, 4, 5, 8
6. The computer-implemented method of claim 6, wherein modifying the electronic presentation comprises removing the electronic presentation from being discoverable by a search query applied to a plurality of electronic presentations.
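Removing a presentation from search discoverability, as recited above, could look like the following sketch. The `Presentation` type, the `discoverable` flag, and the title-substring search are hypothetical; the claim does not specify how the search index is implemented.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    title: str
    discoverable: bool = True  # hypothetical flag consulted by the search step


def hide_from_search(presentation):
    """Mark a presentation so that search queries no longer return it."""
    presentation.discoverable = False


def search(query, presentations):
    """Return titles of discoverable presentations whose title matches the query."""
    needle = query.lower()
    return [
        p.title
        for p in presentations
        if p.discoverable and needle in p.title.lower()
    ]


decks = [Presentation("Python tips"), Presentation("Python free download now")]
hide_from_search(decks[1])  # assume this deck was determined to be spam
print(search("python", decks))
```

The presentation stays in storage in this sketch; only its visibility to queries changes.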
9. A system comprising:
a non-transitory, computer-readable medium storing computer-executable instructions; and
one or more processors in communication with the non-transitory, computer-readable medium that, having executed the computer-executable instructions, are configured to:
receive an electronic presentation, the electronic presentation containing a plurality of slides, wherein at least one slide contains content for viewing by a user;
for each slide of the plurality of slides, determine a plurality of features for a corresponding slide, the determined features based on content extracted from the corresponding slide;
assign at least one classification to each slide of the plurality of slides based on the features determined for the corresponding slide;
determine whether a filter is satisfied for each slide of the plurality of slides, the filter identifying whether a given slide includes a plurality of alphanumeric characters;
determine a spam value for each slide of the plurality of slides, the spam value based on the assigned classification for the corresponding slide, whether the filter was satisfied for the corresponding slide, and a location of the corresponding slide relative to the plurality of slides; and
determine an overall spam value for the electronic presentation, the overall spam value based on each spam value determined for each slide of the plurality of slides.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
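Claim 9 recites a numeric spam value per slide and an overall value for the presentation. One way to picture this is the sketch below. The class weights, the 0.5 bonus for a satisfied filter, the up-to-20% boost for later slides, the cap at 1.0, and the use of a plain mean for the overall value are all assumptions chosen for illustration, not values taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    classification: str      # assumed output of an upstream classifier
    filter_satisfied: bool   # assumed output of the alphanumeric filter


# Hypothetical contribution of each classification to the spam value.
CLASS_WEIGHTS = {"promo": 0.4, "contact": 0.3}


def slide_spam_value(slide, index, total):
    """Combine classification, filter result, and slide location into one value."""
    base = CLASS_WEIGHTS.get(slide.classification, 0.0)
    if slide.filter_satisfied:
        base += 0.5
    # Relative location in [0, 1]; later slides are boosted by up to 20% (assumption).
    position = index / (total - 1) if total > 1 else 0.0
    return min(1.0, base * (1.0 + 0.2 * position))


def overall_spam_value(slides):
    """Assumed aggregation: the mean of the per-slide spam values."""
    if not slides:
        return 0.0
    values = [slide_spam_value(s, i, len(slides)) for i, s in enumerate(slides)]
    return sum(values) / len(values)


deck = [Slide("prose", False), Slide("promo", True)]
print(overall_spam_value(deck))
```

Other aggregations, such as a maximum or a weighted sum, would fit the claim language equally well.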
17. A non-transitory, computer-readable medium storing computer-executable instructions thereon that, when executed by one or more processors, cause the one or more processors to perform a method, the method comprising:
receiving an electronic presentation, the electronic presentation comprising a plurality of slides, wherein at least one slide contains content for viewing by a user;
extracting content from a slide selected from the plurality of slides based on a determination that the selected slide contains content;
determining a plurality of features for each slide of the plurality of slides based on the content extracted from a corresponding slide;
assigning a classification to each slide based on the features determined for the corresponding slide, the assigned classification identifying the type of content contained within the corresponding slide;
applying a filter to each slide based on the features determined for the slide, the applied filter identifying whether the slide contains a predetermined plurality of alphanumeric characters;
determining whether each slide of the plurality of slides contains spam based on the applied filter and assigned classification to the slide;
adjusting the spam determination of each slide of the plurality of slides based on a location of a corresponding slide relative to the plurality of slides of the electronic presentation; and
determining whether the electronic presentation is spam based on the adjusted spam determination for each slide of the plurality of slides.
Dependent claims: 18, 19, 20
Specification