Spam detection for user-generated multimedia items based on concept clustering
Abstract
A system, a method, and various software tools enable a video hosting website to automatically identify posted video items that contain spam in the metadata associated with a respective video item. A spam detection tool for user-generated video items based on concept clustering is provided that facilitates the detection of spam in the metadata associated with a video item.
14 Claims
1. A computer-implemented method for processing a video, comprising:
storing the video and associated metadata in a multimedia database of a hosting website, wherein the metadata includes a plurality of tokens that are provided from a user who uploaded the video to the hosting website;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of concepts related to content of the video and a number of distinct concepts unrelated to the content of the video from concepts determined from the plurality of tokens; and
responsive to the number of distinct concepts unrelated to the content of the video exceeding a threshold, marking the video as spam.
Dependent claims: 2, 3, 4, 5

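The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the token-to-concept table `TOKEN_TO_CONCEPT`, the set of content-related concepts, the function names, and the threshold value are all hypothetical, and the claim does not say how concepts are derived or how relatedness to the video content is established. The sketch also assumes that tokens with no known concept are ignored and that related concepts are counted as distinct concepts.

```python
from dataclasses import dataclass

# Hypothetical token-to-concept table; the claim does not specify how
# a concept is determined for a token.
TOKEN_TO_CONCEPT = {
    "kitten": "cats",
    "cat": "cats",
    "casino": "gambling",
    "poker": "gambling",
    "loan": "finance",
    "diet": "weight loss",
}


@dataclass
class ConceptCounts:
    related: int    # distinct concepts related to the video content
    unrelated: int  # distinct concepts unrelated to the video content


def count_concepts(tokens, token_to_concept, content_concepts):
    """Map each token to a concept, then count distinct related and unrelated concepts."""
    related, unrelated = set(), set()
    for token in tokens:
        concept = token_to_concept.get(token.lower())
        if concept is None:
            continue  # assumption: tokens without a known concept are skipped
        if concept in content_concepts:
            related.add(concept)
        else:
            unrelated.add(concept)
    return ConceptCounts(related=len(related), unrelated=len(unrelated))


def is_spam(counts, threshold):
    """Mark as spam when the number of distinct unrelated concepts exceeds the threshold."""
    return counts.unrelated > threshold


# Hypothetical upload: tags supplied by the user for a video about cats.
tokens = ["kitten", "cat", "casino", "poker", "loan", "diet"]
counts = count_concepts(tokens, TOKEN_TO_CONCEPT, content_concepts={"cats"})
spam = is_spam(counts, threshold=2)  # threshold value is illustrative only
```

In this example "casino" and "poker" collapse into the single concept "gambling", so the count is over distinct concepts rather than raw tokens, which is the distinction the claim language draws.
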
6. A computer-implemented method for processing a video, comprising:
storing the video and associated metadata in a multimedia database of a hosting website, wherein the metadata includes a plurality of tokens that are provided from a user who uploaded the video to the hosting website;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of concepts related to content of the video and a number of distinct concepts unrelated to the content of the video from concepts determined from the plurality of tokens; and
responsive to determining at least one combination of distinct concepts unrelated to the content of the video, marking the video as spam.
Dependent claims: 7, 8

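Claim 6 replaces the threshold test with a test for a combination of distinct unrelated concepts. One possible reading, sketched below in Python, is that the video is marked as spam when the unrelated concepts contain every member of some flagged combination. This reading is an assumption, as are the `SPAM_COMBINATIONS` list, the concept table, and the function names; the claim itself does not define how combinations are chosen.

```python
# Hypothetical token-to-concept table (not specified by the claim).
TOKEN_TO_CONCEPT = {
    "kitten": "cats",
    "casino": "gambling",
    "loan": "finance",
    "diet": "weight loss",
}

# Hypothetical combinations of concepts that, when all are present and
# unrelated to the video content, are treated as a spam signal.
SPAM_COMBINATIONS = [
    frozenset({"gambling", "finance"}),
]


def unrelated_concepts(tokens, token_to_concept, content_concepts):
    """Return the distinct concepts found in the tokens that are unrelated to the content."""
    found = {
        token_to_concept[t.lower()]
        for t in tokens
        if t.lower() in token_to_concept  # assumption: unknown tokens are skipped
    }
    return found - set(content_concepts)


def matches_spam_combination(unrelated, spam_combinations):
    """True when every concept of at least one flagged combination is present."""
    return any(combo <= unrelated for combo in spam_combinations)


unrelated = unrelated_concepts(["kitten", "casino", "loan"], TOKEN_TO_CONCEPT, {"cats"})
spam = matches_spam_combination(unrelated, SPAM_COMBINATIONS)
```
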
9. A system for processing a video, the system comprising:
a computer processor; and
a computer-readable storage medium storing executable code, the code when executed by the processor performs steps comprising:
storing the video and associated metadata in a multimedia database of the system, wherein the metadata includes a plurality of tokens that are provided from a user who uploaded the video to the system;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of concepts related to content of the video and a number of distinct concepts unrelated to the content of the video from concepts determined from the plurality of tokens; and
marking the video as spam responsive to the number of distinct concepts unrelated to the content of the video exceeding a threshold.
Dependent claims: 10, 11

12. A non-transitory computer-readable storage medium containing program code for processing a video, the program code for:
storing the video and associated metadata in a multimedia database of a hosting website, wherein the metadata includes a plurality of tokens that are provided from a user who uploaded the video to the hosting website;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of concepts related to content of the video and a number of distinct concepts unrelated to the content of the video from concepts determined from the plurality of tokens; and
responsive to the number of distinct concepts unrelated to the content of the video exceeding a threshold, marking the video as spam.
Dependent claims: 13, 14

Specification