Spam detection for user-generated multimedia items based on concept clustering
First Claim
1. A computer-implemented method for processing a multimedia item, comprising:
storing the multimedia item and associated metadata in a multimedia database, wherein the metadata includes a plurality of tokens;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of related concepts from concepts determined from the plurality of tokens included in the metadata;
clustering the related concepts into corresponding clusters; and
responsive to a number of distinct clusters exceeding a threshold, marking the multimedia item as spam.
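The steps of this claim can be illustrated with a minimal sketch. The claim does not say how a token is mapped to a concept, how relatedness is measured, or what the threshold is, so everything named below is a hypothetical stand-in: the `TOKEN_TO_CONCEPT` and `CONCEPT_TO_TOPIC` lookup tables, the rule that concepts sharing a broader topic fall into one cluster, and the default `max_clusters=2`.

```python
from collections import defaultdict

# Hypothetical token-to-concept lookup (assumption; the claim does not define it).
TOKEN_TO_CONCEPT = {
    "goal": "soccer", "striker": "soccer", "penalty": "soccer",
    "recipe": "cooking", "pasta": "cooking",
    "iphone": "phones", "loan": "finance", "casino": "gambling",
}

# Hypothetical relatedness rule: concepts sharing a broader topic are related.
CONCEPT_TO_TOPIC = {
    "soccer": "sports", "cooking": "food", "phones": "electronics",
    "finance": "money", "gambling": "money",
}


def concepts_for(tokens):
    """Determine a concept for each metadata token that has a known mapping."""
    return [TOKEN_TO_CONCEPT[t] for t in tokens if t in TOKEN_TO_CONCEPT]


def cluster_concepts(concepts):
    """Group related concepts into clusters keyed by their shared topic."""
    clusters = defaultdict(set)
    for concept in concepts:
        # A concept with no known topic forms its own single-concept cluster.
        clusters[CONCEPT_TO_TOPIC.get(concept, concept)].add(concept)
    return dict(clusters)


def is_spam(tokens, max_clusters=2):
    """Mark the item as spam when the number of distinct clusters exceeds the threshold."""
    clusters = cluster_concepts(concepts_for(tokens))
    return len(clusters) > max_clusters
```

Under these assumed tables, metadata whose tokens all relate to one topic yields a single cluster and is not marked, while metadata that mixes unrelated topics yields many clusters and is marked once the count passes `max_clusters`.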
2 Assignments
0 Petitions
Abstract
A system, a method, and various software tools enable a video hosting website to automatically identify posted video items that contain spam in the metadata associated with a respective video item. A spam detection tool for user-generated video items based on concept clustering is provided that facilitates the detection of spam in the metadata associated with a video item.
80 Citations
20 Claims
1. A computer-implemented method for processing a multimedia item, comprising:
storing the multimedia item and associated metadata in a multimedia database, wherein the metadata includes a plurality of tokens;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of related concepts from concepts determined from the plurality of tokens included in the metadata;
clustering the related concepts into corresponding clusters; and
responsive to a number of distinct clusters exceeding a threshold, marking the multimedia item as spam.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method for processing a multimedia item, comprising:
storing the multimedia item and associated metadata in a multimedia database, wherein the metadata includes a plurality of tokens;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of related concepts from concepts determined from the plurality of tokens included in the metadata;
clustering the related concepts into corresponding clusters; and
responsive to determining at least one combination of distinct clusters, marking the multimedia item as spam.
Dependent claims: 9, 10, 11
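Claim 8 differs from claim 1 only in its final step: the item is marked when at least one combination of distinct clusters is found, rather than when a cluster count exceeds a threshold. The claim text here does not say which combinations qualify, so the sketch below assumes a hypothetical, hand-written list `SUSPICIOUS_COMBINATIONS` and treats a combination as present when all of its cluster labels occur together for one item.

```python
# Hypothetical cluster combinations treated as spam signals when they co-occur.
# These labels and pairings are assumptions for illustration only.
SUSPICIOUS_COMBINATIONS = [
    frozenset({"sports", "money"}),
    frozenset({"food", "electronics", "money"}),
]


def has_flagged_combination(cluster_labels, combinations=SUSPICIOUS_COMBINATIONS):
    """Return True if every label of at least one listed combination is present."""
    labels = set(cluster_labels)
    # `combo <= labels` is the subset test: all of the combination's clusters occur.
    return any(combo <= labels for combo in combinations)
```

For example, an item whose metadata clusters are `sports` and `money` matches the first assumed combination, while one with only `sports` and `food` matches none.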
12. A system for processing a multimedia item, the system comprising:
a computer processor; and
a computer-readable storage medium storing executable code, the code when executed by the processor performs steps comprising:
storing the multimedia item and associated metadata in a multimedia database, wherein the metadata includes a plurality of tokens;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of related concepts from concepts determined from the plurality of tokens included in the metadata;
clustering the related concepts into corresponding clusters; and
marking the multimedia item as spam responsive to a number of distinct clusters exceeding a threshold.
Dependent claims: 13, 14, 15
16. A non-transitory computer-readable storage medium containing program code for processing a multimedia item, the program code when executed performs steps comprising:
storing the multimedia item and associated metadata in a multimedia database, wherein the metadata includes a plurality of tokens;
determining, for each of the plurality of tokens, a concept associated with the token;
determining a number of related concepts from concepts determined from the plurality of tokens included in the metadata;
clustering the related concepts into corresponding clusters; and
responsive to a number of distinct clusters exceeding a threshold, marking the multimedia item as spam.
Dependent claims: 17, 18, 19, 20
Specification