Method and Apparatus for Detecting Spam User Created Content
Abstract
The present invention provides methods, apparatuses and systems directed to automatically detecting spam user created content. In a particular implementation, there is provided a method for processing spam contents, which comprises: maintaining a plurality of key information databases; receiving user-created content and at least one of a service ID and a content category ID of the user-created content from one or more users of a user-created content hosting site; selecting one of the plurality of key information databases based on at least one of the service ID and the content category ID; extracting second key information from the received user-created content; searching the selected key information database for first key information related to the second key information; classifying the received user-created content as spam content based on the extracted second key information and/or the first key information related to the second key information; and conditionally storing the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content. Said first and second key information may comprise at least one of predetermined type(s) of data, word(s) and phrase(s) in said contents, wherein said data comprises a user ID, a universal resource locator, a site address, an account number and/or a telephone number. In addition, said method may further comprise: determining whether the extracted second key information corresponds to predefined restricted information; and if the extracted second key information corresponds to the predefined restricted information, removing the extracted second key information and/or replacing the extracted second key information with predefined different information.
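The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `SpamFilter`, the two regular expressions, the in-memory dictionaries, the fallback to a service-wide database, and the choice to store only non-spam content are all assumptions made for illustration. Here "second key information" is taken to be URLs and telephone numbers extracted from the content, and "first key information" is whatever entries in the selected database match them.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for two of the "predetermined types of data"
# named in the abstract (site addresses and telephone numbers).
URL_RE = re.compile(r"https?://[^\s]+|www\.[^\s]+", re.IGNORECASE)
PHONE_RE = re.compile(r"\b\d{2,4}-\d{3,4}-\d{4}\b")


@dataclass
class SpamFilter:
    # Assumption: one key information database (a set of known spam keys)
    # per (service ID, content category ID) pair.
    databases: dict = field(default_factory=dict)
    # Assumption: a plain list stands in for the network accessible data store.
    store: list = field(default_factory=list)

    def select_database(self, service_id, category_id=None):
        # Prefer a category-specific database; otherwise fall back to a
        # service-wide one keyed by (service_id, None), or an empty set.
        specific = self.databases.get((service_id, category_id))
        return specific or self.databases.get((service_id, None), set())

    def extract_key_info(self, content):
        # "Second key information": URLs and phone numbers found in the content.
        found = URL_RE.findall(content) + PHONE_RE.findall(content)
        return {item.lower().rstrip(".,;") for item in found}

    def process(self, content, service_id, category_id=None):
        db = self.select_database(service_id, category_id)
        second = self.extract_key_info(content)
        # "First key information": database entries matching the extracted keys.
        first = second & db
        is_spam = bool(first)
        # Conditional storage: in this sketch only non-spam content is stored.
        if not is_spam:
            self.store.append(content)
        return is_spam


spam_filter = SpamFilter(databases={
    ("blog", "travel"): {"www.spam.example"},
    ("blog", None): {"010-1234-5678"},
})
spam_filter.process("Cheap tickets at www.spam.example now", "blog", "travel")
spam_filter.process("Nice photos from Jeju", "blog", "travel")
```

Exact set intersection corresponds most closely to the "matching" language of claims 24 and 25; the broader "related to" wording of claims 1, 12 and 13 could cover looser lookups (for example, normalized or partial matches), which this sketch does not attempt.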
46 Citations
25 Claims
1. A method for processing spam contents, comprising the steps of:
maintaining a plurality of key information databases;
receiving user-created content at least one of a service identifier (ID) and a content category ID of said user-created content from one or more users of a user-created content hosting site;
selecting at least one of the plurality of key information databases based on at least one of the received service ID and the received content category ID;
extracting second key information from the received user-created content;
searching the selected key information database to retrieve first key information related to the second key information;
classifying the user-created content as spam content based on the extracted second key information and/or the retrieved first key information related to the second key information; and
conditionally storing the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

12. Logic encoded in one or more tangible media for execution and when executed operable to cause the one or more processors to:
maintain a plurality of key information databases;
receive user-created content at least one of a service identifier (ID) and a content category ID of said user-created content from one or more users of a user-created content hosting site;
select at least one of the plurality of key information databases based on at least one of the received service ID and the received content category ID;
extract second key information from the received user-created content;
search the selected key information database to retrieve first key information related to the second key information;
classify the user-created content as spam content based on the extracted second key information and/or the retrieved first key information related to the second key information; and
conditionally store the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content.

13. An apparatus for processing spam contents, said apparatus comprising:
a storage part configured to include a plurality of key information databases;
a communication part configured to receive user-created content and at least one of a service ID and a content category ID from one or more of users of a user-created content hosting site; and
a control part configured to select one of the plurality of key information databases based on at least one of the service ID and the content category ID, extract second key information from the received user-created content, search the selected key information database to retrieve first key information related to the extracted second key information, classify the received user-created content as spam content based on the extracted second key information and/or the retrieved first key information related to the first key information, and conditionally store the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23

24. A method for processing spam contents, comprising the steps of:
maintaining a plurality of key information databases;
receiving user-created content and at least one of a service ID and a content category ID of the user-created content from one or more users of a user-created content hosting site;
selecting one of the plurality of key information databases based on at least one of the service ID and the content category ID;
extracting second key information from the received user-created content;
searching the selected key information database to retrieve first key information matching the second key information;
if a match is found, classifying the user-created content as spam content; and
conditionally storing the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content.

25. An apparatus for processing spam contents, said apparatus comprising:
a storage part configured to include a key information database;
a communication part configured to receive user-created content and at least one of a service ID and a content category ID of the user-created content from one or more users of a user-created content hosting site; and
a control part configured to select one of the plurality of key information databases based on at least one of the service ID and the content category ID, extract second key information from the received user-created content, search the selected key information database to retrieve first key information matching the second key information, if a match is found, classify the user-created content as spam content and conditionally store the user-created content in a network accessible data store available to users of the user-created content hosting site based on classifying the user-created content as spam or non-spam content.

Specification