Detection and handling of aggregated online content using decision criteria to compare similar or identical content items
First Claim
1. A method for evaluating online content items, the method comprising:
acquiring a first online content item from an online source, by a computer system using conventional webcrawling techniques, wherein the first content item is obtained via network connection;
generating, by the computer system, a characterizing signature for the first content item;
searching a cache memory architecture of the computer system for an instance of the characterizing signature;
identifying, by the computer system, first RSS feed data associated with the first online content item and second RSS feed data associated with a second online content item, wherein the second RSS feed data is identified from the cache memory architecture, and wherein the second online content item corresponds to the instance of the characterizing signature saved in the cache memory architecture;
evaluating, by the computer system, the first RSS feed data and the second RSS feed data; and
determining, by the computer system, whether the first online content item or the second online content item comprises a content aggregator, based on the evaluating, wherein the content aggregator comprises a website presenting a duplicate version of original online content obtained from legitimate sources of the original online content.
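The evaluating and determining steps of claim 1 can be illustrated with a minimal sketch. The claim does not say which RSS fields are compared or how, so the rule below (the item with the later `pubDate` is the suspected aggregator) is an assumption made for illustration only, and the names `first_item_pub_date`, `likely_aggregator`, `ORIGINAL_RSS`, and `COPY_RSS` are hypothetical.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds; both dates carry an explicit +0000 offset
# so the parsed datetimes are timezone-aware and directly comparable.
ORIGINAL_RSS = (
    '<rss version="2.0"><channel><title>Source</title><item>'
    "<title>Story</title>"
    "<pubDate>Mon, 01 Jan 2024 10:00:00 +0000</pubDate>"
    "</item></channel></rss>"
)
COPY_RSS = (
    '<rss version="2.0"><channel><title>Mirror</title><item>'
    "<title>Story</title>"
    "<pubDate>Mon, 01 Jan 2024 12:30:00 +0000</pubDate>"
    "</item></channel></rss>"
)


def first_item_pub_date(rss_xml):
    """Return the pubDate of the first <item> in an RSS 2.0 document, or None."""
    root = ET.fromstring(rss_xml)
    text = root.findtext("./channel/item/pubDate")
    return parsedate_to_datetime(text) if text else None


def likely_aggregator(first_rss, second_rss):
    """Assumed rule: the item published later is the suspected aggregator."""
    first = first_item_pub_date(first_rss)
    second = first_item_pub_date(second_rss)
    if first is None or second is None or first == second:
        return "undetermined"
    return "first" if first > second else "second"


print(likely_aggregator(COPY_RSS, ORIGINAL_RSS))
```

A real system would likely weigh several feed fields rather than a single timestamp, since publication dates can be missing or misreported.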
Abstract
A computer-implemented method is presented herein. The method obtains a first content item from an online source, and then generates a characterizing signature of the first content item. The method continues by finding a previously-saved instance of the characterizing signature and retrieving data associated with a second content item (the second content item is characterized by the characterizing signature). The method continues by analyzing the data associated with the second content item, corresponding data associated with the first content item, and decision criteria. Thereafter, either the first content item or the second content item is identified as an original content item, based on the analyzing. The other content item can be flagged as an aggregated content item.
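The flow described in the abstract (generate a signature, look for a previously saved instance, then apply decision criteria) might be sketched as follows. The abstract does not name a signature scheme or the criteria, so this sketch assumes a SHA-256 hash over whitespace- and case-normalized text and a single "earlier publication time is the original" criterion. `ContentRecord`, `characterizing_signature`, and `evaluate` are hypothetical names, and a plain dictionary stands in for the saved signatures.

```python
import hashlib
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ContentRecord:
    url: str
    published: datetime  # assumed metadata used as the decision criterion


def characterizing_signature(text):
    """Assumed scheme: SHA-256 of whitespace- and case-normalized text."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def evaluate(text, record, cache):
    """Return (original_url, aggregated_url), or None if the signature is new."""
    signature = characterizing_signature(text)
    previous = cache.get(signature)
    if previous is None:
        # First time this signature is seen: save it for later comparisons.
        cache[signature] = record
        return None
    # Assumed decision criterion: the earlier-published item is the original.
    if record.published < previous.published:
        cache[signature] = record  # keep the presumed original in the cache
        return record.url, previous.url
    return previous.url, record.url


cache = {}
evaluate("Breaking  news\nstory",
         ContentRecord("https://origin.example/a", datetime(2024, 1, 1, 10)), cache)
result = evaluate("breaking news story",
                  ContentRecord("https://copy.example/a", datetime(2024, 1, 1, 12)), cache)
print(result)
```

An exact hash only matches identical normalized text; detecting merely similar items, as the title mentions, would need a different signature, which this sketch does not attempt.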
157 Citations
20 Claims
1. A method for evaluating online content items, the method comprising:
acquiring a first online content item from an online source, by a computer system using conventional webcrawling techniques, wherein the first content item is obtained via network connection;
generating, by the computer system, a characterizing signature for the first content item;
searching a cache memory architecture of the computer system for an instance of the characterizing signature;
identifying, by the computer system, first RSS feed data associated with the first online content item and second RSS feed data associated with a second online content item, wherein the second RSS feed data is identified from the cache memory architecture, and wherein the second online content item corresponds to the instance of the characterizing signature saved in the cache memory architecture;
evaluating, by the computer system, the first RSS feed data and the second RSS feed data; and
determining, by the computer system, whether the first online content item or the second online content item comprises a content aggregator, based on the evaluating, wherein the content aggregator comprises a website presenting a duplicate version of original online content obtained from legitimate sources of the original online content.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computing system for evaluating online content items, the computing system comprising:
system memory comprising a cache memory architecture configured to store instances of characterizing signatures associated with online content items; and
at least one processor, communicatively coupled to the system memory, the at least one processor configured to:
acquire a first online content item from an online source, using conventional webcrawling techniques, wherein the first content item is obtained via network connection;
generate a characterizing signature for the first content item;
search a cache memory architecture of the computer system for an instance of the characterizing signature;
identify first RSS feed data associated with the first online content item and second RSS feed data associated with a second online content item, wherein the second RSS feed data is identified from the cache memory architecture, and wherein the second online content item corresponds to the instance of the characterizing signature saved in the cache memory architecture;
assess first RSS feed data associated with a first online content item and second RSS feed data associated with a second online content item; and
determine whether the first online content item or the second online content item comprises a content aggregator, based on the assessment, wherein the content aggregator comprises a website presenting a duplicate version of original online content obtained from legitimate sources of the original online content.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A non-transitory, computer-readable medium containing instructions thereon, which, when executed by a processor, are capable of performing a method comprising:
acquiring a first online content item from an online source, by the processor using conventional webcrawling techniques, wherein the first content item is obtained via network connection;
generating, by the processor, a characterizing signature for the first content item;
searching a cache memory architecture communicatively coupled to the processor for an instance of the characterizing signature;
identifying, by the processor, first RSS feed data associated with the first online content item and second RSS feed data associated with a second online content item, wherein the second RSS feed data is identified from the cache memory architecture, and wherein the second online content item corresponds to the instance of the characterizing signature saved in the cache memory architecture;
evaluating RSS feed data associated with a plurality of online content items; and
identifying, by the processor, at least one of the plurality of online content items as a content aggregator, based on the evaluating, wherein the content aggregator comprises a website presenting a duplicate version of original online content obtained from legitimate sources of the original online content.
Dependent claims: 17, 18, 19, 20
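The independent claims recite a cache memory architecture that stores signature instances and yields the associated RSS feed data, but the claims describe it only functionally. One possible reading is a bounded in-memory map keyed by signature. The sketch below assumes a least-recently-used eviction policy, which is not stated in the claims; `SignatureCache`, `lookup`, and `store` are hypothetical names.

```python
from collections import OrderedDict


class SignatureCache:
    """Bounded map from characterizing signature to saved RSS feed data.

    The LRU eviction policy is an assumption made for this sketch.
    """

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._entries = OrderedDict()

    def lookup(self, signature):
        """Return the feed data saved for this signature, or None if absent."""
        feed_data = self._entries.get(signature)
        if feed_data is not None:
            # Mark the entry as recently used.
            self._entries.move_to_end(signature)
        return feed_data

    def store(self, signature, feed_data):
        """Save feed data under the signature, evicting the oldest entries if full."""
        self._entries[signature] = feed_data
        self._entries.move_to_end(signature)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)


signatures = SignatureCache(capacity=2)
signatures.store("sig-a", {"link": "https://origin.example/a"})
signatures.store("sig-b", {"link": "https://origin.example/b"})
signatures.lookup("sig-a")  # touch sig-a so sig-b becomes the oldest entry
signatures.store("sig-c", {"link": "https://origin.example/c"})
print(signatures.lookup("sig-b"))  # evicted, so None
```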
Specification