System, method and computer program product for comprehensive collusion detection and network traffic quality prediction
Abstract
Embodiments disclosed herein seamlessly integrate several components into a comprehensive collusion detection and traffic quality prediction system, including a strong modeling module for processing historical click data and transforming potential collusions hidden therein into solvable graph partitioning (network) and/or vector space clustering (pattern) models, a scalable and robust toolkit comprising a plurality of graph partitioning and clustering heuristics for analyzing and generating high density subgraphs and high dimensional clusters or groups, and a post processing module for extracting entities from the subgraphs and clusters and placing them on global block lists. Entities thus listed can be blocked from client networks in real time. As such, high traffic quality can be predicted. A job scheduler may schedule individual jobs from the modeling module based on the number of available resources in a distributed computing environment to minimize completion time while balancing load.
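The abstract does not say how the job scheduler balances load, so the following is only a minimal sketch under an assumed policy: greedy longest-processing-time-first assignment, in which each job goes to the currently least-loaded worker. The function name `schedule_jobs`, the job cost estimates, and the worker count are hypothetical and do not come from the disclosure.

```python
import heapq


def schedule_jobs(job_costs, num_workers):
    """Assign jobs to workers with a greedy longest-processing-time rule.

    job_costs: dict mapping a job name to its estimated cost (assumed input).
    num_workers: number of available workers (assumed input).
    Returns (assignment, makespan), where assignment maps a worker index
    to its list of jobs and makespan is the largest total load.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    # Min-heap of (current load, worker index): the least-loaded worker is on top.
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}

    # Place the most expensive jobs first; ties are broken by job name.
    for job, cost in sorted(job_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job)
        heapq.heappush(heap, (load + cost, worker))

    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    plan, finish = schedule_jobs({"a": 5, "b": 4, "c": 3, "d": 3, "e": 1}, 2)
    print(plan, finish)
```

This rule is a heuristic: it tends to keep loads even and completion time low, but it does not guarantee the minimum possible completion time.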
20 Claims
1. A system, comprising:
- a modeling module embodied on a non-transitory computer readable medium for processing historical click data received from a client and generating one or more models, wherein each of the one or more models formulates potential collusion among entities in the historical click data as a graph partitioning problem, a vector space clustering problem, or a combination thereof;
- a toolkit comprising a plurality of heuristics for solving problems formulated by the modeling module, the problems including graph partitioning problems, vector space clustering problems, or a combination thereof, wherein the plurality of heuristics comprises:
  - a first set of heuristics for solving graph partitioning problems formulated by the modeling module to generate subgraphs of connected nodes representing entities involved in suspicious activities;
  - a second set of heuristics for solving vector space clustering problems formulated by the modeling module to generate high dimensional vector space clusters or groups of entities having similar patterns over a period of time; and
  - a third set of heuristics for transforming graphs into vector spaces and performing clustering associated therewith; and
- a post processor comprising a set of rules for filtering results from the toolkit, extracting entities of interest, and placing the entities of interest on global block lists, wherein filtering the results comprises eliminating subgraphs that do not meet a density requirement and removing known entities to eliminate or reduce false positives.

Dependent claims: 2, 3, 4, 5, 6.
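The post-processing step recited in claim 1 can be illustrated with a small sketch. It is not the claimed implementation. Connected components stand in for the output of the graph partitioning heuristics. The density measure (edges present divided by edges possible), the `min_density` threshold, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def connected_components(edges):
    """Return (components, adjacency) for an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj[u].add(v)
        adj[v].add(u)

    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first traversal collecting every node reachable from start.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components, adj


def density(nodes, adj):
    """Edges present among `nodes` divided by the maximum possible edges."""
    n = len(nodes)
    if n < 2:
        return 0.0
    edge_count = sum(len(adj[u] & nodes) for u in nodes) / 2
    return 2 * edge_count / (n * (n - 1))


def post_process(edges, known_entities, min_density=0.8):
    """Drop sparse subgraphs, remove known entities, return a block list."""
    components, adj = connected_components(edges)
    block_list = set()
    for component in components:
        if density(component, adj) >= min_density:
            block_list |= component - set(known_entities)
    return block_list


if __name__ == "__main__":
    edges = [("a", "b"), ("b", "c"), ("a", "c"),   # dense triangle
             ("d", "e"), ("e", "f"), ("f", "g")]   # sparse chain
    print(post_process(edges, known_entities={"c"}))
```

In the example, the triangle has density 1.0 and passes the filter, while the chain has density 0.5 and is eliminated. Entity `c` is treated as a known entity and is removed from the result.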
7. A method, comprising:
- at a server computer in a distributed computing environment, receiving historical click data from a client computer connected to the distributed computing environment over a network;
- the server computer processing the historical click data and generating one or more models, wherein each of the one or more models formulates potential collusion among entities in the historical click data as a graph partitioning problem, a vector space clustering problem, or a combination thereof, and wherein a modeling module performs the formulating;
- the server computer applying a plurality of heuristics to solve problems formulated by the modeling module, the problems including graph partitioning problems, vector space clustering problems, or a combination thereof, wherein the plurality of heuristics comprises:
  - a first set of heuristics for solving graph partitioning problems formulated by the modeling module to generate subgraphs of connected nodes representing entities involved in suspicious activities;
  - a second set of heuristics for solving vector space clustering problems formulated by the modeling module to generate high dimensional vector space clusters or groups of entities having similar patterns over a period of time; and
  - a third set of heuristics for transforming graphs into vector spaces and performing clustering associated therewith; and
- the server computer post processing results from the applying step, wherein the post processing comprises:
  - eliminating subgraphs that do not meet a density requirement;
  - removing known entities to eliminate or reduce false positives;
  - extracting entities of interest from high density subgraphs and high dimensional vector space clusters or groups; and
  - placing the entities of interest on global block lists.

Dependent claims: 8, 9, 10, 11, 12, 13.
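The third set of heuristics transforms graphs into vector spaces and clusters the result. The claims do not specify the transformation, so the sketch below assumes one simple choice: each node is represented by its row of the adjacency matrix, and nodes whose rows have high cosine similarity are grouped together. The threshold `min_sim`, the union-find grouping, and the function names are hypothetical.

```python
import math
from itertools import combinations


def adjacency_vectors(edges):
    """Map each node to its adjacency-matrix row (one dimension per node)."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {n: i for i, n in enumerate(nodes)}
    vectors = {n: [0.0] * len(nodes) for n in nodes}
    for u, v in edges:
        vectors[u][index[v]] = 1.0
        vectors[v][index[u]] = 1.0
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, min_sim=0.9):
    """Group nodes whose vectors are linked by similarity >= min_sim."""
    parent = {n: n for n in vectors}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= min_sim:
            parent[find(a)] = find(b)

    groups = {}
    for n in vectors:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


if __name__ == "__main__":
    # Hypothetical click graph: publishers p1 and p2 share the same sources.
    edges = [("p1", "s1"), ("p1", "s2"), ("p2", "s1"), ("p2", "s2"), ("p3", "s3")]
    print(cluster(adjacency_vectors(edges)))
```

Here `p1` and `p2` end up in one group because they connect to exactly the same neighbors, which is the kind of shared pattern a collusion model might flag. A production system would likely use sparse vectors and a scalable clustering method in place of the pairwise comparison.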
14. A computer program product comprising at least one non-transitory computer readable medium storing computer instructions translatable by at least one processor to implement:
- a modeling module for processing historical click data received from a client and generating one or more models, wherein each of the one or more models formulates potential collusion among entities in the historical click data as a graph partitioning problem, a vector space clustering problem, or a combination thereof;
- a toolkit comprising a plurality of heuristics for solving problems formulated by the modeling module, the problems including graph partitioning problems, vector space clustering problems, or a combination thereof, wherein the plurality of heuristics comprises:
  - a first set of heuristics for solving graph partitioning problems formulated by the modeling module to generate subgraphs of connected nodes representing entities involved in suspicious activities;
  - a second set of heuristics for solving vector space clustering problems formulated by the modeling module to generate high dimensional vector space clusters or groups of entities having similar patterns over a period of time; and
  - a third set of heuristics for transforming graphs into vector spaces and performing clustering associated therewith; and
- a post processor comprising a set of rules for filtering results from the toolkit, extracting entities of interest, and placing the entities of interest on global block lists, wherein filtering the results comprises eliminating subgraphs that do not meet a density requirement and removing known entities to eliminate or reduce false positives.

Dependent claims: 15, 16, 17, 18, 19, 20.
Specification