System and method of analyzing web content
Abstract
A system and method are provided for identifying inappropriate content in websites on a network. Unrecognized uniform resource locators (URLs) or other web content are accessed by workstations and are identified as possibly having malicious content. The URLs or web content may be preprocessed within a gateway server module or some other software module to collect additional information related to the URLs. The URLs may be scanned for known attack signatures, and if any are found, they may be tagged as candidate URLs in need of further analysis by a classification module.
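The signature scan mentioned in the abstract can be pictured with a minimal sketch. The two regular expressions, the `ScanResult` type, and the `scan` function below are hypothetical illustrations and are not taken from the patent; an actual deployment would presumably draw on a maintained set of attack signatures.

```python
import re
from dataclasses import dataclass, field

# Hypothetical signatures, for illustration only.
SIGNATURES = {
    "hidden_iframe": re.compile(r"<iframe[^>]*(?:width|height)\s*=\s*[\"']?0", re.I),
    "js_unescape_eval": re.compile(r"eval\s*\(\s*unescape\s*\(", re.I),
}


@dataclass
class ScanResult:
    url: str
    matched: list = field(default_factory=list)

    @property
    def is_candidate(self) -> bool:
        # Any signature hit tags the URL as a candidate for further analysis.
        return bool(self.matched)


def scan(url: str, content: str) -> ScanResult:
    """Return the names of all signatures found in the content of a URL."""
    matched = [name for name, pattern in SIGNATURES.items() if pattern.search(content)]
    return ScanResult(url, matched)
```

A result whose `is_candidate` property is true would then be handed to the classification module; a URL with no matches is simply left untagged in this sketch.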
28 Claims
1. A computer-implemented method of identifying inappropriate web content, the method comprising:
- receiving a request for web content;
- comparing the request to data in a database;
- sending the request to a collection module if the request is not in the database;
- collecting, by the collection module, data related to the request; and
- determining a candidate status for the request based on the collected data.
Dependent claims: 2-14.
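The steps of claim 1 can be sketched roughly as follows. The `Gateway` class, its in-memory dictionary standing in for the database, and the rule that script content makes a request a candidate are all assumptions made for illustration; the claim itself does not fix any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    # Stand-in for the database of already categorized URLs (URL -> category).
    known: dict = field(default_factory=dict)
    # Data gathered by the collection module, keyed by URL.
    collected: dict = field(default_factory=dict)

    def collect(self, url: str, content: str) -> dict:
        """Collection module: gather data related to the request."""
        data = {"length": len(content), "has_script": "<script" in content.lower()}
        self.collected[url] = data
        return data

    def handle(self, url: str, content: str) -> str:
        """Return 'known', 'candidate' or 'not_candidate' for a request."""
        if url in self.known:  # compare the request to data in the database
            return "known"
        data = self.collect(url, content)  # not in the database: send to collection
        # Hypothetical rule: pages containing script need further analysis.
        return "candidate" if data["has_script"] else "not_candidate"
```

Only requests that miss the database reach `collect`, so known URLs incur no collection cost in this sketch.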

15. A system for selecting candidate URLs from a set of uncategorized URLs, the system comprising:
- a database storing the uncategorized URLs;
- a collection system configured to collect information related to the uncategorized URLs; and
- a data mining module configured to identify uncategorized URLs having a characteristic indicative of targeted content.
Dependent claims: 16-20.

21. A computer-implemented method of collecting data about URLs, the method comprising:
- providing a data mining module with a configuration plug-in, the data mining module having a plurality of dispatchers configured to operate independently of each other;
- receiving URL data into the data mining module for analysis;
- separating the URL data into work units, each work unit comprising a URL;
- determining whether one of the plurality of dispatchers is available for receiving a work unit;
- sending one of the work units to one of the dispatchers if available; and
- processing the sent work unit based on data provided by the configuration plug-in.
Dependent claims: 22-25.
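The flow of claim 21 might look like the single-threaded sketch below. It is deliberately simplified: the dispatchers here run one after another rather than independently, the newline-separated input format is an assumption, and `Dispatcher`, `DataMiningModule`, and `length_plugin` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def length_plugin(url: str) -> dict:
    """Hypothetical configuration plug-in: decides what is done with a work unit."""
    return {"url": url, "length": len(url)}


@dataclass
class Dispatcher:
    name: str
    busy: bool = False

    def process(self, work_unit: str, plugin: Callable[[str], dict]) -> dict:
        self.busy = True
        try:
            return plugin(work_unit)  # behaviour is supplied by the plug-in
        finally:
            self.busy = False


class DataMiningModule:
    def __init__(self, plugin: Callable[[str], dict], n_dispatchers: int = 2):
        self.plugin = plugin
        self.dispatchers = [Dispatcher(f"d{i}") for i in range(n_dispatchers)]

    def available(self) -> Optional[Dispatcher]:
        """Return the first dispatcher that is not busy, if any."""
        return next((d for d in self.dispatchers if not d.busy), None)

    def run(self, url_data: str) -> list:
        # Separate the URL data into work units, one URL per unit.
        work_units = [line.strip() for line in url_data.splitlines() if line.strip()]
        results = []
        for unit in work_units:
            dispatcher = self.available()
            if dispatcher is None:
                raise RuntimeError("no dispatcher available")
            results.append(dispatcher.process(unit, self.plugin))
        return results
```

Swapping `length_plugin` for another callable changes what each dispatcher does without touching the dispatch loop, which is the role the claim assigns to the configuration plug-in.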

26. A system for collecting data about URLs, the system comprising:
- a database for storing information about URLs;
- a pool of dispatchers, the dispatchers comprising asynchronous system processes each configured to receive URL data input and perform actions on the data; and
- a driver module configured to monitor the pool of dispatchers for available dispatchers, and send part of the URL data input to the available dispatchers.
Dependent claims: 27.
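One way to picture the driver and dispatcher pool of claim 26 is sketched below. Threads stand in for the asynchronous system processes named in the claim, the database is omitted, and the `dispatcher` and `drive` functions, the queues, and the placeholder "action" (measuring URL length) are assumptions made for illustration.

```python
import queue
import threading


def dispatcher(name, inbox, idle, results):
    """Worker loop: report availability, take one URL, act on it, repeat."""
    while True:
        idle.put(name)  # tell the driver this dispatcher is available
        url = inbox.get()
        if url is None:  # sentinel: shut down
            return
        results.put((name, url, len(url)))  # placeholder action on the data


def drive(urls, n_dispatchers=3):
    """Driver: wait for an available dispatcher and send it the next URL."""
    idle, results = queue.Queue(), queue.Queue()
    inboxes = {f"d{i}": queue.Queue() for i in range(n_dispatchers)}
    threads = [
        threading.Thread(target=dispatcher, args=(name, inbox, idle, results))
        for name, inbox in inboxes.items()
    ]
    for t in threads:
        t.start()
    for url in urls:
        name = idle.get()  # blocks until some dispatcher reports itself free
        inboxes[name].put(url)
    for inbox in inboxes.values():
        inbox.put(None)  # ask every dispatcher to stop after its pending work
    for t in threads:
        t.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

Because each dispatcher announces itself on the `idle` queue, the driver never has to poll; it only hands out part of the input when a dispatcher has reported that it is free.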

28. A system for identifying candidate URLs from a set of uncategorized URLs, the system comprising:
- means for storing the uncategorized URLs;
- means for collecting information related to the uncategorized URLs; and
- means for identifying the uncategorized URLs having a characteristic indicative of targeted content.
Specification