Apparatus and method for gathering of objectionable web sites

  • US 20070005652A1
  • Filed: 03/21/2006
  • Published: 01/04/2007
  • Est. Priority Date: 07/02/2005
  • Status: Abandoned Application
First Claim

1. A harmful site collection apparatus comprising:

    a start uniform resource locator (URL) database (DB) storing URLs of harmful web pages;

    a URL examination and distribution unit providing URLs grouped in relation to predetermined hosts, the URLs obtained by removing redundant URLs that are different to each other but indicate identical web pages, among the URLs stored in the start URL DB, and then among the remaining URLs, removing URLs corresponding to web sites already collected;

    a web site collection unit collecting web contents of the web sites corresponding to the URLs received from the URL examination and distribution unit; and

    a URL extraction unit extracting URLs in the links included in the web contents collected by the web site collection unit, identifying harmless URLs based on top-level domain names and a harmless URL list among the extracted URLs, and removing the identified harmless URLs from the URLs that are the object of the collection.
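
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claim does not specify how redundant URLs are detected, how a "web site" is identified, or which top-level domains count as harmless, so everything below is an assumption made for illustration: the normalization rules in `normalize`, the use of the host name as the site identifier, the contents of `HARMLESS_TLDS` and `HARMLESS_HOSTS`, and all function names. The web site collection unit (the actual fetching of pages) is left out; the sketch takes already-fetched HTML as input.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, urlunsplit

# Hypothetical values; the claim only says "top-level domain names"
# and "a harmless URL list" without enumerating them.
HARMLESS_TLDS = {"gov", "edu", "mil"}
HARMLESS_HOSTS = {"www.example.org"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(url):
    """Map URLs that differ textually but name the same page to one form.

    Assumed rules: lowercase scheme and host, drop the default port,
    drop the fragment, and treat an empty path as "/".
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))


def examine_and_distribute(start_urls, collected_hosts):
    """URL examination and distribution: dedupe, skip collected sites,
    and group the remaining URLs by host."""
    groups = defaultdict(list)
    seen = set()
    for url in start_urls:
        norm = normalize(url)
        host = urlsplit(norm).hostname
        if norm in seen or host in collected_hosts:
            continue
        seen.add(norm)
        groups[host].append(norm)
    return dict(groups)


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_harmless(url):
    """Harmless if the TLD or the whole host is on an allow list."""
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return tld in HARMLESS_TLDS or host in HARMLESS_HOSTS


def extract_candidate_urls(base_url, html):
    """URL extraction: pull links from collected content and drop
    the ones identified as harmless."""
    parser = LinkParser()
    parser.feed(html)
    urls = (normalize(urljoin(base_url, href)) for href in parser.links)
    return [
        u for u in urls
        if u.startswith(("http://", "https://")) and not is_harmless(u)
    ]
```

In this sketch, the output of `extract_candidate_urls` would be fed back into `examine_and_distribute` on the next pass, so that newly discovered links are deduplicated and checked against already-collected hosts before any further collection.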
