Using content analysis to detect spam web pages

  • US 7,962,510 B2
  • Filed: 02/11/2005
  • Issued: 06/14/2011
  • Est. Priority Date: 02/11/2005
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving content by crawling a web page;

  • analyzing the content for web spam using a content-based identification technique, wherein the content-based identification technique comprises at least one of:

    determining a fraction of visible content to total content on the web page; or

    determining a ratio of compressed visible content to uncompressed visible content on the web page; and

  • classifying the content according to said analysis.
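
The two measures named in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how visible content is extracted, which compressor is used, or how the measures map to a classification, so everything below is an assumption: the set of non-rendered tags, the use of zlib, the function names, the thresholds `min_visible_fraction` and `min_compression_ratio`, and the rule that a low value of either measure is treated as suspicious. Crawling is left out; the functions take an already fetched HTML string.

```python
import zlib
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text that lies outside tags assumed to be non-rendered."""

    # Assumption: these tags approximate "not visible"; the claim gives no list.
    HIDDEN_TAGS = {"script", "style", "head", "title", "noscript"}

    def __init__(self):
        super().__init__()
        self._hidden_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if self._hidden_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the visible text of a page with whitespace collapsed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def visible_fraction(html):
    """Bytes of visible text divided by bytes of the whole page."""
    total = len(html.encode("utf-8"))
    if total == 0:
        return 0.0
    return len(visible_text(html).encode("utf-8")) / total


def compression_ratio(html):
    """Compressed size of the visible text divided by its uncompressed size."""
    raw = visible_text(html).encode("utf-8")
    if not raw:
        return 1.0  # nothing to compress; treated as neutral in this sketch
    return len(zlib.compress(raw)) / len(raw)


def classify(html, min_visible_fraction=0.05, min_compression_ratio=0.25):
    """Label a page using hypothetical thresholds on the two measures."""
    if visible_fraction(html) < min_visible_fraction:
        return "spam"
    if compression_ratio(html) < min_compression_ratio:
        return "spam"
    return "not spam"


if __name__ == "__main__":
    stuffed = "<html><body><p>" + "cheap pills " * 500 + "</p></body></html>"
    print(classify(stuffed), round(compression_ratio(stuffed), 3))
```

Heavily repeated text compresses far better than ordinary prose, which is why the keyword-stuffed sample page in the sketch falls below the assumed compression threshold. A real classifier would more plausibly feed both measures, alongside other features, into a trained model rather than rely on fixed cutoffs.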

  • 3 Assignments