Detecting spam documents in a phrase based information retrieval system

  • US 20060294155A1
  • Filed: 06/28/2006
  • Published: 12/28/2006
  • Est. Priority Date: 07/26/2004
  • Status: Active Grant
First Claim

1. A computer implemented method for identifying spam documents in an information retrieval system, the method comprising:

  • maintaining a list of phrases, each phrase associated with a list of related phrases;

  • determining a number of related phrases expected to be present in a document for any phrase on the list of phrases;

  • determining for a document, and for at least one phrase in the document, an actual number of related phrases present in the document; and

  • identifying the document as a spam document by comparing the actual number of related phrases present in the document with the expected number of related phrases.
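
The four steps of the claim can be illustrated with a minimal Python sketch. The claim does not say how phrases are matched, how the expected number is obtained, or how the comparison is made, so everything concrete here is an assumption for illustration only: the sample phrase data, the function names, plain substring matching, a fixed expected count stored per phrase, and the rule that a document is flagged when its actual count exceeds the expected count.

```python
from typing import Dict, Set

# Hypothetical phrase list: each phrase maps to its list of related phrases.
RELATED: Dict[str, Set[str]] = {
    "dog": {"puppy", "leash", "kennel", "veterinarian", "dog food"},
}

# Hypothetical expected number of related phrases per phrase.
# The claim only says this number is "determined"; a fixed table is an assumption.
EXPECTED: Dict[str, int] = {"dog": 2}


def actual_related_count(document: str, phrase: str,
                         related: Dict[str, Set[str]]) -> int:
    """Count the related phrases of `phrase` that occur in the document.

    Uses naive case-insensitive substring matching (an assumption).
    Returns 0 when the phrase itself is absent from the document.
    """
    text = document.lower()
    if phrase not in text:
        return 0
    return sum(1 for r in related.get(phrase, set()) if r in text)


def is_spam(document: str,
            related: Dict[str, Set[str]],
            expected: Dict[str, int]) -> bool:
    """Flag the document if, for at least one phrase it contains, the actual
    number of related phrases exceeds the expected number (assumed rule)."""
    text = document.lower()
    for phrase in related:
        if phrase not in text:
            continue
        if actual_related_count(document, phrase, related) > expected.get(phrase, 0):
            return True
    return False


normal_doc = "My dog loves his leash."
stuffed_doc = "dog puppy leash kennel veterinarian dog food"
unrelated_doc = "Cats sleep in a kennel."

print(is_spam(normal_doc, RELATED, EXPECTED))
print(is_spam(stuffed_doc, RELATED, EXPECTED))
print(is_spam(unrelated_doc, RELATED, EXPECTED))
```

In this sketch a document that mentions a phrase alongside far more of its related phrases than expected is treated as likely keyword stuffing. A real system would need tokenized phrase matching and an expected count derived from a document collection, neither of which is specified in the claim.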
