Method and system for filtering content in a discovered topic

  • US 7,146,359 B2
  • Filed: 05/03/2002
  • Issued: 12/05/2006
  • Est. Priority Date: 05/03/2002
  • Status: Expired due to Fees
First Claim

1. A method of content filtering in a discovered topic comprising:

  • collecting querying data contained in a log said querying data having caused a retrieval of a collection of documents;

  • preprocessing said querying data, wherein said preprocessing comprises:

      • cleaning said querying data;

      • transforming said querying data into a querying data vector; and

      • clustering said querying data based on said querying data vector; and

  • postfiltering a collection of documents, said postfiltering comprising:

      • collecting actual document content data, said actual document content data related to documents which were retrieved based on said querying data contained in said log;

      • preprocessing said actual document content data, wherein said preprocessing comprises:

          • cleaning said actual document content data;

          • transforming said actual document content data into a document content data vector; and

          • clustering said collection of actual documents content data based on said document content data vector;

      wherein said postfiltering performs a similarity computation between said querying data cluster and said actual content data cluster to generate a collection of documents having content similar to said querying data wherein extraneous subject matter documents are excluded from said collection of documents.
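
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not name a cleaning procedure, a vector representation, a clustering algorithm, or a similarity measure. The sketch assumes term-frequency vectors, cosine similarity, a greedy single-pass clustering, and a fixed threshold of 0.3. All function names (`clean`, `vectorize`, `cluster`, `postfilter`), the stopword list, and the sample data are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed stopword list; the claim does not say what "cleaning" removes.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "and", "on", "is"}

def clean(text):
    """Lowercase, keep alphanumeric tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def vectorize(text):
    """Transform cleaned text into a sparse term-frequency vector."""
    return Counter(clean(text))

def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.3):
    """Greedy single-pass clustering (an assumed stand-in for the claimed step).

    Each vector joins the most similar existing cluster if the similarity
    reaches the threshold, otherwise it starts a new cluster.
    Returns a list of (summed_term_counts, member_indices) pairs.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c[0])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append((Counter(vec), [i]))
        else:
            best[0].update(vec)  # fold the vector into the cluster's summed counts
            best[1].append(i)
    return clusters

def postfilter(queries, documents, threshold=0.3):
    """Keep documents whose cluster is similar to the largest query cluster."""
    query_clusters = cluster([vectorize(q) for q in queries], threshold)
    doc_clusters = cluster([vectorize(d) for d in documents], threshold)
    # Assumption: the "discovered topic" is the largest cluster of logged queries.
    topic, _ = max(query_clusters, key=lambda c: len(c[1]))
    kept = []
    for centroid, members in doc_clusters:
        if cosine(topic, centroid) >= threshold:
            kept.extend(members)
    return [documents[i] for i in sorted(kept)]

if __name__ == "__main__":
    logged_queries = [
        "laptop battery replacement",
        "replace laptop battery",
        "laptop battery life",
    ]
    retrieved_docs = [
        "How to replace a laptop battery safely",
        "Laptop battery life tips and replacement guide",
        "Banana bread recipe with walnuts",
    ]
    for doc in postfilter(logged_queries, retrieved_docs):
        print(doc)
```

With this sample data the two battery documents fall into one cluster that overlaps the query cluster, while the recipe shares no terms with it and is dropped as extraneous subject matter. A real system would likely need weighting such as TF-IDF and a tuned threshold; those choices are outside what the claim specifies.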

  • 2 Assignments