Methods, apparatus and computer program products for information retrieval and document classification utilizing a multidimensional subspace

  • US 6,701,305 B1
  • Filed: 10/20/2000
  • Issued: 03/02/2004
  • Est. Priority Date: 06/09/1999
  • Status: Expired due to Term
First Claim
1. A method of retrieving information from a text data collection that comprises a plurality of documents with each document comprised of a plurality of terms, wherein the text data collection is represented by a term-by-document matrix having a plurality of entries with each entry being the frequency of occurrence of a term in a respective document, and wherein the method comprises:

  • receiving a query;

  • projecting a representation of at least a portion of the term-by-document matrix into a lower dimensional subspace to thereby create at least those portions of a subspace representation Ak relating to a term identified by the query;

  • weighting at least those portions of a subspace representation Ak relating to a term identified by the query following the projection into the lower dimensional subspace;

  • scoring the plurality of documents with respect to the query based at least partially upon the weighted portion of the subspace representation Ak; and

  • identifying respective documents based upon relative scores of the documents with respect to the query.
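
The steps recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation: it assumes the subspace is obtained from a truncated singular value decomposition, and it assumes a simple unit-length weighting of each query-term row, neither of which is specified by the claim text above. The function name `rank_documents`, the `vocab` mapping, and the toy matrix `A` are hypothetical and exist only for illustration.

```python
import numpy as np

def rank_documents(A, vocab, query_terms, k):
    """Rank documents for a query using rows of a rank-k approximation A_k.

    A           : term-by-document matrix of raw term frequencies
    vocab       : dict mapping a term to its row index in A (hypothetical)
    query_terms : list of terms taken from the received query
    k           : dimension of the subspace (assumed truncated SVD)
    """
    # Project into a k-dimensional subspace (assumption: truncated SVD).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Build only the rows of A_k that relate to terms named in the query.
    rows = np.array([vocab[t] for t in query_terms if t in vocab], dtype=int)
    Ak_rows = (U[rows, :k] * s[:k]) @ Vt[:k, :]

    # Weight the rows AFTER projection (assumption: scale each row to unit
    # length so no single query term dominates the score).
    norms = np.linalg.norm(Ak_rows, axis=1, keepdims=True)
    weighted = Ak_rows / np.where(norms > 0, norms, 1.0)

    # Score each document by summing the weighted rows of the query terms.
    scores = weighted.sum(axis=0)

    # Identify documents by relative score, highest first.
    order = np.argsort(-scores, kind="stable")
    return order, scores

# Toy collection: rows are terms, columns are four documents.
vocab = {"car": 0, "engine": 1, "flower": 2, "petal": 3}
A = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

order, scores = rank_documents(A, vocab, ["flower"], k=2)
print("ranking:", order.tolist())
print("scores:", np.round(scores, 3).tolist())
```

Computing only the query-term rows of Ak, rather than the full matrix, mirrors the claim's wording "at least those portions" and keeps the per-query cost proportional to the number of query terms.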
