Propagating relevance from labeled documents to unlabeled documents

  • US 8,019,763 B2
  • Filed: 02/27/2006
  • Issued: 09/13/2011
  • Est. Priority Date: 02/27/2006
  • Status: Expired due to Fees
First Claim

1. A computing device with a processor and memory for propagating relevance of labeled documents to unlabeled documents, comprising:

  • a document store that contains representations of documents, some of the documents being labeled with relevance to a query and others of the documents not being labeled with relevance to the query;

  • a graph component that creates a graph of the documents with the documents represented as nodes being connected by edges representing similarity between documents having content, including

      • a document similarity component that determines the similarity between two documents by retrieving the content of the two documents and generating a similarity metric based on the retrieved content of the two documents to indicate similarity between the retrieved content of the two documents;

      • a build graph component that builds a graph in which nodes representing similar documents are connected via edges, such that each node has an edge to a number of other nodes that are most similar to it;

      • a generate weights component that generates weights for the edges based on similarity of the documents represented by the connected nodes such that the weight for an edge between documents whose content are similar is higher than the weight for an edge between documents whose content are not as similar; and

      • a normalize weights component that normalizes the weights of the graph so that weights are relative to the weights of the connected documents;

  • a propagate relevance component that propagates relevance of the labeled documents to the unlabeled documents based on the similarity between documents as indicated by the weights of the edges of the graph connecting the documents such that an unlabeled document that is similar to labeled documents as represented by the weights of the edges of the graph has its relevance set based on the relevance of those similar labeled documents, the propagation of relevance for a first document includes determining the relevance for the first document based on a summation of weighted relevances of the documents wherein the weighted relevance of a second document to the first document is based on the weight of the edge of the graph connecting the second document to the first document and the relevance of the second document; and

  • a training component that trains a ranking function to generate relevance of a document to a query based on the documents, the query, and the labeled and propagated relevances

wherein the components comprise computer-executable instructions for execution by the processor.
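The graph and propagation components recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made only for illustration: cosine similarity over bag-of-words counts as the similarity metric, row normalization (each node's edge weights sum to 1) as the normalization, clamping of labeled documents during a fixed number of iterations, and all function names and sample documents. The training component is omitted.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed metric)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_knn_graph(docs, k):
    """Connect each document to its k most similar documents.

    Returns a list of dicts: weights[i][j] is the weight of edge i-j.
    Edges are stored in both directions so the graph stays symmetric.
    """
    vectors = [Counter(d.lower().split()) for d in docs]
    n = len(docs)
    weights = [dict() for _ in range(n)]
    for i in range(n):
        sims = [(cosine_similarity(vectors[i], vectors[j]), j)
                for j in range(n) if j != i]
        sims.sort(reverse=True)
        for s, j in sims[:k]:
            if s > 0:  # skip documents with no shared content
                weights[i][j] = s
                weights[j][i] = s
    return weights


def normalize_weights(weights):
    """Row-normalize so each node's edge weights sum to 1 (assumed scheme)."""
    normalized = []
    for row in weights:
        total = sum(row.values())
        normalized.append({j: w / total for j, w in row.items()} if total else {})
    return normalized


def propagate_relevance(norm, labels, iterations=50):
    """Set each unlabeled score to the weighted sum of its neighbors' scores.

    Labeled documents keep their given relevance on every iteration.
    """
    n = len(norm)
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        scores = [
            labels[i] if i in labels
            else sum(w * scores[j] for j, w in norm[i].items())
            for i in range(n)
        ]
    return scores


# Hypothetical sample data: documents 0 and 2 are labeled, 1 and 3 are not.
docs = [
    "graph ranking relevance propagation",
    "graph relevance propagation method",
    "cooking pasta recipe sauce",
    "pasta sauce recipe tomato",
]
labels = {0: 1.0, 2: 0.0}

graph = normalize_weights(build_knn_graph(docs, k=1))
scores = propagate_relevance(graph, labels)
print(scores)
```

In this sketch the unlabeled document that shares content with the relevant labeled document ends up with a higher propagated relevance than the one that shares content with the non-relevant labeled document. The labeled and propagated scores together would then serve as training data for a ranking function.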
