
Selection of atoms for search engine retrieval

  • US 9,342,582 B2
  • Filed: 03/10/2011
  • Issued: 05/17/2016
  • Est. Priority Date: 11/22/2010
  • Status: Active Grant
First Claim

1. A method for populating one or more search indexes with atoms identified in a plurality of documents, the method comprising:

  • identifying a set of documents to be indexed in a search index;

  • for each document in the set of documents, identifying a plurality of atoms, the plurality of atoms comprising one or more unigrams, one or more n-grams, and one or more n-tuples;

  • based on the identified set of documents and the plurality of atoms, generating a list of atom/document pairs;

  • computing an information metric for each atom/document pair, wherein the information metric represents a pre-computed ranking of the atom used during a search query in relation to the particular document;

  • based on the information metric for each atom/document pair, selecting a subset of the atom/document pairs that are most relevant to the particular document from which the atoms were identified;

  • populating the search index using the subset of the atom/document pairs for the particular document, wherein identifying relevant documents for the search query from the search index is based on a pruning algorithm that computes a preliminary score for each of the documents to select a subset of the set of documents based on the preliminary score, wherein the preliminary score is computed using the information metric pre-computed for each atom/document pair and a simplified scoring function that approximates a final ranking algorithm utilized in identifying the relevant documents.
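
The steps recited in claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (`extract_atoms`, `build_index`, `preliminary_candidates`), the choice of bigrams as the n-grams and unordered word pairs as the n-tuples, the TF-IDF-style weight standing in for the information metric, and the plain sum of stored metrics standing in for the simplified scoring function. The final ranking algorithm is not shown.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def extract_atoms(text):
    """Count the atoms in a text: unigrams, bigrams, and unordered word pairs."""
    words = text.lower().split()
    atoms = Counter()
    for w in words:
        atoms[("unigram", w)] += 1
    for a, b in zip(words, words[1:]):            # adjacent words (n-grams, n = 2)
        atoms[("ngram", a, b)] += 1
    for a, b in combinations(sorted(set(words)), 2):  # co-occurring words (n-tuples)
        atoms[("tuple", a, b)] += 1
    return atoms


def build_index(docs, keep_per_doc=5):
    """Score every atom/document pair and index only the top pairs per document."""
    per_doc = {doc_id: extract_atoms(text) for doc_id, text in docs.items()}
    # Document frequency of each atom across the collection.
    df = Counter(atom for atoms in per_doc.values() for atom in atoms)
    n_docs = len(docs)
    index = defaultdict(dict)  # atom -> {doc_id: pre-computed metric}
    for doc_id, atoms in per_doc.items():
        # Assumed information metric: term frequency times a smoothed IDF.
        pairs = [(tf * math.log(1 + n_docs / df[atom]), atom)
                 for atom, tf in atoms.items()]
        pairs.sort(key=lambda p: (-p[0], p[1]))   # best metric first, stable ties
        for metric, atom in pairs[:keep_per_doc]:
            index[atom][doc_id] = metric
    return index


def preliminary_candidates(index, query, max_candidates=2):
    """Prune documents by summing the stored metrics of the query's atoms."""
    scores = Counter()
    for atom in extract_atoms(query):
        for doc_id, metric in index.get(atom, {}).items():
            scores[doc_id] += metric
    return [doc_id for doc_id, _ in scores.most_common(max_candidates)]


# Hypothetical three-document collection.
docs = {
    "d1": "search engine index",
    "d2": "cooking pasta recipes",
    "d3": "search index pruning",
}
index = build_index(docs, keep_per_doc=5)
print(preliminary_candidates(index, "search index"))
```

Because the metric is computed at indexing time, the query-time pruning pass only looks up and adds stored values. The documents that survive this pass would then be handed to the full ranking algorithm.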

  • 3 Assignments