System and method for indexing web content using click-through features

  • US 7,647,314 B2
  • Filed: 04/28/2006
  • Issued: 01/12/2010
  • Est. Priority Date: 04/28/2006
  • Status: Active Grant
First Claim

1. A method for indexing content items based on click-through features, the method comprising:

    generating a training set comprising one or more query-content item pairs, wherein a given query-content item pair has one or more click-through features associated therewith, the one or more click-through features including two or more of an average amount of time users stay on a website associated with a given URL, a spam score of the given URL, expected clicks at a position of the given URL in a given search results page and frequency of a query in a query log;

    labeling one or more query-content item pairs in the training set by assigning click score thereto based on the one or more click-through features thereof, wherein labeling a given content item in the training set comprises providing a given query-content item pair to a human judge to assign a click score;

    determining a click score function using a loss function based on the click scores of the labeled query-content item pairs and the click-through features thereof;

    applying the click score function to a plurality of unlabeled query-content item pairs to determine click scores thereof based on the one or more click-through features of the unlabeled query-content item pairs;

    generating an inverted click-through index of the unlabeled query-content item pairs and the associated query-score pairs, wherein a key to the index is a URL of the content item; and

    combining the inverted click-through index with a content index by associating the unlabeled query-content item pairs with content items in the content index.
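
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not fix the form of the click score function or the loss function, so a linear model fitted by gradient descent on squared error is assumed here purely for illustration. The two-feature layout (normalized dwell time, spam score), the sample queries, URLs, labels, and every function name are hypothetical.

```python
from collections import defaultdict

def fit_click_score_function(features, labels, lr=0.1, epochs=5000):
    """Fit a linear click score function by gradient descent on squared loss.

    Assumption: the claim only requires "a loss function"; squared error
    and a linear model are illustrative choices, not taken from the patent.
    """
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sum(w * xi for w, xi in zip(weights, x)) + bias - y
            for j, xi in enumerate(x):
                grad_w[j] += 2 * err * xi / n
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b

    def click_score(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias

    return click_score

def build_click_index(unlabeled_pairs, click_score):
    """Build an inverted click-through index keyed by URL.

    Each URL maps to a list of (query, click score) pairs.
    """
    index = defaultdict(list)
    for query, url, feats in unlabeled_pairs:
        index[url].append((query, click_score(feats)))
    return dict(index)

def combine_indexes(content_index, click_index):
    """Attach query-score pairs to content items that share the same URL."""
    return {
        url: {"content": doc, "click_queries": click_index.get(url, [])}
        for url, doc in content_index.items()
    }

# Hypothetical labeled training set: [normalized dwell time, spam score]
# with click scores as a human judge might assign them.
train_features = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.3], [0.1, 0.9]]
train_labels = [4.0, 1.0, 3.0, 0.5]
click_score = fit_click_score_function(train_features, train_labels)

# Hypothetical unlabeled query-content item pairs: (query, URL, features).
unlabeled = [
    ("cheap flights", "http://example.com/a", [0.8, 0.1]),
    ("flight deals", "http://example.com/a", [0.7, 0.2]),
    ("cheap flights", "http://example.com/spam", [0.2, 0.9]),
]
click_index = build_click_index(unlabeled, click_score)

# Hypothetical content index keyed by URL.
content_index = {
    "http://example.com/a": "Airline fare comparison page",
    "http://example.com/spam": "Keyword-stuffed landing page",
    "http://example.com/b": "Page with no click-through data",
}
combined = combine_indexes(content_index, click_index)
```

Keying both indexes by URL, as the claim specifies for the click-through index, reduces the combining step to a dictionary lookup per content item; content items without click-through data simply carry an empty list of query-score pairs.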
