Ranking documents based on user behavior and/or feature data
First Claim
1. A method performed by one or more server devices, comprising:
storing, in a memory associated with the one or more server devices, feature data associated with a plurality of first links, within a plurality of first source documents, that point to a plurality of first target documents, the feature data, for one of the plurality of first links, including one or more features of one of the plurality of first source documents that contains the one of the plurality of links, one or more features of one of the plurality of first target documents that is pointed to by the one of the plurality of links, and one or more features of the one of the plurality of first links;
storing, in a memory associated with the one or more server devices, user behavior data relating to user navigational activity with regard to the plurality of first source documents accessed by one or more users and the plurality of first links within the plurality of first source documents selected by the one or more users;
training, using one or more processors of the one or more server devices and based on the feature data and the user behavior data, a model that identifies a probability that a particular link, with particular feature data, will be selected by a user, where training the model includes:
analyzing the feature data associated with each of the plurality of first links that was selected by the one or more users and the feature data associated with each of the plurality of first links that was not selected by the one or more users to generate rules for the model;
identifying, by one or more processors associated with the one or more server devices, a plurality of second links, within a plurality of second source documents, that point to a plurality of second target documents;
determining, using one or more processors associated with the one or more server devices, feature data associated with each of the plurality of second links, the feature data, associated with one of the plurality of second links, including one or more features of the one of the plurality of second links, one or more features of one of the plurality of second source documents that contains the one of the plurality of second links, and one or more features of the one of the plurality of second target documents that is pointed to by the one of the plurality of second links;
determining, using the model and based on the feature data, a probability that each of the plurality of second links will be selected by a user, where the determining includes:
inputting, into the model, the feature data associated with the one of the plurality of second links, and
outputting, by the model, the probability that the one of the plurality of second links will be selected by a user;
calculating, using one or more processors associated with the one or more server devices, a rank for a particular target document of the plurality of second target documents based on the probability associated with one or more of the plurality of second links that point to the particular target document; and
ordering the particular target document, with regard to at least one other document, based on the rank for the particular target document.
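The steps of claim 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the claim does not name a model type, so logistic regression trained by stochastic gradient descent is swapped in as an assumption. The three binary features, the training rows, the document names, and the choice to sum link probabilities into a rank are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, lr=0.5):
    """Fit logistic regression on (feature_vector, selected) pairs.

    `selected` is 1 if users followed the link and 0 if they did not.
    Logistic regression is an assumption; the claim only says "a model".
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - p
            bias += lr * err
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias

def click_probability(model, x):
    """Input a link's feature data, output its selection probability."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical features per first link:
# [link near top of page, large anchor font, target on same host]
training = [
    ([1.0, 1.0, 0.0], 1),  # selected
    ([1.0, 0.0, 0.0], 1),  # selected
    ([0.0, 1.0, 1.0], 0),  # not selected
    ([0.0, 0.0, 1.0], 0),  # not selected
]
model = train(training)

# Second links: (target document, feature data of the link).
second_links = [
    ("docA", [1.0, 1.0, 0.0]),
    ("docA", [0.0, 0.0, 1.0]),
    ("docB", [0.0, 1.0, 1.0]),
]

# Assumed aggregation: a target's rank is the sum of the selection
# probabilities of the links that point to it.
rank = {}
for target, feats in second_links:
    rank[target] = rank.get(target, 0.0) + click_probability(model, feats)

ordering = sorted(rank, key=rank.get, reverse=True)
```

Summing probabilities is only one way to satisfy "based on the probability"; the claim leaves the aggregation open.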
Abstract
A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.
19 Claims
1. A method performed by one or more server devices (set out in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 19
12. A method performed by one or more server devices, comprising:
storing, in one or more memories associated with the one or more server devices, feature data associated with a plurality of first links within a plurality of first source documents that point to a plurality of first target documents, the feature data including features of the first source documents, features of the first target documents, and features of the first links;
storing, in one or more memories associated with the one or more server devices, user behavior data relating to user navigational activity with regard to the first links within the first source documents selected by one or more users;
training, using one or more processors associated with the one or more server devices and based on the feature data associated with the feature data associated with the first links and the user behavior data relating to the first links, a model that identifies a probability that a particular link will be selected by a user, where training the model includes:
analyzing the feature data associated with the first links that were selected by the one or more users and the feature data associated with the first links that were not selected by the one or more users to generate rules for the model;
identifying a plurality of second links within a plurality of second source documents that point to a plurality of second target documents;
determining feature data associated with the second links, the feature data associated with the second links including features of the second source documents, features of the second target documents, and features of the second links;
determining, using the model, a probability that each of the second links will be selected using only the feature data associated with the second link as input to the model;
assigning a weight to each of the second links based on the probability that the second link will be selected;
assigning a rank to one of the second target documents based on the weights assigned to the second links that point to the one of the second target documents; and
ordering the one of the second target documents, with regard to at least one other document, based on the rank assigned to the one of the second target documents.
Dependent claims: 13, 14, 15
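Claim 12 adds an intermediate step: each link receives a weight derived from its selection probability, and a target's rank is computed from the weights of its inbound links. One way this could look, purely as an assumption, is a PageRank-style iteration in which a source passes rank along each outgoing link in proportion to that link's weight. The claim text above does not prescribe this formula; the damping factor, iteration count, and three-document graph below are hypothetical.

```python
def weighted_rank(links, damping=0.85, iterations=50):
    """Propagate rank along weighted links.

    links: list of (source, target, weight) tuples, where weight is
    assumed to equal the model's selection probability for the link.
    Returns a dict mapping each document to its rank.
    Sketch only: documents without outgoing links leak rank mass.
    """
    docs = sorted({d for s, t, _ in links for d in (s, t)})
    out_total = {d: 0.0 for d in docs}
    for s, _, w in links:
        out_total[s] += w

    rank = {d: 1.0 / len(docs) for d in docs}
    for _ in range(iterations):
        new = {d: (1.0 - damping) / len(docs) for d in docs}
        for s, t, w in links:
            if out_total[s] > 0:
                # A source shares its rank in proportion to link weight.
                new[t] += damping * rank[s] * w / out_total[s]
        rank = new
    return rank

# Hypothetical graph: A links to B (likely to be clicked) and to C
# (unlikely to be clicked); B and C each link back to A.
links = [
    ("A", "B", 0.9),
    ("A", "C", 0.1),
    ("B", "A", 1.0),
    ("C", "A", 1.0),
]
rank = weighted_rank(links)
ordering = sorted(rank, key=rank.get, reverse=True)
```

With uniform weights B and C would tie; here the higher weight on the A-to-B link lifts B above C, which is the effect the weighting step is meant to produce.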
16. One or more server devices, comprising:
means for storing, in a memory, feature data associated with a plurality of links within source documents that point to target documents, the feature data including data associated with features of the source documents, data associated with features of the links, and data associated with features of the target documents,
the data associated with the features of one of the source documents including at least one of an entire address of the source document, a portion of the address of the source document, information regarding a web site associated with the source document, a number of links in the source document, presence of words in the source document, presence of words in a heading of the source document, a topical cluster with which the source document is associated, or a degree to which a topical cluster associated with the source document matches a topical cluster associated with a link,
the data associated with the features of one of the links including at least one of a font size of anchor text associated with the link, a position of the link within a source document, a position of the link in a list, a font color associated with the link, attributes of the link, a number of words in the anchor text associated with the link, actual words in the anchor text associated with the link, a determination of commerciality of the anchor text associated with the link, a type of the link, a context of words before or after the link, a topical cluster with which the anchor text of the link is associated, whether the link leads to a target document on a same host or domain, or whether an address associated with the link embeds another address, and
the data associated with the features of one of the target documents including at least one of an entire address of the target document, a portion of the address of the target document, information regarding a web site associated with the target document, whether the address of the target document is on a same host as an address of a source document that links to the target document, whether the address of the target document is associated with a same domain as the address of the source document, words in the address of the target document, or a length of the address of the target document;
means for storing, in a memory, user behavior data relating to user navigational activity with regard to the source documents accessed by one or more users and the links within the source documents selected by the one or more users and the links within the source documents that were not selected by the one or more users;
means for training, based on the feature data and instances where the links were selected by the one or more users and instances where the links were not selected by the one or more users, a model that identifies a probability that a link, with particular feature data, will be selected by a user, where the means for training includes:
means for analyzing the feature data associated with the links that were selected by the one or more users and the feature data associated with the links that were not selected by the one or more users to generate rules for the model;
means for identifying a particular link within a first document that points to a second document;
means for determining the feature data associated with the particular link;
means for determining, based on inputting the feature data into the model, a probability that the particular link will be selected by a user;
means for assigning a weight to the particular link based on the probability that the particular link will be selected;
means for assigning a rank to the second document based on the weight assigned to the particular link; and
means for ordering the second document, with respect to at least one other document, based on the assigned rank.
Dependent claims: 17, 18
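Claim 16 enumerates three groups of feature data: source-document features, link features, and target-document features. The sketch below shows how a handful of them might be derived from a raw link observation. The `LinkRecord` type, its field names, and the dictionary keys are hypothetical names chosen for illustration; only a small subset of the claimed features is covered.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

@dataclass
class LinkRecord:
    """Raw observation of one link (hypothetical structure)."""
    source_url: str
    target_url: str
    anchor_text: str
    font_size: int           # font size of the anchor text
    position_in_source: int  # ordinal position of the link in the page
    links_in_source: int     # number of links in the source document

def feature_data(rec):
    """Derive a few of the features listed in claim 16."""
    src_host = urlparse(rec.source_url).hostname
    tgt_host = urlparse(rec.target_url).hostname
    return {
        # Source-document features
        "source.address": rec.source_url,
        "source.num_links": rec.links_in_source,
        # Link features
        "link.font_size": rec.font_size,
        "link.position": rec.position_in_source,
        "link.anchor_word_count": len(rec.anchor_text.split()),
        # Target-document features
        "target.address": rec.target_url,
        "target.same_host": src_host == tgt_host,
        "target.address_length": len(rec.target_url),
    }

rec = LinkRecord(
    source_url="http://example.com/a",
    target_url="http://example.com/b/c",
    anchor_text="read more here",
    font_size=12,
    position_in_source=3,
    links_in_source=40,
)
features = feature_data(rec)
```

Features such as topical clusters or commerciality of the anchor text would need additional classifiers and are left out of this sketch.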
Specification