CONTEXTUAL RANKING OF KEYWORDS USING CLICK DATA
First Claim
1. A method comprising:
- collecting usage data that indicates how users interact with annotations contained in documents presented to the users, wherein said annotations are associated with entities contained in the documents;
- based on the usage data, generating weights for features of a feature vector;
- identifying a set of identified entities within a document;
- determining a ranking for the identified entities that belong to said set of identified identities based, at least in part, on
  (a) feature vector scores for each of the identified entities, wherein the feature vector scores correspond to features in the feature vector; and
  (b) the weights generated for the features of the feature vector.
Abstract
Techniques are provided for ranking the entities that are identified in a document based on an estimated likelihood that a user will actually make use of the annotations. According to one disclosed approach, usage data that indicates how users interact with annotations contained in documents presented to the users is collected. Based on the usage data, weights are generated for features of a feature vector. The weights are then used to modify feature scores of entities, and the modified feature scores are used to determine how to annotate documents. Specifically, a set of entities is identified within a document. A ranking for the identified entities is determined based, at least in part, on (a) feature vector scores for each of the identified entities, and (b) the weights generated for the features of the feature vector. The document is then annotated based, at least in part, on the ranking.
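The flow described in the abstract (usage data, then feature weights, then a ranking of entities) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `UsageRecord`, `learn_weights`, and `rank_entities` do not come from the patent, and the weighting rule (mean feature value among clicked annotations minus the mean among ignored ones) is only an assumption chosen for simplicity, since the abstract does not say how the weights are derived.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class UsageRecord:
    """One annotation shown to a user: the entity's feature scores and whether it was clicked."""
    features: Sequence[float]
    clicked: bool


def learn_weights(records: List[UsageRecord]) -> List[float]:
    """Assumed rule (not from the source): for each feature, take the mean value among
    clicked annotations minus the mean value among ignored annotations."""
    n_features = len(records[0].features)
    clicked = [r.features for r in records if r.clicked]
    ignored = [r.features for r in records if not r.clicked]

    def mean(rows: List[Sequence[float]], j: int) -> float:
        return sum(row[j] for row in rows) / len(rows) if rows else 0.0

    return [mean(clicked, j) - mean(ignored, j) for j in range(n_features)]


def rank_entities(
    entity_scores: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Combine each entity's feature vector scores with the weights and sort best-first."""
    combined = {
        name: sum(w * s for w, s in zip(weights, scores))
        for name, scores in entity_scores.items()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy usage data: feature 0 separates clicked from ignored annotations, feature 1 does not.
records = [
    UsageRecord([1.0, 0.0], True),
    UsageRecord([1.0, 1.0], True),
    UsageRecord([0.0, 1.0], False),
    UsageRecord([0.0, 0.0], False),
]
weights = learn_weights(records)
ranking = rank_entities({"Paris": [0.9, 0.2], "hotel": [0.1, 0.8]}, weights)
print(weights)
print(ranking)
```

In this toy data the second feature receives a weight of zero because it is equally common among clicked and ignored annotations, so it stops influencing the ranking. A real system would presumably fit the weights with a proper learning method.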
28 Claims
13. A method for annotating a document, the method comprising:
- generating a weight for a particular feature, wherein the weight indicates how well the particular feature predicts whether annotations associated with entities will be used;
- identifying a set of entities within the document;
- generating a first set of scores by generating, for each entity in the set, a score for said particular feature;
- generating a second set of scores based on said first set of scores and said weight;
- establishing a ranking of the entities in the set of entities based, at least in part, on the second set of scores; and
- annotating one or more entities in the document based, at least in part, on said ranking.
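Read as a procedure, the steps of claim 13 might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function name `annotate_top_entities`, the `[[...]]` markers standing in for an annotation, the `top_k` cutoff, and the choice to form the second set of scores by multiplying each first score by the weight are all hypothetical.

```python
from typing import Dict


def annotate_top_entities(
    text: str, first_scores: Dict[str, float], weight: float, top_k: int
) -> str:
    """Weight per-entity feature scores, rank the entities, and mark the top_k in the text."""
    # Second set of scores: assumed here to be first score times the feature's weight.
    second_scores = {entity: score * weight for entity, score in first_scores.items()}

    # Ranking: entities ordered by weighted score, highest first.
    ranking = sorted(second_scores, key=second_scores.get, reverse=True)

    # Annotation: wrap the first occurrence of each top-ranked entity in [[...]] markers.
    for entity in ranking[:top_k]:
        text = text.replace(entity, f"[[{entity}]]", 1)
    return text


annotated = annotate_top_entities(
    "Flights to Paris and a hotel near the Louvre",
    first_scores={"Paris": 0.9, "hotel": 0.4, "Louvre": 0.7},
    weight=0.5,
    top_k=2,
)
print(annotated)
```

With a single feature and a positive weight, the weighted order matches the unweighted one. The weight starts to change the ranking once scores for several features are combined.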
15. A computer-readable storage medium storing instructions, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
- collecting usage data that indicates how users interact with annotations contained in documents presented to the users, wherein said annotations are associated with entities contained in the documents;
- based on the usage data, generating weights for features of a feature vector;
- identifying a set of identified entities within a document;
- determining a ranking for the identified entities that belong to said set of identified identities based, at least in part, on
  (a) feature vector scores for each of the identified entities, wherein the feature vector scores correspond to features in the feature vector; and
  (b) the weights generated for the features of the feature vector.
27. A computer-readable storage medium storing instructions, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
- generating a weight for a particular feature, wherein the weight indicates how well the particular feature predicts whether annotations associated with entities will be used;
- identifying a set of entities within a document;
- generating a first set of scores by generating, for each entity in the set, a score for said particular feature;
- generating a second set of scores based on said first set of scores and said weight;
- establishing a ranking of the entities in the set of entities based, at least in part, on the second set of scores; and
- annotating one or more entities in the document based, at least in part, on said ranking.
Specification