Contextual ranking of keywords using click data
9 Assignments
0 Petitions
Abstract
Techniques are provided for ranking the entities that are identified in a document based on an estimated likelihood that a user will actually make use of the annotations. According to one disclosed approach, usage data that indicates how users interact with annotations contained in documents presented to the users is collected. Based on the usage data, weights are generated for features of a feature vector. The weights are then used to modify feature scores of entities, and the modified feature scores are used to determine how to annotate documents. Specifically, a set of entities are identified within a document. A ranking for the identified entities is determined based, at least in part, on (a) feature vector scores for each of the identified entities, and (b) the weights generated for the features of the feature vector. The document is then annotated based, at least in part, on the ranking.
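The pipeline described in the abstract can be sketched in code. The following is a minimal illustration only; all function names, feature names, and data shapes are hypothetical rather than taken from the patent. It weights each feature by the click-through rate of annotations on entities having that feature, scores entities as a weighted sum over their feature vector, and annotates only the top-ranked entities:

```python
# Hypothetical sketch: click-derived feature weights rank candidate
# entities, and only the top-ranked entities receive annotations.
from collections import defaultdict

def learn_weights(click_log):
    """Weight each feature by the click-through rate of annotations
    on entities that have that feature. Each log entry pairs an
    entity's feature list with whether its annotation was clicked."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for features, was_clicked in click_log:
        for f in features:
            shown[f] += 1
            if was_clicked:
                clicked[f] += 1
    return {f: clicked[f] / shown[f] for f in shown}

def rank_entities(entities, weights):
    """Score each entity as the weighted sum of its feature scores,
    highest first."""
    def score(entity):
        return sum(weights.get(f, 0.0) * s
                   for f, s in entity["features"].items())
    return sorted(entities, key=score, reverse=True)

def annotate(document, entities, weights, k):
    """Build a display control for each of the top-k ranked entities."""
    top = rank_entities(entities, weights)[:k]
    controls = [f'<a class="annotation" data-entity="{e["name"]}">'
                f'{e["name"]}</a>' for e in top]
    return document, controls
```

In this sketch the "control" is represented as an HTML anchor, which is one plausible embodiment of a control for displaying additional information; the patent does not commit to a particular markup.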
101 Citations
32 Claims
1. A method comprising:
collecting usage data that indicates how frequently users interact with annotations for entities that are referenced in documents that are presented to the users;
based at least in part on the usage data, generating weights for features that are associated with the entities that are referenced in the documents;
wherein a particular weight of a particular feature is based at least in part on how frequently users interact with annotations of entities having the particular feature;
identifying a set of identified entities within a document;
determining a ranking for the identified entities that belong to said set of identified entities based, at least in part, on (a) feature scores for each of the identified entities, wherein the feature scores correspond to features associated with the identified entities, wherein the particular feature is associated with at least one of the identified entities; and (b) weights, including the particular weight, for the features that are associated with the identified entities;
based at least in part on the ranking, automatically selecting a subset of the identified entities for annotation, wherein the subset includes fewer than all of the identified entities;
automatically generating an annotated version of the document by, for each entity in the subset, adding to the document a control for displaying additional information about the entity, wherein the additional information about the entity and the control associated with the entity were not in the document before the step of automatically generating the annotated version of the document;
wherein at least the steps of generating the weights, determining the ranking, automatically selecting the subset, and automatically generating the annotated version are performed by one or more computing devices.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 27, 28, 29)
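The annotation step recited in claim 1 — selecting fewer than all identified entities and adding a control that was not previously in the document — can be sketched as follows. This is an illustrative embodiment only; the entity names, the `span` markup, and the `annotate_document` helper are hypothetical:

```python
# Hypothetical sketch of claim 1's annotation step: after ranking,
# fewer than all identified entities are selected, and a new control
# for showing additional information is added for each one.

def annotate_document(text, ranked_entities, k):
    """Wrap the top-k entity mentions in a control element that was
    not present in the original document. ranked_entities is a list
    of (name, additional_info) pairs, best first."""
    assert k < len(ranked_entities)  # subset is fewer than all entities
    for name, info in ranked_entities[:k]:
        # The control carries the additional information (here, as a
        # title attribute shown on hover).
        control = f'<span class="entity" title="{info}">{name}</span>'
        text = text.replace(name, control, 1)
    return text
```

For example, with `k = 1` only the highest-ranked entity is wrapped, and lower-ranked mentions are left untouched.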
11. A method for annotating a document, the method comprising:
generating a weight for a particular feature of entities, wherein the weight indicates how well the particular feature predicts whether annotations associated with the entities will be used;
identifying a set of entities within the document;
generating a first set of scores by generating, for each entity in the set, a score for said particular feature;
generating a second set of scores based at least in part on said first set of scores and said weight;
establishing a ranking of entities in the set of entities based, at least in part, on the second set of scores;
based at least in part on the ranking, automatically selecting a subset of the set of entities for annotation, wherein the subset includes fewer than all of the identified entities;
automatically generating an annotated version of the document by, for each entity in the subset, adding to the document a control for displaying additional information about the entity, wherein the additional information about the entity and the control associated with the entity were not in the document before the step of automatically generating the annotated version of the document;
wherein at least the steps of generating the weight, generating the second set of scores, automatically selecting the subset, and automatically generating the annotated version are performed by one or more computing devices.
- View Dependent Claims (12)
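Claim 11's two score sets — a first set of raw scores for the particular feature, and a second set derived by applying the learned weight to the first — can be sketched as below. The feature names and helper functions are hypothetical, and entities are represented simply as feature-to-score dictionaries:

```python
# Hypothetical sketch of claim 11's scoring: a first set of raw
# per-entity scores for one feature, then a second set produced by
# applying the feature's learned weight, then a ranking on the second.

def first_scores(entities, feature):
    """First set: each entity's raw score for the particular feature."""
    return [e.get(feature, 0.0) for e in entities]

def second_scores(first, weight):
    """Second set: the raw scores combined with the feature's weight
    (here, by simple scaling)."""
    return [weight * s for s in first]

def rank_by_second(entities, feature, weight):
    """Establish a ranking of the entities based on the second set."""
    second = second_scores(first_scores(entities, feature), weight)
    paired = sorted(zip(second, range(len(entities))), reverse=True)
    return [entities[i] for _, i in paired]
```

Scaling is only one way the second set could be "based at least in part on" the first set and the weight; the claim language admits other combinations.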
13. One or more non-transitory computer-readable storage media storing instructions, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
collecting usage data that indicates how frequently users interact with annotations for entities that are referenced in documents presented to the users;
based at least in part on the usage data, generating weights for features that are associated with the entities referenced in the documents;
wherein a particular weight of a particular feature is based at least in part on how frequently users interact with annotations of entities having the particular feature;
identifying a set of identified entities within a document;
determining a ranking for the identified entities that belong to said set of identified entities based, at least in part, on (a) feature scores for each of the identified entities, wherein the feature scores correspond to features associated with the identified entities, wherein the particular feature is associated with at least one of the identified entities; and (b) weights, including the particular weight, for the features that are associated with the identified entities;
based at least in part on the ranking, automatically selecting a subset of the identified entities for annotation, wherein the subset includes fewer than all of the identified entities;
automatically generating an annotated version of the document by, for each entity in the subset, adding to the document a control for displaying additional information about the entity, wherein the additional information about the entity and the control associated with the entity were not in the document before the step of automatically generating the annotated version of the document.
- View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21, 22, 26, 30, 31, 32)
23. One or more non-transitory computer-readable storage media storing instructions, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
generating a weight for a particular feature of entities, wherein the weight indicates how well the particular feature predicts whether annotations associated with the entities will be used;
identifying a set of entities within a document;
generating a first set of scores by generating, for each entity in the set, a score for said particular feature;
generating a second set of scores based at least in part on said first set of scores and said weight;
establishing a ranking of the entities in the set of entities based, at least in part, on the second set of scores;
based at least in part on the ranking, automatically selecting a subset of the set of entities for annotation, wherein the subset includes fewer than all of the identified entities;
automatically generating an annotated version of the document by, for each entity in the subset, adding to the document a control for displaying additional information about the entity, wherein the additional information about the entity and the control associated with the entity were not in the document before the step of automatically generating the annotated version of the document.
- View Dependent Claims (24)
Specification