Providing an explanation of a missing fact estimate
Abstract
Systems and methods are disclosed for providing an explanation of an estimate for information missing from a data graph. An example method may include receiving a query that requests information for a first entity and receiving an estimate for the information, the estimate being based on a plurality of features of a joint distribution model. The method may include determining respective contribution scores for the plurality of features, selecting a quantity of the features with the highest contribution scores, generating, using the selected quantity of features, an explanation for the estimate, and providing the explanation and the estimate as part of a search result for the query.
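As a rough illustration of the flow the abstract describes, the following Python sketch ranks features by contribution score, keeps a fixed quantity of the highest-scoring ones, and builds a short text explanation from them. The function names (`select_top_features`, `build_explanation`), the feature names, the scores, and the estimate are all hypothetical; the text above does not say how contribution scores are computed or how the explanation is worded.

```python
from typing import Dict, List, Tuple


def select_top_features(contributions: Dict[str, float], quantity: int) -> List[Tuple[str, float]]:
    """Return the `quantity` features with the highest contribution scores."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:quantity]


def build_explanation(estimate: str, top_features: List[Tuple[str, float]]) -> str:
    """Format the estimate together with the names of the selected features."""
    names = ", ".join(name for name, _ in top_features)
    return f"Estimated value: {estimate} (based on: {names})"


# Hypothetical contribution scores for one estimate (illustrative values only).
contributions = {
    "birth_year": 0.55,
    "spouse_birth_year": 0.30,
    "occupation": 0.10,
    "nationality": 0.05,
}

top = select_top_features(contributions, quantity=2)
explanation = build_explanation("about 1890", top)
print(explanation)
```

The quantity of features is passed in as a parameter here because the abstract only says that "a quantity" is selected, without fixing a number.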
18 Claims
1. A method comprising:
determining, using at least one processor, that information for an entity is absent from a data graph;
determining, using the at least one processor, an estimate for the information based on a plurality of features from a joint distribution model related to the information;
adding the estimate to the data graph so that the estimate is linked to the entity via a relationship indicating that the estimate is not verified;
selecting a subset of the plurality of features, wherein the subset is a first subset and selecting the subset includes:
determining a contribution value for each of the plurality of features;
determining that a second subset of the features are related based on clustering the entity for which the information is missing with other entities having a similar feature;
aggregating the features in the second subset to generate an aggregate feature;
aggregating the contribution values for the features in the second subset to generate a new contribution score; and
selecting the second subset as the first subset;
for each feature in the subset of the plurality of features:
adding the feature in the data graph, and linking the feature to the estimate;
receiving, using the at least one processor, a query that requests the information for the entity;
generating an explanation based on the subset of features linked to the estimate in the data graph, wherein the explanation and the estimate are based on the aggregate feature and the new contribution score; and
providing the explanation and the estimate as part of a search result for the query.
Dependent claims: 2, 3, 4, 5, 6
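The steps of claim 1 can be loosely sketched in Python as follows. This is a toy illustration under several assumptions, not the claimed implementation: the `DataGraph` class, the relationship names (`death_year_unverified`, `explained_by`), the aggregate feature name `children_birth_years`, and all values are hypothetical. The set of related features is supplied directly instead of being derived by clustering, and the contribution values are combined by summation, which the claim does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class DataGraph:
    """Toy data graph stored as (source, relationship, target) triples."""
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_edge(self, source: str, relationship: str, target: str) -> None:
        self.edges.append((source, relationship, target))

    def targets(self, source: str, relationship: str) -> List[str]:
        return [t for s, r, t in self.edges if s == source and r == relationship]


def aggregate_score(contributions: Dict[str, float], related: Set[str]) -> float:
    """Combine contribution values of related features (summing is an assumption)."""
    return sum(contributions[name] for name in related)


graph = DataGraph()
graph.add_edge("person_a", "birth_year", "1820")

entity, wanted = "person_a", "death_year"
contributions = {"child_1_birth_year": 0.25, "child_2_birth_year": 0.25, "birth_year": 0.5}
# Assumed output of the clustering step: these features are treated as related.
related = {"child_1_birth_year", "child_2_birth_year"}

if not graph.targets(entity, wanted):  # the information is absent from the graph
    estimate_node = "estimate:1890"  # placeholder for the model's estimate
    # Link the estimate through a relationship that marks it as not verified.
    graph.add_edge(entity, wanted + "_unverified", estimate_node)
    # Add each selected feature to the graph and link it to the estimate.
    for feature in sorted(related):
        graph.add_edge(estimate_node, "explained_by", feature)

new_score = aggregate_score(contributions, related)

# Answering a later query: read the estimate and its linked features back.
estimates = graph.targets(entity, wanted + "_unverified")
linked = graph.targets(estimates[0], "explained_by")
explanation = f"Based on children_birth_years (score {new_score:.2f}): " + ", ".join(linked)
print(estimates[0], "|", explanation)
```

Storing the selected features as nodes linked to the estimate means the explanation can be rebuilt at query time from the graph alone, without rerunning the model.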
7. A computer program product tangibly embodied in a non-transitory storage medium, the computer program product including instructions that when executed cause a processor to perform operations including:
determining, using at least one processor, that information for an entity is absent from a data graph;
determining, using the at least one processor, an estimate for the information based on a plurality of features from a joint distribution model related to the information;
adding the estimate to the data graph so that the estimate is linked to the entity via a relationship indicating that the estimate is not verified;
selecting a subset of the plurality of features, wherein the subset is a first subset and selecting the subset includes:
determining a contribution value for each of the plurality of features;
determining that a second subset of the features are related based on clustering the entity for which the information is missing with other entities having a similar feature;
aggregating the features in the second subset to generate an aggregate feature;
aggregating the contribution values for the features in the second subset to generate a new contribution score; and
selecting the second subset as the first subset;
for each feature in the subset of the plurality of features:
adding the feature in the data graph, and linking the feature to the estimate;
receiving, using the at least one processor, a query that requests the information for the entity;
generating an explanation based on the subset of features linked to the estimate in the data graph, wherein the explanation and the estimate are based on the aggregate feature and the new contribution score; and
providing the explanation and the estimate as part of a search result for the query.
Dependent claims: 8, 9, 10, 11, 12
13. A system comprising:
a processor; and
a computer program product tangibly embodied in a non-transitory storage medium, the computer program product including instructions that when executed cause the processor to perform operations including:
determining, using at least one processor, that information for an entity is absent from a data graph;
determining, using the at least one processor, an estimate for the information based on a plurality of features from a joint distribution model related to the information;
adding the estimate to the data graph so that the estimate is linked to the entity via a relationship indicating that the estimate is not verified;
selecting a subset of the plurality of features, wherein the subset is a first subset and selecting the subset includes:
determining a contribution value for each of the plurality of features;
determining that a second subset of the features are related based on clustering the entity for which the information is missing with other entities having a similar feature;
aggregating the features in the second subset to generate an aggregate feature;
aggregating the contribution values for the features in the second subset to generate a new contribution score; and
selecting the second subset as the first subset;
for each feature in the subset of the plurality of features:
adding the feature in the data graph, and linking the feature to the estimate;
receiving, using the at least one processor, a query that requests the information for the entity;
generating an explanation based on the subset of features linked to the estimate in the data graph, wherein the explanation and the estimate are based on the aggregate feature and the new contribution score; and
providing the explanation and the estimate as part of a search result for the query.
Dependent claims: 14, 15, 16, 17, 18
Specification