Systems and methods for identifying entity mentions referencing a same real-world entity
First Claim
1. A method for identifying entity mentions referencing a same real-world entity, the method including:
selecting one or more core entity attributes that represent a real-world entity as a first search attribute set for use in searching biographical sources, including in the selection applying one or more probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generating one or more searches for processing by a plurality of biographical sources using the first search attribute set;
electronically receiving, responsive to the first search attribute set, entity reflections that include supplemental entity attributes for the real-world entity;
combining the core and supplemental attributes in an anchor entity candidate data object with extended entity attributes that represent the real-world entity;
selecting one or more extended entity attributes as a second search attribute set for use in searching web sources, including applying one or more further probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generating one or more further web searches using the second search attribute set;
electronically receiving, responsive to the second search attribute set, more entity reflections that include meta entity attributes for the real-world entity; and
updating the anchor entity candidate to include one or more of the meta entity attributes.
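The cohort-size estimate in the selecting steps can be illustrated with a minimal sketch. The claim does not say which distribution functions are used or how attribute sets are ranked, so everything here is an assumption made for illustration: the function names, the `marginals` mapping (the probability that a random entity shares the anchor's value for each attribute), the independence assumption used when no joint probability is supplied, and the rule of picking the smallest attribute set whose expected cohort fits under `max_cohort`.

```python
from itertools import combinations
from math import prod


def estimate_cohort_size(attrs, marginals, population, joint=None):
    """Expected number of entities that share every attribute value in `attrs`.

    Hypothetical helper: uses a supplied joint probability when one exists
    for this exact attribute set, otherwise multiplies the marginals
    (an independence assumption, not something the claim specifies).
    """
    key = frozenset(attrs)
    if joint and key in joint:
        p = joint[key]
    else:
        p = prod(marginals[a] for a in attrs)
    return population * p


def select_search_attributes(marginals, population, max_cohort, joint=None):
    """Return (attribute set, estimated cohort) for the smallest set that
    is expected to match at most `max_cohort` entities."""
    names = sorted(marginals)
    for k in range(1, len(names) + 1):
        # Among all sets of size k, take the one with the smallest estimate.
        size, best = min(
            (estimate_cohort_size(c, marginals, population, joint), c)
            for c in combinations(names, k)
        )
        if size <= max_cohort:
            return set(best), size
    # No set is selective enough: fall back to using every attribute.
    return set(names), estimate_cohort_size(names, marginals, population, joint)


# Illustrative numbers only; none of these values come from the patent.
marginals = {"first_name": 0.01, "last_name": 0.002, "city": 0.005}
chosen, cohort = select_search_attributes(marginals, 1_000_000, max_cohort=25)
```

With these made-up frequencies no single attribute is selective enough, so the sketch settles on a pair. Supplying a `joint` entry for a correlated pair (for example, a surname concentrated in one city) raises that pair's estimate and can change which set is chosen.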
Abstract
The technology disclosed relates to identifying entity reflections that refer to a same real-world entity. In particular, it relates to using statistical functions to make probabilistic deductions about entity attributes, which are used to construct optimal combinations of entity attributes. These optimal combinations of entity attributes are further used to generate search queries that return more precise search results with greater recall.
25 Claims
1. A method for identifying entity mentions referencing a same real-world entity, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6
7. A method for connecting entity reflections to real-world entities in an ambiguous environment, the method including:
selecting one or more core entity attributes that represent a real-world entity as a first search attribute set for use in searching biographical sources, including in the selection applying one or more probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generating one or more searches for processing by a plurality of biographical sources using the first search attribute set;
electronically receiving, responsive to the first search attribute set, entity reflections that include supplemental entity attributes for the real-world entity;
calculating attribute scores for supplemental attributes using a probability contribution function, wherein the attribute scores specify a quantitative assessment of similarity between the supplemental attributes and the core attributes;
merging supplemental attributes with attribute scores above a predefined threshold with core attributes in an anchor entity candidate data object with extended entity attributes that represent the real-world entity;
selecting one or more extended entity attributes as a second search attribute set for use in searching web sources, including applying one or more further probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generating one or more further web searches using the second search attribute set;
electronically receiving, responsive to the second search attribute set, more entity reflections that include meta entity attributes for the real-world entity;
calculating attribute scores for meta entity attributes using a probability contribution function, wherein the attribute scores specify a quantitative assessment of similarity between the meta entity attributes and the extended entity attributes; and
updating the anchor entity candidate to include one or more of the meta entity attributes with attribute scores above the predefined threshold.
Dependent claims: 8, 9, 10, 11, 12, 13, 14
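The score-and-threshold steps of claim 7 can be sketched as follows. The claim names a "probability contribution function" but does not define it here, so this sketch swaps in a plain string-similarity ratio from Python's `difflib` as a stand-in. The function names, the dictionary layout, and the choice to score a whole reflection on the attributes it shares with the anchor (and to let its new attributes inherit that score) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT the patent's probability
    contribution function, just a case-insensitive string ratio."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def score_reflection(anchor, reflection):
    """Average similarity over the attributes the reflection shares with
    the anchor; 0.0 when nothing overlaps (hypothetical scoring rule)."""
    shared = [k for k in reflection if k in anchor]
    if not shared:
        return 0.0
    return sum(similarity(anchor[k], reflection[k]) for k in shared) / len(shared)


def merge_reflections(anchor, reflections, threshold):
    """Return a copy of `anchor` extended with new attributes taken from
    reflections that score above `threshold`."""
    extended = dict(anchor)
    for reflection in reflections:
        if score_reflection(anchor, reflection) > threshold:
            for key, value in reflection.items():
                # Existing anchor values are kept; only new keys are added.
                extended.setdefault(key, value)
    return extended


# Illustrative records only; none of these values come from the patent.
anchor = {"name": "Jane Q. Doe", "city": "Austin"}
reflections = [
    {"name": "Jane Doe", "city": "Austin", "employer": "Acme"},
    {"name": "John Smith", "city": "Boston", "employer": "Globex"},
]
extended = merge_reflections(anchor, reflections, threshold=0.8)
```

In this example only the first reflection clears the threshold, so its `employer` value is merged while the anchor's own `name` and `city` are left unchanged. The same routine could be reused for the meta entity attributes by passing the extended anchor in place of the core attributes.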
15. A computer system for identifying entity mentions referencing a same real-world entity, the system including:
a processor and a computer readable storage medium storing computer instructions configured to cause the processor to:
select one or more core entity attributes that represent a real-world entity as a first search attribute set for use in searching biographical sources, including in the selection applying one or more probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generate one or more searches for processing by a plurality of biographical sources using the first search attribute set;
electronically receive, responsive to the first search attribute set, entity reflections that include supplemental entity attributes for the real-world entity;
combine the core and supplemental attributes in an anchor entity candidate data object with extended entity attributes that represent the real-world entity;
select one or more extended entity attributes as a second search attribute set for use in searching web sources, including applying one or more further probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generate one or more further web searches using the second search attribute set;
electronically receive, responsive to the second search attribute set, more entity reflections that include meta entity attributes for the real-world entity; and
update the anchor entity candidate to include one or more of the meta entity attributes.
Dependent claims: 16, 17, 18, 19, 20
21. A computer system for connecting entity reflections to real-world entities in an ambiguous environment, the system including:
a processor and a computer readable storage medium storing computer instructions configured to cause the processor to:
select one or more core entity attributes that represent a real-world entity as a first search attribute set for use in searching biographical sources, including in the selection applying one or more probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generate one or more searches for processing by a plurality of biographical sources using the first search attribute set;
electronically receive, responsive to the first search attribute set, entity reflections that include supplemental entity attributes for the real-world entity;
calculate attribute scores for supplemental attributes using a probability contribution function, wherein the attribute scores specify a quantitative assessment of similarity between the supplemental attributes and the core attributes;
merge supplemental attributes with attribute scores above a predefined threshold with core attributes in an anchor entity candidate data object with extended entity attributes that represent the real-world entity;
select one or more extended entity attributes as a second search attribute set for use in searching web sources, including applying one or more further probability distribution functions or joint probability distribution functions to estimate resulting cohort size;
generate one or more further web searches using the second search attribute set;
electronically receive, responsive to the second search attribute set, more entity reflections that include meta entity attributes for the real-world entity;
calculate attribute scores for meta entity attributes using a probability contribution function, wherein the attribute scores specify a quantitative assessment of similarity between the meta entity attributes and the extended entity attributes; and
update the anchor entity candidate to include one or more of the meta entity attributes with attribute scores above the predefined threshold.
Dependent claims: 22, 23, 24, 25
Specification