Method and system for identifying entities
Abstract
Some embodiments provide a program that identifies an entity having an entity attribute specified in a document. The program receives, from each of several methods, a set of candidate identity attributes, each of which is for identifying a particular entity having the entity attribute specified in the document. Each of the methods generates its set of candidate identity attributes based on the entity attribute specified in the document. The program calculates a score for each candidate identity attribute in the sets of candidate identity attributes. Based on these scores, the program identifies, from the sets of candidate identity attributes, an identity attribute that identifies the entity having the entity attribute specified in the document.
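The score-and-select flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `tokens`, `score`, and `pick_identity_attribute`, the token-overlap (Jaccard) score, and the sample data are all assumptions made for illustration, since the abstract does not say how the score is computed.

```python
import re


def tokens(text):
    """Lower-case alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(entity_attribute, candidate_attribute):
    """Hypothetical score: Jaccard overlap between the two token sets."""
    a, b = tokens(entity_attribute), tokens(candidate_attribute)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_identity_attribute(entity_attribute, candidate_sets):
    """Score every candidate from every method and return the best one.

    candidate_sets holds one list of candidate identity attributes per
    method. Returns None when no method proposed any candidate.
    """
    scored = [
        (score(entity_attribute, candidate), candidate)
        for candidate_set in candidate_sets
        for candidate in candidate_set
    ]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


# Two hypothetical methods, each proposing its own candidate set.
method_a = ["acme.com", "acme-tools.example"]
method_b = ["Acme Corporation, Delaware", "Acme Bakery"]
best = pick_identity_attribute("Acme Corporation", [method_a, method_b])
print(best)
```

Pooling the candidates from all methods before scoring means no single method's output is trusted outright; whichever candidate scores highest against the entity attribute is selected, regardless of which method proposed it.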
20 Claims
1. A non-transitory machine readable medium storing a program which when executed by at least one processing unit identifies a set of identity attributes for determining the identity of an entity, the program comprising sets of instructions for:
identifying a particular name that occurs more often than other names in a set of documents;
identifying a plurality of candidate identity attribute sets by analyzing the particular name and at least one document in the set of documents using a plurality of different processes that each identifies (i) a set of candidate identities corresponding to the particular name and (ii) a candidate identity attribute set for each identified candidate identity, wherein at least one of the different processes analyzes a stored plurality of identities to identify candidate identities having the particular name and that are related to an entity to which the at least one document is also related;
for each candidate identity attribute set of the plurality of candidate identity attribute sets, calculating a relevance score for each candidate identity attribute in the set that measures a level of correspondence between the particular name and the candidate identity attribute; and
identifying, based on the relevance scores calculated for the candidate identity attributes of the different candidate identity attribute sets, a particular candidate identity attribute set for a particular identity that corresponds to the particular name.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for identifying a set of identity attributes for determining the identity of an entity, the method comprising:
identifying a particular name that occurs more often than other names in a set of documents;
identifying a plurality of candidate identity attribute sets by analyzing the particular name and at least one document in the set of documents using a plurality of different processes that each identifies (i) a set of candidate identities corresponding to the particular name and (ii) a candidate identity attribute set for each identified candidate identity, wherein at least one of the different processes analyzes a stored plurality of identities to identify candidate identities having the particular name and that are related to an entity to which the at least one document is also related;
for each candidate identity attribute set of the plurality of candidate identity attribute sets, calculating a relevance score for each candidate identity attribute in the set that measures a level of correspondence between the particular name and the candidate identity attribute; and
identifying, based on the relevance scores calculated for the candidate identity attributes of the different candidate identity attribute sets, a particular candidate identity attribute set for a particular identity that corresponds to the particular name.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A device comprising:
a set of processing units; and
a non-transitory machine readable medium storing a program which when executed by the set of processing units determines the identity of an entity, the program comprising sets of instructions for:
identifying a particular name that occurs more often than other names in a set of documents;
identifying a plurality of candidate identity attribute sets by analyzing the particular name and at least one document in the set of documents using a plurality of different processes that each identifies (i) a set of candidate identities corresponding to the particular name and (ii) a candidate identity attribute set for each identified candidate identity, wherein at least one of the different processes analyzes a stored plurality of identities to identify candidate identities having the particular name and that are related to an entity to which the at least one document is also related;
for each candidate identity attribute set of the plurality of candidate identity attribute sets, calculating a relevance score for each candidate identity attribute in the set that measures a level of correspondence between the particular name and the candidate identity attribute; and
identifying, based on the relevance scores calculated for the candidate identity attributes of the different candidate identity attribute sets, a particular candidate identity attribute set for a particular identity that corresponds to the particular name.
Dependent claims: 17, 18, 19, 20
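The first two steps recited in the independent claims (finding the name that occurs most often, then having one process look up stored identities that carry that name and are related to the same entity as the document) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Identity` record, the function names, the exact-match name comparison, and the assumption that names have already been extracted from each document are all hypothetical. Relevance scoring and final selection are left out.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Hypothetical stored identity record."""
    name: str
    related_entities: frozenset          # entities this identity is related to
    attributes: dict = field(default_factory=dict)  # candidate identity attribute set


def most_frequent_name(documents):
    """Return the name occurring most often across all documents.

    Each document is assumed to be a list of already-extracted names.
    Returns None if no names are present.
    """
    counts = Counter(name for doc in documents for name in doc)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def stored_identity_process(name, document_entity, store):
    """One candidate-generating process: keep stored identities that have
    the given name and are related to the entity the document relates to."""
    return [
        identity
        for identity in store
        if identity.name == name and document_entity in identity.related_entities
    ]


# Hypothetical data: names per document and a small identity store.
documents = [["Jane Doe", "John Roe"], ["Jane Doe"], ["Jane Doe", "Ann Poe"]]
store = [
    Identity("Jane Doe", frozenset({"Acme"}), {"title": "CFO"}),
    Identity("Jane Doe", frozenset({"Globex"}), {"title": "Engineer"}),
    Identity("John Roe", frozenset({"Acme"}), {"title": "Counsel"}),
]

name = most_frequent_name(documents)
candidates = stored_identity_process(name, "Acme", store)
print(name, [c.attributes for c in candidates])
```

In this sketch the relationship filter is what separates two stored identities that share a name: only the one tied to the document's entity survives as a candidate.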
Specification