Disambiguation and Tagging of Entities
Abstract
Tagging of content items and entities identified therein may include a matching process, a classification process, and a disambiguation process. Matching may include identifying potentially matching candidate entities in a content item, whereas the classification process may categorize or group the identified candidate entities according to the known entities they are likely to match. In some instances, a candidate entity may be categorized with multiple known entities. Accordingly, a disambiguation process may be used to reduce the potential matches to a single known entity. In one example, the disambiguation process may include ranking potentially matching known entities according to a hierarchy of criteria.
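The three stages named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `KnownEntity` fields, the tiny `KNOWN` catalog, the whitespace tokenizer, and the two ranking criteria (agreement with a context hint, then popularity) are hypothetical and are not taken from the patent. The hierarchy of criteria is modeled as a tuple sort key, so a later criterion is consulted only when the earlier ones tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownEntity:
    name: str
    kind: str        # hypothetical attribute, e.g. "person" or "place"
    popularity: int  # hypothetical tie-breaking score


# Hypothetical catalog of known entities; two share the same surface name.
KNOWN = [
    KnownEntity("Paris", "place", 90),
    KnownEntity("Paris", "person", 40),
    KnownEntity("Texas", "place", 70),
]


def match(text):
    """Matching: find tokens that could refer to any known entity."""
    names = {e.name for e in KNOWN}
    tokens = [tok.strip(".,") for tok in text.split()]
    return [tok for tok in tokens if tok in names]


def classify(candidate):
    """Classification: group a candidate with every known entity it may match."""
    return [e for e in KNOWN if e.name == candidate]


def disambiguate(options, context_kind):
    """Disambiguation: rank by a hierarchy of criteria and keep the top entity.

    Criterion 1 (assumed): agreement with the kind suggested by context.
    Criterion 2 (assumed): popularity, consulted only to break ties.
    """
    return max(options, key=lambda e: (e.kind == context_kind, e.popularity))


def tag(text, context_kind):
    """Run matching, classification and, where needed, disambiguation."""
    tags = {}
    for cand in match(text):
        options = classify(cand)
        if len(options) == 1:
            tags[cand] = options[0]
        else:
            tags[cand] = disambiguate(options, context_kind)
    return tags


result = tag("Paris visited Texas.", context_kind="person")
```

With the context hint set to "person", the ambiguous candidate "Paris" resolves to the person entry, while "Texas" has a single match and skips disambiguation entirely.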
35 Claims
1. A method comprising:
identifying, at a processor, a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in a memory;
performing a first categorization of the candidate entity based on a first set of factors, wherein the first categorization generates one or more decisions regarding at least one other candidate entity following the candidate entity in the content item;
performing a second categorization of the candidate entity, after the first categorization, based on a second set of factors, wherein the second set of factors is different from the first set of factors and includes the one or more decisions regarding the at least one other candidate entity;
determining, after the second categorization, that the candidate entity is categorized with a plurality of known entities;
in response to determining that the candidate entity is categorized with the plurality of known entities, disambiguating the candidate entity; and
tagging the candidate entity based on the disambiguation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13

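The two-pass structure recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the `ALIASES` table, the rule that an unambiguous later mention narrows an earlier one, and the placeholder disambiguation rule (alphabetically first identifier) are all hypothetical and are not described in the claim.

```python
# Hypothetical alias table: surface form -> ids of known entities it may denote.
ALIASES = {
    "Jordan": {"michael_jordan", "jordan_country"},
    "Michael Jordan": {"michael_jordan"},
    "Amman": {"amman_city"},
}


def first_pass(candidates):
    """First categorization: uses only each candidate's own surface form.

    Returns one decision per candidate, namely the set of known entities
    it is categorized with at this stage.
    """
    return [set(ALIASES.get(c, ())) for c in candidates]


def second_pass(index, decisions):
    """Second categorization: additionally uses the decisions made for
    candidates that FOLLOW position `index` in the content item."""
    options = set(decisions[index])
    for later in decisions[index + 1:]:
        # Assumed factor: an unambiguous later mention narrows this one.
        if len(later) == 1 and later <= options:
            options = set(later)
            break
    return options


def tag_all(candidates):
    """Categorize twice, disambiguate if still ambiguous, then tag."""
    decisions = first_pass(candidates)
    tags = []
    for i, cand in enumerate(candidates):
        options = second_pass(i, decisions)
        if len(options) > 1:
            # Still categorized with several known entities: disambiguate.
            # Placeholder rule (assumed): pick the alphabetically first id.
            options = {min(options)}
        tags.append((cand, next(iter(options), None)))
    return tags


tags = tag_all(["Jordan", "Michael Jordan"])
```

Here the earlier, ambiguous "Jordan" is narrowed by the decision already recorded for the later "Michael Jordan". When no later mention helps, as in `tag_all(["Jordan", "Amman"])`, the candidate remains categorized with two known entities after the second pass and falls through to the disambiguation step.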
14. A method comprising:
identifying, at a processor, a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in a memory;
categorizing the candidate entity based on one or more decisions made regarding at least one other candidate entity preceding the candidate entity in the content item;
determining that the candidate entity is categorized with a plurality of known entities including the first known entity and a second known entity;
ranking the first known entity and the second known entity based on a first match reliability criterion;
determining that the first known entity is ranked lower than the second known entity; and
removing the first known entity as a potential match with the candidate entity.
Dependent claims: 15, 16, 17, 18

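The rank-and-remove step of claim 14 can be sketched in a few lines. The numeric `reliability` score, the bonus applied to entities already tagged earlier in the content item, and the `ALIAS_TABLE` contents are hypothetical choices made for this example; the claim does not specify how the match reliability criterion is computed.

```python
from typing import NamedTuple


class Option(NamedTuple):
    entity_id: str
    reliability: float  # hypothetical score for the match reliability criterion


def categorize(surface, alias_table, preceding_tags):
    """Categorize a candidate, using decisions made for preceding candidates.

    Assumed factor: an entity already tagged earlier in the content item
    receives a small reliability bonus.
    """
    return [
        Option(eid, score + (0.2 if eid in preceding_tags else 0.0))
        for eid, score in alias_table.get(surface, {}).items()
    ]


def prune_lowest(options):
    """Rank options by reliability and remove the lowest-ranked one."""
    if len(options) < 2:
        return list(options)
    ranked = sorted(options, key=lambda o: o.reliability, reverse=True)
    return ranked[:-1]


# Hypothetical base scores for one ambiguous surface form.
ALIAS_TABLE = {"Jordan": {"jordan_country": 0.6, "michael_jordan": 0.5}}

options = categorize("Jordan", ALIAS_TABLE, preceding_tags={"michael_jordan"})
remaining = prune_lowest(options)
```

Because "michael_jordan" was tagged earlier in this example, its score overtakes that of "jordan_country", which is then removed as a potential match. With no preceding tags, the ranking reverses and "michael_jordan" is the one removed.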
19. An apparatus comprising:
a processor; and
memory storing computer readable instructions that, when executed, cause the apparatus to:
identify a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in the memory;
perform a first categorization of the candidate entity based on a first set of factors, wherein the first categorization generates one or more decisions regarding at least one other candidate entity following the candidate entity in the content item;
perform a second categorization of the candidate entity, after the first categorization, based on a second set of factors, wherein the second set of factors is different from the first set of factors and includes the one or more decisions regarding the at least one other candidate entity;
determine, after the second categorization, that the candidate entity is categorized with a plurality of known entities;
in response to determining that the candidate entity is categorized with the plurality of known entities, disambiguate the candidate entity; and
tag the candidate entity based on the disambiguation.
Dependent claims: 20, 21, 22, 23, 24

25. An apparatus comprising:
a processor; and
memory storing computer readable instructions that, when executed, cause the apparatus to:
identify a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in the memory;
categorize the candidate entity based on one or more decisions made regarding at least one other candidate entity preceding the candidate entity in the content item;
determine that the candidate entity is categorized with a plurality of known entities including the first known entity and a second known entity;
rank the first known entity and the second known entity based on a first match reliability criterion;
determine that the first known entity is ranked lower than the second known entity; and
remove the first known entity as a potential match with the candidate entity.
Dependent claims: 26, 27, 28, 29

30. One or more computer readable media storing computer readable instructions that, when executed, cause an apparatus to:
identify, at the apparatus, a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in a memory;
perform a first categorization of the candidate entity based on a first set of factors, wherein the first categorization generates one or more decisions regarding at least one other candidate entity following the candidate entity in the content item;
perform a second categorization of the candidate entity, after the first categorization, based on a second set of factors, wherein the second set of factors is different from the first set of factors and includes the one or more decisions regarding the at least one other candidate entity;
determine, after the second categorization, that the candidate entity is categorized with a plurality of known entities;
in response to determining that the candidate entity is categorized with the plurality of known entities, disambiguate the candidate entity; and
tag the candidate entity based on the disambiguation.
Dependent claims: 31, 32

33. One or more computer readable media storing computer readable instructions that, when executed, cause an apparatus to:
identify a candidate entity in a content item, wherein the candidate entity is a potential match with a first known entity identified in the memory;
categorize the candidate entity based on one or more decisions made regarding at least one other candidate entity preceding the candidate entity in the content item;
determine that the candidate entity is categorized with a plurality of known entities including the first known entity and a second known entity;
rank the first known entity and the second known entity based on a first match reliability criterion;
determine that the first known entity is ranked lower than the second known entity; and
remove the first known entity as a potential match with the candidate entity.
Dependent claims: 34, 35

Specification