Person disambiguation using name entity extraction-based clustering
First Claim
1. In a computing environment, a method comprising, disambiguating person data located from one or more sets of search results, including extracting information about a person based on name entity extraction, and calculating similarity data.
2 Assignments
0 Petitions
Abstract
Described is a technology for disambiguating data corresponding to persons that are located from search results, so that different persons having the same name can be clearly distinguished. Name entity extraction locates words (terms) that are within a certain distance of persons' names in the search results. The terms are used in disambiguating search results that correspond to different persons having the same name, such as location information, organization information, career information, and/or partner information. In one example, each person is represented as a vector, and similarity among vectors is calculated based on weighting that corresponds to nearness of the terms to a person, and/or the types of terms. Based on the similarity data, the person vectors that represent the same person are then merged into one cluster, so that each cluster represents (to a high probability) only one distinct person.
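The vector, similarity, and clustering steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `TYPE_WEIGHT` values, the `1 / (1 + distance)` nearness weighting, the use of cosine similarity, the 0.5 threshold, and the single-link merging strategy. The function names `person_vector`, `cosine`, and `cluster` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical per-type weights; the abstract only says weighting may depend on term type.
TYPE_WEIGHT = {"location": 1.0, "organization": 1.5, "career": 1.2, "partner": 2.0}


def person_vector(terms):
    """Build a sparse vector from (term, term_type, distance_in_words) tuples."""
    vec = defaultdict(float)
    for term, term_type, distance in terms:
        # Assumed weighting: terms nearer to the name count more, scaled by type.
        vec[term] += TYPE_WEIGHT.get(term_type, 1.0) / (1.0 + distance)
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse vectors (an assumed similarity measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold=0.5):
    """Merge vectors whose similarity meets the threshold (single-link, union-find)."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for i in range(len(vectors)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Three mentions of the same name; the first two share context terms.
v0 = person_vector([("seattle", "location", 2), ("microsoft", "organization", 1)])
v1 = person_vector([("microsoft", "organization", 3), ("seattle", "location", 1)])
v2 = person_vector([("boston", "location", 1), ("harvard", "organization", 2)])
print(cluster([v0, v1, v2]))  # indices of mentions grouped per presumed person
```

In this sketch each returned group lists the indices of the mentions that are presumed to refer to one person. A production system would likely tune the weights and threshold against labeled data rather than fix them by hand.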
44 Citations
20 Claims
- 1. In a computing environment, a method comprising, disambiguating person data located from one or more sets of search results, including extracting information about a person based on name entity extraction, and calculating similarity data.
- 15. A computer-readable medium having computer executable instructions, which when executed perform steps, comprising, disambiguating person data located from one or more sets of search results, including extracting information about a person based on name entity extraction and computing vectors for that person, calculating similarity data from each vector, and clustering person vectors that are similar into clusters based on the similarity data.
- 18. In a computing environment, a system comprising: an extraction mechanism that determines terms within a distance of a person's name within search results; and a person disambiguation mechanism, coupled to the extraction mechanism, that uses data based on the terms to cluster data for each common person into a cluster, such that each cluster represents a distinct person.
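The extraction mechanism of claim 18 determines terms within a distance of a person's name. A minimal Python sketch of that idea follows. It is only an illustration under stated assumptions: the distance is measured in word tokens, tokenization is a simple regular expression, the name is matched as an exact lowercase token sequence, and no entity typing or stop-word filtering is performed. The function name `terms_near_name` and the `max_distance` parameter are hypothetical.

```python
import re


def terms_near_name(text, name, max_distance=5):
    """Return {term: smallest distance in tokens} for terms near each name occurrence."""
    tokens = re.findall(r"\w+", text.lower())
    name_tokens = name.lower().split()
    n = len(name_tokens)
    found = {}
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != name_tokens:
            continue
        end = start + n - 1  # index of the last token of the name
        lo = max(0, start - max_distance)
        hi = min(len(tokens), end + max_distance + 1)
        for i in range(lo, hi):
            if start <= i <= end:
                continue  # skip the name itself
            distance = start - i if i < start else i - end
            term = tokens[i]
            # Keep the nearest occurrence of each term.
            if term not in found or distance < found[term]:
                found[term] = distance
    return found


snippet = "Engineer John Smith works at Contoso in Seattle."
print(terms_near_name(snippet, "John Smith", max_distance=2))
```

The resulting term-to-distance map could feed a weighting step such as one that favors nearer terms. A real system would presumably also classify each term by type (location, organization, and so on), which this sketch leaves out.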
Specification