Disambiguating authors in social media communications

  • US 9,524,526 B2
  • Filed: 04/17/2012
  • Issued: 12/20/2016
  • Est. Priority Date: 04/17/2012
  • Status: Active Grant
First Claim

1. A method for mapping authors across multiple social media forums, the method comprising:

  • creating a database that contains publicly observable information pertaining to multiple authors from multiple social media forums;

    generating a mapping between at least a first one of the authors from a first of the social media forums and at least a second one of the authors from a second of the social media forums in the database based on a comparison of structured information comprising one or more identification details associated with a given author, unstructured user generated content information comprising one or more portions of written content generated by the given author, and network information;

    refining the mapping by:

    comparing a friend list associated with the first author on the first social media forum and a friend list associated with the second author on the second social media forum to identify one or more overlapping friend list entries; and

    comparing content of authors from each mapping by extracting information from written author content on a given social media forum, matching written author content across the multiple social media forums, and assigning a discrete weight to each item of written author content, wherein the written author content comprises mention of a named entity, a person's name, a telephone number, an email address, a uniform resource locator (URL), a location, a noun, a synonym of the noun, and a spelling variant of the noun, wherein each discrete weight defines an amount of relevance that the corresponding item of written author content has in connection with a task of matching two authors, wherein a higher weight indicates a higher amount of relevance, and wherein mention of a person's name is assigned a higher weight than mention of a noun, and mention of a noun is assigned a higher weight than mention of a synonym of the noun;

    generating a score for the refined mapping between the first and the second authors by calculating:

    a weighted sum of the number of times the structured information, the unstructured user generated content information and the network information match between the first and the second authors, wherein calculating the weighted sum comprises applying relative weightage, pre-determined by a user, to each item of structure information, unstructured user generated content information and network information, and adjusting the applied relative weightage based upon a correspondence to an exact match of given items of information versus a synonym matching of given items of information, wherein an exact match of given items of information results in an increased relative weightage adjustment applied thereto, and wherein a synonym matching of the given items of information results in a decreased relative weightage adjustment applied thereto, and

    the number of identified overlapping friend list entries associated with the first and the second authors,

    wherein the relative weightage denotes the relative importance of each given item of information; and

    determining, based on said generated score, that the first and the second authors are the same person;

    wherein the steps are carried out by at least one computing device.
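
The scoring step recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Match`, `friend_overlap`, `mapping_score` and `same_person`, the adjustment factors (1.5 for an exact match, 0.5 for a synonym match), the overlap weight and the decision threshold are all assumptions introduced for illustration. The claim states only that an exact match increases the user-determined weightage, that a synonym match decreases it, and that the determination is based on the generated score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One matched item of information between two authors (hypothetical type)."""
    category: str  # "structured", "content", or "network"
    exact: bool    # True for an exact match, False for a synonym match


def friend_overlap(friends_a, friends_b):
    """Count the friend list entries that appear on both forums."""
    return len(set(friends_a) & set(friends_b))


def mapping_score(matches, category_weights, overlap_count,
                  exact_factor=1.5, synonym_factor=0.5, overlap_weight=1.0):
    """Weighted sum of matches plus a friend-overlap term.

    category_weights holds the user-chosen relative weightage per category.
    The factor values are assumptions: exact matches scale the weight up,
    synonym matches scale it down.
    """
    total = 0.0
    for m in matches:
        factor = exact_factor if m.exact else synonym_factor
        total += category_weights[m.category] * factor
    return total + overlap_weight * overlap_count


def same_person(score, threshold):
    """Assumed decision rule: a score at or above the threshold means a match."""
    return score >= threshold


# Example with made-up data for two author profiles.
weights = {"structured": 2.0, "content": 1.0, "network": 1.0}
matches = [
    Match("structured", exact=True),   # e.g. identical email address
    Match("content", exact=True),      # same noun mentioned on both forums
    Match("content", exact=False),     # synonym of a noun
]
overlap = friend_overlap(["ann", "bob", "cy"], ["bob", "cy", "dee"])
score = mapping_score(matches, weights, overlap)
print(score, same_person(score, threshold=5.0))
```

A fixed threshold is only one way to turn the score into a decision. The claim leaves that rule open, so a deployment could just as well rank candidate mappings and keep the highest-scoring one.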
