Holistic disambiguation for entity name spotting

  • US 8,856,119 B2
  • Filed: 02/27/2009
  • Issued: 10/07/2014
  • Est. Priority Date: 02/27/2009
  • Status: Expired due to Fees
First Claim

1. A computer implemented method for reducing ambiguities in entity name spotting, comprising:

  • performing an entity name spotting process in a data corpus;

    identifying, based on said entity name spotting process, an ambiguous entity name, said ambiguous entity name comprising an entity name corresponding to at least two categorical nodes in a predefined domain of an activation network, said activation network comprising a plurality of predefined categorical nodes, each predefined categorical node having an initial activation level, and each said predefined categorical node having edges established between said predefined categorical nodes, said edges indicating a relationship between said predefined categorical nodes, said relationship between said predefined categorical nodes defining a direction of influence between said predefined categorical nodes, said predefined domain representing at least one predefined category of items of interest within said data corpus;

    determining an updated activation level for each of said categorical nodes in said predefined domain, said updated activation level for each said predefined categorical node in said activation network being based on every edge between said predefined categorical nodes, each said updated activation level for each said predefined categorical node in said activation network being updated based on said relationship between said predefined categorical nodes, said updated activation level being based at least on information in said data corpus and context of said information, wherein for each categorical node, said determining said updated activation level comprising:

    receiving metadata related to entity names in said data corpus to modify said activation level of said categorical nodes in said predefined domain;

    analyzing said metadata to determine related categorical nodes to modify said activation level of said categorical nodes in said predefined domain;

    analyzing semantic information of user comments to modify said activation level of said categorical nodes in said predefined domain; and

    analyzing an ontology customized for said activation network to modify said activation level of said categorical nodes in said predefined domain, said entity names comprising one of proper names, credentials and identifications;

    selecting a most activated categorical node of said categorical nodes in said predefined domain, said most activated categorical node having a highest updated activation level of each categorical node;

    assigning said ambiguous entity name to said most activated categorical node; and

    outputting said most activated categorical node to a user to replace said ambiguous entity name.
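
The steps recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation. The class name `ActivationNetwork`, the methods `boost`, `spread` and `disambiguate`, the edge weights, the `decay` factor, and the "Jaguar" example data are all hypothetical choices made for illustration. The sketch assumes that evidence from metadata, user comments, and an ontology can each be reduced to a numeric boost on a categorical node, and that activation then flows along directed edges in the direction of influence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ActivationNetwork:
    """Hypothetical activation network: nodes with levels, directed weighted edges."""
    # categorical node name -> current activation level
    activation: Dict[str, float]
    # source node -> list of (target node, edge weight); direction = direction of influence
    edges: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def boost(self, node: str, amount: float) -> None:
        """Raise a node's activation based on external evidence (assumed numeric)."""
        self.activation[node] += amount

    def spread(self, rounds: int = 1, decay: float = 0.5) -> None:
        """Update every node from every directed edge, using the levels at round start."""
        for _ in range(rounds):
            incoming = {name: 0.0 for name in self.activation}
            for source, targets in self.edges.items():
                for target, weight in targets:
                    incoming[target] += decay * weight * self.activation[source]
            for name, extra in incoming.items():
                self.activation[name] += extra

    def disambiguate(self, candidates: List[str]) -> str:
        """Return the candidate node with the highest updated activation level."""
        return max(candidates, key=lambda name: self.activation[name])


# Every node starts with the same initial activation level (illustrative value).
network = ActivationNetwork(
    activation={
        "automobiles": 0.1,
        "wildlife": 0.1,
        "jaguar_car": 0.1,
        "jaguar_animal": 0.1,
    },
    edges={
        "automobiles": [("jaguar_car", 1.0)],
        "wildlife": [("jaguar_animal", 1.0)],
    },
)

# Evidence sources named in the claim, reduced here to made-up numeric boosts.
network.boost("automobiles", 1.0)  # metadata attached to the document
network.boost("automobiles", 0.5)  # semantic content of a user comment
network.boost("wildlife", 0.2)     # a weaker hint from the ontology

network.spread(rounds=1)

# The spotted name "Jaguar" corresponds to two categorical nodes.
result = network.disambiguate(["jaguar_car", "jaguar_animal"])
print(result)
```

In this sketch the car sense is selected because most of the evidence boosts the "automobiles" node, which in turn influences "jaguar_car". A real system would derive the boosts and edge weights from the data corpus rather than hard-coding them.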
