METHOD FOR DISAMBIGUATED FEATURES IN UNSTRUCTURED TEXT

  • US 20150154286A1
  • Filed: 12/02/2014
  • Published: 06/04/2015
  • Est. Priority Date: 12/02/2013
  • Status: Active Grant
First Claim

1. A method comprising:

    searching, by a node of a system hosting an in-memory database, a set of candidate records to identify one or more candidates matching one or more extracted features, wherein an extracted feature that matches a candidate is a primary feature;

    associating, by the node, each of the extracted features with one or more machine-generated topic identifiers (“topic IDs”);

    disambiguating, by the node, each of the primary features from one another based on relatedness of topic IDs;

    identifying, by the node, a set of secondary features associated with each primary feature based upon the relatedness of topic IDs;

    disambiguating, by the node, each of the primary features from each of the secondary features in the associated set of secondary features based on relatedness of topic IDs;

    linking, by the node, each primary feature to the associated set of secondary features to form a new cluster;

    determining, by the node, whether the new cluster matches an existing knowledgebase cluster, wherein, when there is a match, determining, by the disambiguation module of the in-memory database server computer, an existing unique identifier (“unique ID”) corresponding to each matching primary feature in the knowledgebase cluster and updating the knowledgebase cluster to include the new cluster; and

    when there is no match, creating, by the node, a new knowledgebase cluster and assigning a new unique ID to the primary feature of the new knowledgebase cluster; and

    transmitting, by the node, one of the existing unique ID and the new unique ID for the primary feature.
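
The flow recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it (Feature, Cluster, KnowledgeBase, relatedness, disambiguate) is hypothetical. The sketch assumes that relatedness of topic IDs can be approximated by Jaccard overlap of topic-ID sets, that a feature is primary when its text appears in the candidate set, and that a new cluster matches an existing one when the primary texts are equal and their topic IDs are sufficiently related. The claim's separate pairwise disambiguation steps are collapsed here into that single overlap comparison.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class Feature:
    text: str
    topic_ids: frozenset  # machine-generated topic IDs for this feature


@dataclass
class Cluster:
    primary: Feature
    secondary: set = field(default_factory=set)
    unique_id: Optional[int] = None


def relatedness(a: Feature, b: Feature) -> float:
    """Jaccard overlap of topic-ID sets (an assumed measure, not from the claim)."""
    union = a.topic_ids | b.topic_ids
    return len(a.topic_ids & b.topic_ids) / len(union) if union else 0.0


class KnowledgeBase:
    """Hypothetical store of knowledgebase clusters, each with a unique ID."""

    def __init__(self, threshold: float = 0.5):
        self.clusters = []
        self.threshold = threshold
        self._ids = count(1)

    def resolve(self, new: Cluster) -> int:
        # Match: same primary text and sufficiently related topic IDs.
        for existing in self.clusters:
            if (existing.primary.text == new.primary.text
                    and relatedness(existing.primary, new.primary) >= self.threshold):
                existing.secondary |= new.secondary  # update with the new cluster
                return existing.unique_id
        # No match: create a new knowledgebase cluster with a new unique ID.
        new.unique_id = next(self._ids)
        self.clusters.append(new)
        return new.unique_id


def disambiguate(features, candidates, kb, threshold=0.5):
    """Return {primary feature: unique ID} for the extracted features."""
    primaries = [f for f in features if f.text in candidates]
    others = [f for f in features if f.text not in candidates]
    result = {}
    for p in primaries:
        # Secondary features are the non-primary features related to p by topic.
        secondary = {s for s in others if relatedness(p, s) >= threshold}
        result[p] = kb.resolve(Cluster(p, secondary))
    return result


# Usage: two mentions of "Jaguar" with unrelated topic IDs.
kb = KnowledgeBase()
car = Feature("Jaguar", frozenset({1, 2}))
cat = Feature("Jaguar", frozenset({7, 8}))
engine = Feature("engine", frozenset({1, 2, 3}))
rainforest = Feature("rainforest", frozenset({7, 8, 9}))
ids = disambiguate([car, engine, cat, rainforest], {"Jaguar"}, kb)
```

In this sketch the two "Jaguar" mentions share no topic IDs, so they land in separate clusters with separate unique IDs. A later mention whose topic IDs overlap the first cluster's primary would reuse that cluster's existing ID and extend its secondary features.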
