Method and system for robust tagging of named entities in the presence of source or translation errors

  • US 10,073,673 B2
  • Filed: 07/14/2014
  • Issued: 09/11/2018
  • Est. Priority Date: 07/14/2014
  • Status: Active Grant
First Claim

1. An electronic device comprising:

  • a storage device configured to store a plurality of named entities collected from a plurality of sources, wherein each of the named entities are tokenized into a common format of named entity tokens, wherein each of the named entities are associated with a label, and wherein each of the named entity tokens are one of a word or a syllable of a word; and

  • one or more processors configured to convert one or more textual communications from a natural language source into a computer readable format for reading and processing by the electronic device, the one or more processors comprising a tagging apparatus configured to:

    • receive the one or more textual communications,
    • identify each of the one or more textual communications,
    • tokenize the one or more textual communications into a common format of textual tokens corresponding to a prefix tree,
    • match, as a function of a selection of a cluster on a deepest level of a multi-level harmonized clustering structure of tokens, the textual tokens with one or more of the named entity tokens stored in the storage device, in order to assign the textual tokens to the labels associated with each of the named entities,
    • tag the one or more textual communications based on the matching between the textual tokens and the named entity tokens, in order to identify an intended meaning of each of the one or more textual communications, wherein the intended meaning of the received one or more textual communications are identified based on an extracted conceptual concept from the one or more textual communications,
    • identify the intended meaning of the one or more textual communications based on applying the tags to the one or more textual communications; and
    • replace an error of the one or more converted textual communications with the identified intended meaning of the one or more textual communications, when the error is detected.
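
The tokenize, match, and tag steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes word-level tokens, lowercasing and whitespace splitting as the "common format", and a plain prefix tree with greedy longest-match lookup. The multi-level harmonized clustering structure and the error-replacement step are not modeled. All names (tokenize, TrieNode, build_trie, tag) and the sample entities are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    """Assumed 'common format': lowercase, whitespace-separated word tokens.

    Punctuation is not stripped, so "paris." would not match "paris".
    """
    return text.lower().split()


class TrieNode:
    """One node of a prefix tree keyed by tokens (not characters)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set where a stored entity ends


def build_trie(entities: Dict[str, str]) -> TrieNode:
    """Store each labeled named entity as a path of tokens in the trie."""
    root = TrieNode()
    for name, label in entities.items():
        node = root
        for token in tokenize(name):
            node = node.children.setdefault(token, TrieNode())
        node.label = label
    return root


def tag(text: str, root: TrieNode) -> List[Tuple[str, str]]:
    """Return (matched tokens, label) pairs, preferring the longest match."""
    tokens = tokenize(text)
    tags: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        node = root
        best_end: Optional[int] = None
        best_label: Optional[str] = None
        j = i
        # Walk down the trie as long as the next textual token continues a path.
        while j < len(tokens) and tokens[j] in node.children:
            node = node.children[tokens[j]]
            j += 1
            if node.label is not None:
                best_end, best_label = j, node.label
        if best_end is None or best_label is None:
            i += 1  # no entity starts at this token
        else:
            tags.append((" ".join(tokens[i:best_end]), best_label))
            i = best_end  # resume after the matched entity
    return tags


# Hypothetical sample data, for illustration only.
ENTITIES = {"New York": "CITY", "New York Times": "ORG", "Paris": "CITY"}
TRIE = build_trie(ENTITIES)
```

With these sample entities, tag("read the New York Times today", TRIE) prefers the longer entity "new york times" over its prefix "new york", because the walk continues down the tree until no further token matches and keeps the last labeled node it passed.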
