Automatic extraction of named entities from texts

  • US 9,588,960 B2
  • Filed: 10/07/2014
  • Issued: 03/07/2017
  • Est. Priority Date: 01/15/2014
  • Status: Active Grant
First Claim

1. A method comprising:

    identifying, by a processor, a set of training texts;

    extracting, by the processor, a respective set of features for each of the training texts;

    training, by the processor, a classification model using the training texts and the extracted features;

    extracting, by the processor, a token from a natural language text;

    identifying, by the processor, a set of token attributes associated with the token based on a semantic-syntactic analysis of the natural language text, wherein the set of token attributes comprises at least one of a lexical attribute, a syntactic attribute, or a semantic attribute, and wherein the semantic-syntactic analysis of the natural language text comprises:

    generating, by the processor, a lexical-morphological structure of a sentence of the natural language text;

    identifying, by the processor, a syntactic tree using the lexical-morphological structure;

    generating, by the processor, a language-independent semantic structure based on the syntactic tree; and

    identifying, by the processor, the set of token attributes using the language-independent semantic structure;

    determining, by the processor, a category for the token based on the trained classification model and the set of token attributes; and

    generating, by the processor, a tagged representation of at least part of the natural language text, the tagged representation referencing the category for the token.
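
The overall flow of the claim (train a classifier on extracted features, derive attributes for a token, assign a category, emit tagged text) can be illustrated with a minimal sketch. This is not the patented implementation. The claim does not name a classifier, so a small naive Bayes model with add-one smoothing is assumed here. The semantic-syntactic analysis is not reproduced: the attribute dictionaries are written by hand as a stand-in for its output. All names (TRAINING, featurize, train, classify, tag_text), the attribute keys, and the category labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training examples: (token attributes, category).
# In the claim these attributes come from semantic-syntactic analysis;
# here they are hand-written placeholders.
TRAINING = [
    ({"capitalized": True, "role": "subject", "semantic_class": "HUMAN"}, "PERSON"),
    ({"capitalized": True, "role": "object", "semantic_class": "HUMAN"}, "PERSON"),
    ({"capitalized": True, "role": "subject", "semantic_class": "ORGANIZATION"}, "ORG"),
    ({"capitalized": True, "role": "modifier", "semantic_class": "ORGANIZATION"}, "ORG"),
    ({"capitalized": False, "role": "predicate", "semantic_class": "ACTION"}, "O"),
    ({"capitalized": False, "role": "object", "semantic_class": "ARTIFACT"}, "O"),
]


def featurize(attrs):
    """Turn an attribute dict into a list of 'name=value' feature strings."""
    return [f"{name}={value}" for name, value in sorted(attrs.items())]


def train(examples):
    """Count labels and features for a naive Bayes model (assumed classifier)."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for attrs, label in examples:
        label_counts[label] += 1
        for feature in featurize(attrs):
            feature_counts[label][feature] += 1
            vocab.add(feature)
    return label_counts, feature_counts, vocab


def classify(model, attrs):
    """Return the category with the highest smoothed log-probability."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[label].values()) + len(vocab)
        for feature in featurize(attrs):
            score += math.log((feature_counts[label][feature] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def tag_text(model, tokens_with_attrs):
    """Build a tagged representation; tokens in category 'O' stay untagged."""
    parts = []
    for token, attrs in tokens_with_attrs:
        category = classify(model, attrs)
        if category == "O":
            parts.append(token)
        else:
            parts.append(f"<{category}>{token}</{category}>")
    return " ".join(parts)


model = train(TRAINING)
sentence = [
    ("Alice", {"capitalized": True, "role": "subject", "semantic_class": "HUMAN"}),
    ("joined", {"capitalized": False, "role": "predicate", "semantic_class": "ACTION"}),
    ("Acme", {"capitalized": True, "role": "object", "semantic_class": "ORGANIZATION"}),
]
print(tag_text(model, sentence))
```

In this toy setup the semantic class dominates the decision, which loosely mirrors the claim's reliance on attributes drawn from a language-independent semantic structure rather than on surface form alone.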

  • 5 Assignments