
Identifying entities in semi-structured content

  • US 10,353,905 B2
  • Filed: 04/24/2015
  • Issued: 07/16/2019
  • Est. Priority Date: 04/24/2015
  • Status: Active Grant
First Claim

1. A system for identifying entities in semi-structured content, the system comprising:

    one or more processors; and

    a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:

    identify a sequence of tokens in the semi-structured content based on assigning an information score to each token;

    assign, by a first layer of classifications that is executed by a machine learning model of a database system, an entity type for each token in the sequence of tokens based on an entity score representing a probability that the token corresponds to the entity type, the entity type being one of a plurality of entity types and the entity score being a maximum score for correspondence between the token and any of the plurality of entity types;

    assign, by a second layer of classifications that is executed by the machine learning model, a structure score for each token in the sequence of tokens based on the token matching one of a plurality of structure types;

    re-assign, by the second layer, the corresponding entity type and corresponding entity score for each token in the sequence of tokens matching one of the structure types;

    assign, by a third layer of classifications that is executed by the machine learning model, a boundary type for each token in the sequence of tokens based on a boundary type score, the boundary type being one of a begin boundary type and a continue boundary type;

    identify, by a fourth layer of classifications that is executed by the machine learning model, an entity based on:

    i) the entity type and the boundary type for each token, and

    ii) the structure score for each token; and

    output the sequence of tokens as an identified set of entities based on the identified entity.
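
The layered flow recited in claim 1 can be illustrated with a small toy sketch. This is not the patented implementation: the claim names a machine learning model, whereas the code below substitutes hand-written rules so that it runs on its own. Every name in it (`Token`, `toy_scorer`, `STRUCTURE_PATTERNS`, `identify_entities`, the entity types `NAME`, `COMPANY`, `EMAIL`, `OTHER`, and all score values) is a hypothetical chosen for illustration, and the "information score" step is approximated by dropping tokens that contain no letters or digits.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    text: str
    entity_type: str = "OTHER"
    entity_score: float = 0.0
    structure_score: float = 0.0
    boundary: str = "BEGIN"


def toy_scorer(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a learned classifier: type -> probability."""
    if text in {"Jane", "Doe"}:
        return {"NAME": 0.9, "COMPANY": 0.05, "OTHER": 0.05}
    if text in {"Acme", "Corp"}:
        return {"NAME": 0.1, "COMPANY": 0.8, "OTHER": 0.1}
    return {"NAME": 0.1, "COMPANY": 0.1, "OTHER": 0.8}


# Assumed structure types: (pattern, entity type to re-assign, score).
STRUCTURE_PATTERNS = [
    (re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"), "EMAIL", 1.0),
]


def tokenize(content: str) -> List[Token]:
    # Stand-in for the information score: keep tokens with a letter or digit.
    return [Token(t) for t in content.split() if any(c.isalnum() for c in t)]


def layer1_entity(tokens: List[Token]) -> None:
    # Entity type is the type with the maximum score for each token.
    for t in tokens:
        scores = toy_scorer(t.text)
        t.entity_type, t.entity_score = max(scores.items(), key=lambda kv: kv[1])


def layer2_structure(tokens: List[Token]) -> None:
    # Score structure matches and re-assign type and score for matching tokens.
    for t in tokens:
        for pattern, etype, score in STRUCTURE_PATTERNS:
            if pattern.fullmatch(t.text):
                t.structure_score = score
                t.entity_type, t.entity_score = etype, score
                break


def layer3_boundary(tokens: List[Token]) -> None:
    # Toy rule: a token continues an entity if it shares the previous type.
    prev = None
    for t in tokens:
        same = prev is not None and prev.entity_type == t.entity_type
        t.boundary = "CONTINUE" if same else "BEGIN"
        prev = t


def layer4_entities(tokens: List[Token]) -> List[Tuple[str, str]]:
    # Merge BEGIN/CONTINUE runs; structure-matched tokens stand alone.
    entities: List[Tuple[str, str]] = []
    for t in tokens:
        if t.entity_type == "OTHER":
            continue
        extend = (
            t.boundary == "CONTINUE"
            and t.structure_score == 0.0
            and entities
            and entities[-1][0] == t.entity_type
        )
        if extend:
            etype, text = entities[-1]
            entities[-1] = (etype, text + " " + t.text)
        else:
            entities.append((t.entity_type, t.text))
    return entities


def identify_entities(content: str) -> List[Tuple[str, str]]:
    tokens = tokenize(content)
    layer1_entity(tokens)
    layer2_structure(tokens)
    layer3_boundary(tokens)
    return layer4_entities(tokens)
```

Under these assumptions, `identify_entities("Jane Doe | Acme Corp | jane@acme.example")` groups the first two tokens into one `NAME` entity, the next two into one `COMPANY` entity, and treats the address as a standalone `EMAIL` entity because it matched a structure pattern in the second layer.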
