Identifying entities in email signature blocks

  • US 10,110,533 B2
  • Filed: 10/28/2014
  • Issued: 10/23/2018
  • Est. Priority Date: 10/28/2014
  • Status: Active Grant
First Claim

1. A system for identifying entities in email signature blocks, the apparatus comprising:

  • one or more processors; and

    a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:

    create a plurality of scores for each token, in a sequence of tokens from an email signature block, based on a corresponding independent probability distribution that has been previously trained for a plurality of entity types, wherein each token comprises one of a word, a punctuation symbol, and an end-of-line character, an entity being a part of one of a person name, a job title, an enterprise name, a telephone number, an email address, and a uniform resource locator, and being associated with at least one of an entity type, an entity sequence, and a set of entities;

    identify each entity sequence that has a total number of entities that is identical to a total number of tokens in the sequence of tokens;

    determine, for each of the identified entity sequences, an entity sequence score by combining corresponding scores for each token in the sequence of tokens, that corresponds to an entity type in an identified entity sequence;

    identify an entity sequence from the identified entity sequences with a highest entity sequence score; and

    output the sequence of tokens as an identified set of entities, in the email signature block, based on the entity sequence with the highest score.
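
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the entity type list, the `score_token` heuristic (a hand-written stand-in for the previously trained independent probability distributions), and the choice to combine scores by summing log-probabilities are all assumptions made for illustration. The function names `score_token` and `best_entity_sequence` are hypothetical.

```python
import itertools
import math

# Assumed, reduced set of entity types (the claim lists more, e.g. job title, URL).
ENTITY_TYPES = ("NAME", "PHONE", "EMAIL", "OTHER")


def score_token(token: str) -> dict[str, float]:
    """Return one score per entity type for a single token.

    Hypothetical stand-in for trained per-type probability distributions:
    a simple heuristic picks a favored type and spreads the remaining
    probability mass evenly over the other types.
    """
    if "@" in token:
        favored = {"EMAIL": 0.9}
    elif any(ch.isdigit() for ch in token):
        favored = {"PHONE": 0.8}
    elif token.isalpha() and token[0].isupper():
        favored = {"NAME": 0.7}
    else:
        favored = {"OTHER": 0.7}
    rest = (1.0 - sum(favored.values())) / (len(ENTITY_TYPES) - len(favored))
    return {etype: favored.get(etype, rest) for etype in ENTITY_TYPES}


def best_entity_sequence(tokens: list[str]):
    """Find the highest-scoring entity sequence for a token sequence.

    Every candidate sequence has exactly as many entities as there are
    tokens. Its score combines the per-token scores for the entity type
    assigned to each position (here: a sum of log-probabilities, which is
    an assumed way of "combining").
    """
    per_token = [score_token(tok) for tok in tokens]
    best_seq, best_score = None, -math.inf
    # Exhaustive enumeration: len(ENTITY_TYPES) ** len(tokens) candidates.
    for seq in itertools.product(ENTITY_TYPES, repeat=len(tokens)):
        score = sum(math.log(per_token[i][etype]) for i, etype in enumerate(seq))
        if score > best_score:
            best_seq, best_score = seq, score
    # Output the tokens labeled according to the best entity sequence.
    return list(zip(tokens, best_seq)), best_score


if __name__ == "__main__":
    signature = ["Jane", "Doe", "\n", "555-0100", "\n", "jane@example.com"]
    labeled, total = best_entity_sequence(signature)
    for token, etype in labeled:
        print(repr(token), etype)
```

Because the per-token scores in this sketch are independent, the exhaustive search picks the same labels as taking the best type for each token separately; the enumeration is kept only to mirror the wording of the claim, and it grows exponentially with the number of tokens.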
