SYSTEM AND METHOD FOR ENTITY EXTRACTION FROM SEMI-STRUCTURED TEXT DOCUMENTS

  • US 20170300565A1
  • Filed: 04/14/2016
  • Published: 10/19/2017
  • Est. Priority Date: 04/14/2016
  • Status: Active Grant
First Claim

1. A method for extracting entities from a text document comprising:

    for at least a section of a text document, providing a first set of entities extracted from the at least a section;

    clustering at least a subset of the extracted entities in the first set into clusters, based on locations of the entities in the document;

    identifying complete clusters of entities from the clusters;

    learning patterns for extracting new entities based on the complete clusters; and

    extracting new entities from incomplete clusters based on the learned patterns, wherein at least one of the providing of the first set of entities, identifying complete clusters, learning patterns and extracting new entities is performed with a processor device.
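
The steps recited in claim 1 can be illustrated with a small Python sketch. This is only a toy reading of the claim language, not the implementation described in the patent. The claim does not say how locations are compared, what makes a cluster complete, or what form the learned patterns take, so the sketch assumes the following: entities are clustered by line proximity, a cluster is complete when it contains every label in a required set, and a pattern is the literal text that precedes an entity on its line. All names (`Entity`, `cluster_by_location`, `is_complete`, `learn_patterns`, `extract_new`, `max_gap`, `REQUIRED`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str  # entity type, e.g. "name" or "phone" (assumed labels)
    text: str   # surface string of the entity
    line: int   # 0-based line index, used here as the entity's "location"


def cluster_by_location(entities, max_gap=1):
    """Group entities whose line indices are within max_gap of the previous one."""
    clusters, current = [], []
    for ent in sorted(entities, key=lambda e: e.line):
        if current and ent.line - current[-1].line > max_gap:
            clusters.append(current)
            current = []
        current.append(ent)
    if current:
        clusters.append(current)
    return clusters


def is_complete(cluster, required):
    """Assumed completeness test: the cluster holds every required label."""
    return required <= {e.label for e in cluster}


def learn_patterns(lines, complete_clusters):
    """Assumed pattern form: the literal line prefix preceding each entity."""
    patterns = {}
    for cluster in complete_clusters:
        for ent in cluster:
            idx = lines[ent.line].find(ent.text)
            if idx > 0:
                patterns.setdefault(ent.label, set()).add(lines[ent.line][:idx])
    return patterns


def extract_new(lines, cluster, patterns, required, max_gap=1):
    """Look for missing labels on lines in or next to an incomplete cluster."""
    present = {e.label for e in cluster}
    lo = max(0, min(e.line for e in cluster) - max_gap)
    hi = min(len(lines) - 1, max(e.line for e in cluster) + max_gap)
    found = []
    for label in sorted(required - present):
        for i in range(lo, hi + 1):
            for prefix in patterns.get(label, ()):
                value = lines[i][len(prefix):].strip()
                if lines[i].startswith(prefix) and value:
                    found.append(Entity(label, value, i))
    return found


# Toy document: the first extractor missed the phone number of the second record.
lines = ["Name: Alice Martin", "Phone: 555-0101", "", "Name: Bob Chen", "Phone: 555-0102"]
first_set = [
    Entity("name", "Alice Martin", 0),
    Entity("phone", "555-0101", 1),
    Entity("name", "Bob Chen", 3),
]
REQUIRED = {"name", "phone"}

clusters = cluster_by_location(first_set)
complete = [c for c in clusters if is_complete(c, REQUIRED)]
incomplete = [c for c in clusters if not is_complete(c, REQUIRED)]
patterns = learn_patterns(lines, complete)
new_entities = [e for c in incomplete for e in extract_new(lines, c, patterns, REQUIRED)]
print(new_entities)
```

In this toy run the first record forms a complete cluster, the prefixes "Name: " and "Phone: " are learned from it, and the "Phone: " prefix recovers the phone entity missing from the second record.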
