Methods and systems for extracting information from text

  • US 9,110,852 B1
  • Filed: 07/20/2012
  • Issued: 08/18/2015
  • Est. Priority Date: 07/20/2012
  • Status: Active Grant
First Claim

1. A computer-implemented method for automatically identifying entity-value pairs from text, the method comprising the following operations performed by at least one processor:

    receiving an electronic text file including a text corpus comprising a plurality of words;

    generating, by parsing the text corpus, a corresponding parse tree structure in memory, including nodes of each of the plurality of words having edges based on the parts of speech of the plurality of words;

    identifying a plurality of entity-value pairs in the text corpus that correspond to a predetermined entity and a predetermined value related to the predetermined entity by a predetermined attribute, wherein each of the entity-value pairs comprises an entity and a value;

    extracting, based on the parse tree structure, a plurality of parse tree paths to traverse the tree structure from a node corresponding to the entity to a node corresponding to the value of the plurality of entity-value pairs;

    generating a data record including an indication of how accurately the extracted plurality of parse tree paths correspond to the predetermined attribute, based on at least one of the plurality of parse tree paths; and

    validating an entity-value pair based on the data record.
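
The operations recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the hand-written trees, the edge labels, the function names (path_between, score_paths, validate), the frequency-based score, and the 0.5 threshold are all assumptions made for illustration, and each word is assumed to occur only once per sentence. A real system would obtain the parse tree from a parser.

```python
from collections import Counter

def path_between(tree, start, end):
    """Return the edge labels walked from `start` up to the common ancestor
    ('^' prefix) and then down to `end` ('v' prefix)."""
    def chain(node):
        # Nodes from `node` up to the root, inclusive.
        nodes = [node]
        while node in tree:
            node = tree[node][0]
            nodes.append(node)
        return nodes

    up, down = chain(start), chain(end)
    common = next(n for n in up if n in down)
    up_labels = ["^" + tree[n][1] for n in up[:up.index(common)]]
    down_labels = ["v" + tree[n][1] for n in reversed(down[:down.index(common)])]
    return tuple(up_labels + down_labels)

def score_paths(examples):
    """Data record: each path mapped to the share of seed pairs that use it.
    (Frequency scoring is an assumption, not taken from the claim.)"""
    counts = Counter(path_between(tree, e, v) for tree, e, v in examples)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}

def validate(record, tree, entity, value, threshold=0.5):
    """Accept a candidate pair if its path scores at or above the threshold."""
    return record.get(path_between(tree, entity, value), 0.0) >= threshold

# Toy parse trees: child word -> (parent word, edge label).
t1 = {"Acme": ("acquired", "nsubj"), "Globex": ("acquired", "dobj")}
t2 = {"Initech": ("bought", "nsubj"), "Hooli": ("bought", "dobj")}
t3 = {"Umbrella": ("grew", "nsubj"), "rivals": ("Umbrella", "relcl"),
      "Stark": ("rivals", "dobj")}

# Seed entity-value pairs for a hypothetical "acquired company" attribute.
seeds = [(t1, "Acme", "Globex"), (t2, "Initech", "Hooli"),
         (t3, "Umbrella", "Stark")]
record = score_paths(seeds)

# A new sentence whose pair follows the dominant subject-to-object path.
t4 = {"Piper": ("purchased", "nsubj"), "Aviato": ("purchased", "dobj")}
accepted = validate(record, t4, "Piper", "Aviato")
rejected = validate(record, t3, "Umbrella", "Stark")
```

In this sketch the data record is a plain dictionary keyed by path, so a candidate pair is validated by one lookup; paths never seen among the seed pairs score 0.0 and are rejected.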
