Automated Extraction of Bio-Entity Relationships from Literature

  • US 20130339005A1
  • Filed: 08/20/2013
  • Published: 12/19/2013
  • Est. Priority Date: 03/30/2012
  • Status: Active Grant
First Claim

1. One or more non-transitory, tangible computer-readable media having computer-executable instructions for performing a method by running a software program on a computer, the computer operating under an operating system, the method including issuing instructions from the software program to extract semantic bio-entity relationships or patterns from non-annotated data by natural language processing and graph theoretic algorithm, the instructions comprising:

    receiving a plurality of known bio-entity strings and a plurality of interaction word strings;

    receiving annotated text as training data that contains true and false patterns;

    automatically building a decision support tool based on said true and false patterns to which said non-annotated data can be parsed, said decision support tool including at least a first level and a second level, said first level having a first decision node, said second level having a second decision node, said first and second decision nodes each associated with at least a portion of said true and false patterns;

    receiving said non-annotated data;

    extracting a textual clause of said non-annotated data that contains non-triplet word strings and at least one triplet, said at least one triplet including a first bio-entity, a second bio-entity, and an interaction word;

    automatically parsing said extracted textual clause through said decision support tool to obtain a plurality of components based on dependencies among said plurality of components;

    extracting said at least one triplet from said plurality of components by attempting to match said plurality of components of said parsed, extracted textual clause to said first level of said decision support tool;

    identifying extraction of said at least one triplet as true if said plurality of components matches said first level of said decision support tool;

    identifying extraction of said at least one triplet as false if said plurality of components fails to match said first level of said decision support tool;

    as a result of said plurality of components failing to match said first level of said decision support tool, extracting said at least one triplet from said plurality of components by attempting to match said plurality of components to said second level of said decision support tool;

    identifying extraction of said at least one triplet as true if said plurality of components matches said second level of said decision support tool, said second level of said decision support tool being a simplified pattern of said first level of said decision support tool to capture textual clauses that are not identical to said extracted textual clause; and

    identifying extraction of said at least one triplet as false if said plurality of components fails to match said second level of said decision support tool.
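
The two-level matching recited above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes whitespace tokenization and token-order patterns in place of the dependency-based parsing the claim recites, and it assumes the "simplified pattern" of the second level is obtained by dropping all non-triplet words. The names `full_pattern`, `simplified_pattern`, `build_tool`, and `classify`, as well as the sample entities, interaction words, and training clauses, are hypothetical and chosen only for illustration.

```python
# Placeholders that stand in for triplet members inside a pattern.
ENTITY, INTERACTION = "ENT", "IW"


def full_pattern(tokens, entities, interactions):
    """Level-1 pattern: triplet members become placeholders, other words are kept."""
    return tuple(
        ENTITY if t in entities else INTERACTION if t in interactions else t.lower()
        for t in tokens
    )


def simplified_pattern(pattern):
    """Level-2 pattern (assumed): keep only placeholder order, drop non-triplet words."""
    return tuple(p for p in pattern if p in (ENTITY, INTERACTION))


def build_tool(training, entities, interactions):
    """Build a two-level tool from (clause, is_true) pairs of annotated training text."""
    accepted = [set(), set()]
    rejected = [set(), set()]
    for clause, is_true in training:
        p1 = full_pattern(clause.split(), entities, interactions)
        target = accepted if is_true else rejected
        target[0].add(p1)
        target[1].add(simplified_pattern(p1))
    # A pattern counts as true at a level only if no false example shares it.
    return [accepted[0] - rejected[0], accepted[1] - rejected[1]]


def classify(clause, tool, entities, interactions):
    """Return (triplet, verdict, matched_level); level 2 is tried only if level 1 fails."""
    tokens = clause.split()
    triplet = tuple(t for t in tokens if t in entities or t in interactions)
    p1 = full_pattern(tokens, entities, interactions)
    if p1 in tool[0]:
        return triplet, True, 1
    if simplified_pattern(p1) in tool[1]:
        return triplet, True, 2
    return triplet, False, None


# Hypothetical inputs: known bio-entity strings, interaction words, annotated clauses.
entities = {"BRCA1", "RAD51", "TP53", "MDM2"}
interactions = {"binds", "inhibits"}
training = [
    ("BRCA1 directly binds RAD51", True),
    ("Neither TP53 nor MDM2 inhibits growth", False),
]
tool = build_tool(training, entities, interactions)

exact = classify("TP53 directly inhibits MDM2", tool, entities, interactions)
fallback = classify("TP53 strongly inhibits MDM2", tool, entities, interactions)
miss = classify("Neither BRCA1 nor RAD51 binds DNA", tool, entities, interactions)
```

Under these assumptions, `exact` matches at the first level, `fallback` fails the first level but matches the simplified second-level pattern, and `miss` fails both levels and is identified as false.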
