
Extracting structured knowledge from unstructured text

  • US 9,110,882 B2
  • Filed: 05/12/2011
  • Issued: 08/18/2015
  • Est. Priority Date: 05/14/2010
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for extracting structured knowledge from unstructured text for use in a knowledge representation system, the knowledge representation system comprising a knowledge base that represents knowledge using a structured, machine-readable format, the structured, machine-readable format comprising fact triples, the method comprising:

  • identifying sentences in the unstructured text using one or more computing devices;

    using the one or more computing devices, converting each of a subset of the sentences to one or more simplified assertion statements of the form: subject noun phrase, verb phrase, object noun phrase;

    converting each of a subset of the simplified assertion statements to a corresponding fact triple using the one or more computing devices, each fact triple being constructed from three knowledge base objects, the three knowledge base objects comprising two entity objects and a relationship object expressing a relationship between the two entity objects;

    using the one or more computing devices, grouping the fact triples into a plurality of quarantine groups such that each of the fact triples is included in more than one of the quarantine groups, each quarantine group being defined by a corresponding one of a plurality of fact characteristics, a first one of the fact characteristics being that all of the fact triples in the corresponding quarantine group include a same one of the entity objects, a second one of the fact characteristics being that all of the fact triples in the corresponding quarantine group include a same one of the relationship objects;

    determining a reliability for each quarantine group with reference to the knowledge base;

    determining that more than one of the quarantine groups in which a first fact triple is included has at least a specified reliability; and

    classifying the first fact triple as a reliable fact triple in response to determining that more than one of the quarantine groups in which the first fact triple is included has at least the specified reliability.
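
The grouping and classification steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `FactTriple`, `quarantine_groups`, `group_reliability`, and `reliable_triples` are hypothetical, as are the 0.5 threshold and the sample facts. The claim only says that reliability is determined "with reference to the knowledge base"; the sketch assumes, purely for illustration, that a group's reliability is the fraction of its fact triples already present in the knowledge base.

```python
from collections import defaultdict
from typing import NamedTuple


class FactTriple(NamedTuple):
    """Two entity objects joined by a relationship object (hypothetical type)."""
    left: str
    relation: str
    right: str


def quarantine_groups(triples):
    """Group triples by shared entity object and by shared relationship object.

    Every triple lands in more than one group: one per entity it mentions
    and one for its relationship.
    """
    groups = defaultdict(set)
    for t in triples:
        groups[("entity", t.left)].add(t)
        groups[("entity", t.right)].add(t)
        groups[("relation", t.relation)].add(t)
    return groups


def group_reliability(group, knowledge_base):
    """ASSUMPTION: reliability = share of the group's triples already in the KB."""
    return sum(t in knowledge_base for t in group) / len(group)


def reliable_triples(triples, knowledge_base, threshold=0.5):
    """Return triples for which more than one quarantine group meets the threshold."""
    groups = quarantine_groups(triples)
    scores = {key: group_reliability(members, knowledge_base)
              for key, members in groups.items()}
    reliable = set()
    for t in set(triples):
        keys = {("entity", t.left), ("entity", t.right), ("relation", t.relation)}
        if sum(scores[k] >= threshold for k in keys) > 1:
            reliable.add(t)
    return reliable


# Illustrative data only: two facts already known, two newly extracted.
t1 = FactTriple("paris", "is_capital_of", "france")
t2 = FactTriple("madrid", "is_located_in", "spain")
t3 = FactTriple("madrid", "is_capital_of", "spain")
t4 = FactTriple("banana", "is_capital_of", "mars")

knowledge_base = {t1, t2}
extracted = [t1, t2, t3, t4]
result = reliable_triples(extracted, knowledge_base)
```

With this sample data, `t3` is classified as reliable because two of its groups (the `madrid` and `spain` entity groups) reach the threshold, while `t4` is not, since none of its groups does. A single reliable group is not enough, which mirrors the "more than one of the quarantine groups" condition in the claim.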
