IDENTIFYING GLOSSARY TERMS FROM NATURAL LANGUAGE TEXT DOCUMENTS

  • US 20140163966A1
  • Filed: 11/27/2013
  • Published: 06/12/2014
  • Est. Priority Date: 12/06/2012
  • Status: Active Grant
First Claim

1. A device, comprising:

  • one or more processors to:

    obtain text of a document to be analyzed to identify glossary terms included in the text;

    perform a linguistic unit analysis on a linguistic unit, included in the text, to generate a plurality of ambiguous linguistic units from the linguistic unit;

    resolve the plurality of ambiguous linguistic units to generate a set of potential glossary terms that includes a subset of the plurality of ambiguous linguistic units;

    perform a glossary term analysis on the set of potential glossary terms to generate a set of glossary terms that includes a subset of the set of potential glossary terms;

    identify a set of included terms, of the set of potential glossary terms, that are included in the set of glossary terms;

    identify a set of excluded terms, of the set of potential glossary terms, that are excluded from the set of glossary terms;

    determine a semantic relatedness score between at least one excluded term, of the set of excluded terms, and at least one included term, of the set of included terms;

    selectively add the excluded linguistic term to the set of glossary terms to form a final set of glossary terms based on the semantic relatedness score; and

    output the final set of glossary terms for the document.
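
The claimed steps can be read as a filtering pipeline with a final "rescue" pass for rejected terms. The following is a minimal Python sketch of that flow, not the patented implementation. Every name in it (`candidate_units`, `resolve`, `relatedness`, `identify_glossary_terms`, `STOPWORDS`) is hypothetical, and each stage uses a deliberately simple stand-in: stopword-delimited chunks expanded into sub-phrases for the linguistic unit analysis, a nested-phrase frequency rule for resolution, a frequency threshold for the glossary term analysis, and Jaccard word overlap in place of a real semantic relatedness measure. The claim itself does not specify any of these techniques.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller resource.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "be", "shall"}

def candidate_units(text):
    """Linguistic unit analysis (toy): split on stopwords and punctuation,
    then expand each chunk into all contiguous sub-phrases (ambiguous units)."""
    tokens = re.findall(r"[a-z]+|[.,;:!?]", text.lower())
    chunks, current = [], []
    for tok in tokens:
        if tok in STOPWORDS or not tok.isalpha():
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        chunks.append(current)
    units = []
    for chunk in chunks:
        for i in range(len(chunk)):
            for j in range(i + 1, len(chunk) + 1):
                units.append(" ".join(chunk[i:j]))
    return units

def resolve(units):
    """Resolution (toy): drop a unit that only ever occurs inside one
    longer unit, i.e. a longer unit contains it with the same count."""
    counts = Counter(units)
    def nested(u):
        return any(u != v and f" {u} " in f" {v} " and counts[v] == counts[u]
                   for v in counts)
    return {u: c for u, c in counts.items() if not nested(u)}

def relatedness(a, b):
    """Stand-in for a semantic relatedness score: Jaccard word overlap."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def identify_glossary_terms(text, min_count=2, threshold=0.5):
    potential = resolve(candidate_units(text))
    # Glossary term analysis (toy): keep terms seen at least min_count times.
    included = {t for t, c in potential.items() if c >= min_count}
    excluded = set(potential) - included
    # Selectively add excluded terms that are related to an included term.
    rescued = {e for e in excluded
               if any(relatedness(e, i) >= threshold for i in included)}
    return sorted(included | rescued)

if __name__ == "__main__":
    doc = ("The payment gateway shall be in the network. "
           "The payment gateway is for the audit log. "
           "An audit trail is in the backup vault.")
    print(identify_glossary_terms(doc))
    # ['audit', 'audit log', 'audit trail', 'payment gateway']
```

In this example, "audit log" and "audit trail" each occur only once and are excluded by the frequency rule, but both share a word with the included term "audit" and are added back. "network" and "backup vault" have no overlap with any included term and stay excluded.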
