Identifying a stale data source to improve NLP accuracy

  • US 10,387,468 B2
  • Filed: 03/14/2013
  • Issued: 08/20/2019
  • Est. Priority Date: 03/12/2013
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving a query for processing by a natural language processing (NLP) system comprising a corpus containing data ingested from a plurality of data sources, wherein the data is formatted and stored into one or more objects and organized based on topic changes;

    identifying a data source expected to contain an answer to the query using NLP, by:

    dividing words in the query into different elements, generating an annotation for each of the elements using the NLP system by determining a particular topic describing each of the elements, and identifying a previously ingested data source in the corpus that is associated with previously-generated annotations matching the generated annotations for the elements;

    upon determining that the previously ingested data in the corpus does not contain the answer to the query, determining whether new material has been added to the identified data source since the last time the identified data source was ingested into the corpus;

    upon determining that new material has been added to the identified data source since the last time the identified data source was ingested into the corpus:

    re-ingesting the identified data source using one or more computer processors whereby the new material is inserted into the corpus; and

    processing the query to determine a lexical answer type for the query, based at least in part on a concept assigned to each of the elements, wherein the concepts were determined and assigned using NLP, and wherein the lexical answer type is a word or noun phrase that predicts a type of an answer to the query; and

    generating an answer to the query based on the new material inserted into the corpus and based on the lexical answer type.
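
The flow recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation. It is a toy model under stated assumptions: the keyword lexicon `TOPIC_LEXICON` stands in for NLP annotation, substring matching stands in for answer search, a document count stands in for detecting new material, and all names (`DataSource`, `Corpus`, `answer_query`, and the helpers) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical keyword-to-topic lexicon; a stand-in for real NLP annotation.
TOPIC_LEXICON = {"penicillin": "medicine", "antibiotic": "medicine",
                 "planet": "astronomy", "orbit": "astronomy"}
# Hypothetical mapping from question word to lexical answer type (LAT).
LAT_BY_QUESTION_WORD = {"who": "person", "when": "date", "where": "location"}


@dataclass
class DataSource:
    name: str
    topics: set[str]          # annotations generated at an earlier ingestion
    documents: list[str]      # current content available at the source
    ingested_count: int = 0   # number of documents present at last ingestion


@dataclass
class Corpus:
    passages: dict[str, list[str]] = field(default_factory=dict)

    def ingest(self, source: DataSource) -> None:
        """(Re-)ingest a source, copying its current documents into the corpus."""
        self.passages[source.name] = list(source.documents)
        source.ingested_count = len(source.documents)


def annotate(query: str) -> dict[str, str]:
    """Divide the query into elements and annotate known elements with a topic."""
    elements = query.lower().strip("?.! ").split()
    return {e: TOPIC_LEXICON[e] for e in elements if e in TOPIC_LEXICON}


def find_source(annotations: dict[str, str],
                sources: list[DataSource]) -> Optional[DataSource]:
    """Return the first source whose stored annotations cover the query's topics."""
    topics = set(annotations.values())
    for source in sources:
        if topics and topics <= source.topics:
            return source
    return None


def search(corpus: Corpus, source: DataSource, terms: list[str]) -> Optional[str]:
    """Return the first ingested passage that mentions every annotated element."""
    for passage in corpus.passages.get(source.name, []):
        if all(term in passage.lower() for term in terms):
            return passage
    return None


def answer_query(query: str, corpus: Corpus,
                 sources: list[DataSource]) -> Optional[tuple[str, str]]:
    annotations = annotate(query)
    source = find_source(annotations, sources)
    if source is None:
        return None
    terms = list(annotations)
    hit = search(corpus, source, terms)
    # The corpus lacks the answer: re-ingest only if the source has grown.
    if hit is None and len(source.documents) > source.ingested_count:
        corpus.ingest(source)
        hit = search(corpus, source, terms)
    if hit is None:
        return None
    first_word = query.lower().split()[0]
    lat = LAT_BY_QUESTION_WORD.get(first_word, "thing")
    return lat, hit


# Usage: the source gains new material after its first ingestion.
source = DataSource("med-wiki", {"medicine"}, ["Aspirin relieves pain."])
corpus = Corpus()
corpus.ingest(source)
source.documents.append("Alexander Fleming discovered penicillin in 1928.")
print(answer_query("Who discovered penicillin?", corpus, [source]))
```

In this sketch the stale source is re-ingested only when the query cannot be answered from the corpus and the source has grown, mirroring the two "upon determining" conditions of the claim. A real system would derive the lexical answer type from the assigned concepts rather than from the question word alone.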
