Error correction in tables using a question and answer system

  • US 9,830,314 B2
  • Filed: 11/18/2013
  • Issued: 11/28/2017
  • Est. Priority Date: 11/18/2013
  • Status: Expired due to Fees
First Claim

1. A method, in a data processing system comprising a processor and a memory, for performing tabular data correction in a document, the method comprising:

  • configuring the processor to implement a natural language processing (NLP) system that performs natural language processing on natural language content at least by processing logical relationships in the natural language content;

    responsive to receiving a natural language document with table structures and functional dependencies identified therein;

    configuring an erroneous data value analysis engine to analyze a portion of content within a natural language document to identify an erroneous sub-portion within the natural language document comprising an erroneous or missing item of information;

    analyzing, by the erroneous data value analysis engine executed by the processor in the data processing system, a portion of content within a natural language document to identify an erroneous sub-portion within the natural language document comprising an erroneous or missing item of information, wherein the portion of content comprises a table data structure present in the natural language document, wherein the erroneous sub-portion comprises a cell in the table data structure having an erroneous or missing data value, wherein the erroneous sub-portion is identified based on the erroneous sub-portion failing to conform with a regular structure associated with the portion of content within the natural language document, and wherein the regular structure is a repeatable pattern within the portion of content within the natural language document;

    configuring a question generation engine to generate a semantic signature for the erroneous sub-portion;

    generating, by the question generation engine executed by the processor in the data processing system, the semantic signature for the erroneous sub-portion, wherein generating the semantic signature for the erroneous sub-portion comprises:

    performing, by the question generation engine, a discovery of functional dependencies between the erroneous or missing data value in the cell of the table data structure and a second portion of content, wherein the functional dependencies are indicated by the repeatable pattern;

    analyzing, by the question generation engine, context information surrounding the erroneous sub-portion of content in the natural language document utilizing the functional dependencies between the erroneous or missing data value in the cell of the table data structure and a second portion of content; and

    converting, by the question generation engine, the context information into a narrated statement using a narration mechanism;

    configuring a Question and Answer (QA) system to generate a query based on the semantic signature and apply the query to a knowledge base to identify a candidate sub-portion of content for correcting the erroneous sub-portion;

    generating, by the Question and Answer (QA) system executed by the processor in the data processing system, the query based on the semantic signature;

    applying, by the QA system, the query to the knowledge base to identify the candidate sub-portion of content for correcting the erroneous sub-portion;

    configuring a correction engine to correct the erroneous sub-portion to generate a corrected natural language document and store the corrected natural language document;

    correcting, by the correction engine executed by the processor in the data processing system, the erroneous sub-portion using the identified candidate sub-portion of content to generate the corrected natural language document; and

    storing, by the correction engine, the corrected natural language document in a storage device.
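
The flow recited in claim 1 (detect a cell that breaks a repeatable pattern, build a semantic signature from its context, query a knowledge base, apply the candidate value) can be illustrated with a toy sketch. This is not the patented implementation. Every name below (SemanticSignature, find_erroneous_cells, build_signature, answer_query, correct_table, TABLE, KB) is hypothetical; the "repeatable pattern" is reduced to the dominant value type in a column; the functional dependency Country -> Capital is assumed to be given; and the QA system is stood in for by a plain dictionary lookup.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SemanticSignature:
    row: int
    column: str
    narration: str  # narrated statement built from the cell's context


def find_erroneous_cells(table, columns):
    """Flag cells that are missing or break the column's dominant value type."""
    flagged = []
    for col in columns:
        types = Counter(type(r[col]) for r in table if r[col] is not None)
        dominant = types.most_common(1)[0][0] if types else None
        for i, r in enumerate(table):
            if r[col] is None or type(r[col]) is not dominant:
                flagged.append((i, col))
    return flagged


def build_signature(table, row, col, key_col):
    """Narrate the cell's context using the assumed dependency key_col -> col."""
    narration = f"the {col.lower()} of {table[row][key_col]}"
    return SemanticSignature(row=row, column=col, narration=narration)


def answer_query(signature, knowledge_base):
    """Stand-in for a QA system: look the normalized narration up in a dict."""
    return knowledge_base.get(signature.narration.lower())


def correct_table(table, columns, key_col, knowledge_base):
    """Return a corrected copy; cells with no candidate answer stay unchanged."""
    corrected = [dict(r) for r in table]
    for row, col in find_erroneous_cells(table, columns):
        candidate = answer_query(
            build_signature(table, row, col, key_col), knowledge_base
        )
        if candidate is not None:
            corrected[row][col] = candidate
    return corrected


# Hypothetical data: one missing cell and one cell of the wrong type.
TABLE = [
    {"Country": "France", "Capital": "Paris"},
    {"Country": "Japan", "Capital": None},
    {"Country": "Peru", "Capital": 42},
]
KB = {"the capital of japan": "Tokyo", "the capital of peru": "Lima"}

corrected = correct_table(TABLE, ["Capital"], "Country", KB)
for r in corrected:
    print(r["Country"], r["Capital"])
```

The sketch returns a corrected copy rather than mutating the input, loosely mirroring the claim's separate steps of generating and then storing the corrected document. A real system would replace the type check with pattern or dependency discovery and the dictionary with an actual question answering pipeline.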
