System and method for supplementing a question answering system with mixed-language source documents
First Claim
1. A computer implemented method in a data processing system comprising a processor and a memory comprising instructions, which are executed by the processor to cause the processor to determine language independent candidate answers, the method comprising:
receiving a natural language question;
parsing the question using natural language processing techniques to extract one or more facts from the question;
using a cognitive system to convert the one or more facts from the question to one or more acyclic graphs, wherein nodes in the one or more acyclic graphs represent facts and connectors in the one or more acyclic graphs represent connections between two or more facts;
using the cognitive system to analyze the one or more acyclic graphs to determine at least one knowledge domain of the question;
querying candidate answers having knowledge domains similar to the at least one knowledge domain of the question, wherein the candidate answers are queried from a mixed-language corpora of data;
applying deep analysis of the natural language question and each candidate answer, wherein the deep analysis uses a plurality of reasoning algorithms generating dependency scores;
training a statistical model employed by the cognitive system;
applying the trained statistical model to determine a weight for each dependency score;
applying the weight to each dependency score to generate weighted dependency scores;
processing the weighted dependency scores with the statistical model to generate one or more confidence scores for each candidate answer; and
outputting the candidate answer with the highest confidence score.
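The scoring steps at the end of the claim (train a statistical model, derive a weight per dependency score, apply the weights, produce a confidence score, output the top candidate) can be illustrated with a minimal sketch. The claim does not name a model, so logistic regression is assumed here purely for illustration; the function names, training data, and candidate answers are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_weights(examples, labels, epochs=500, lr=0.5):
    """Fit one weight per dependency score, plus a bias, by stochastic
    gradient descent on the logistic loss (an assumed model choice)."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for scores, label in zip(examples, labels):
            pred = sigmoid(sum(w * s for w, s in zip(weights, scores)) + bias)
            err = pred - label
            weights = [w - lr * err * s for w, s in zip(weights, scores)]
            bias -= lr * err
    return weights, bias


def confidence(scores, weights, bias):
    """Apply each weight to its dependency score, then map the weighted
    scores to a confidence value between 0 and 1."""
    weighted = [w * s for w, s in zip(weights, scores)]
    return sigmoid(sum(weighted) + bias)


def best_answer(candidates, weights, bias):
    """Return the candidate with the highest confidence and all confidences."""
    scored = {ans: confidence(s, weights, bias) for ans, s in candidates.items()}
    return max(scored, key=scored.get), scored


# Hypothetical training data: two dependency scores per (question, answer)
# pair, labelled 1 when the answer was correct and 0 otherwise.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
train_y = [1, 1, 0, 0]
weights, bias = train_weights(train_x, train_y)

# Hypothetical candidates, each with scores from two reasoning algorithms.
candidates = {"Paris": [0.85, 0.9], "Lyon": [0.3, 0.2]}
answer, scored = best_answer(candidates, weights, bias)
print(answer, scored)
```

In this sketch the learned weights play the role of the per-score weights recited in the claim, and the sigmoid stands in for the step that turns weighted dependency scores into a confidence score.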
Abstract
Embodiments can provide a computer implemented method, in a data processing system comprising a processor and a memory comprising instructions which are executed by the processor to cause the processor to implement a mixed-language question answering supplement system, the method comprising receiving a question in a target language; applying natural language processing to parse the question into at least one focus; for each focus, determining if one or more target language verbs share direct syntactic dependency with the focus; for each of the one or more verbs sharing direct syntactic dependency, determining if one or more target language entities share direct syntactic dependency with the verb; determining one or more Abstract Universal Verbal Types associated with each verb; for each of the one or more Abstract Universal Verbal Types, determining whether a dependency between a source language entity and a source language verb is of the same type as the dependency between the target language verb and the target language entity; if the dependency is similar, returning the source language entity as a member of a set; and if the set is full, returning an answer in the target language to the question in the target language.
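The matching procedure in the abstract can be read, quite literally, as the sketch below. Everything concrete in it is an assumption made for illustration: the `Dep` triple, the relation labels (`nsubj`, `obj`), the `AUVT` lookup table and its type names, and the English/Spanish example sentences. The specification may define Abstract Universal Verbal Types and the matching rule differently.

```python
from collections import namedtuple

# One syntactic dependency: a verb, a relation label, and an entity.
Dep = namedtuple("Dep", ["verb", "relation", "entity"])

# Hypothetical table mapping surface verbs in any language to an
# abstract verbal type shared across languages.
AUVT = {
    "wrote": "CREATE",
    "escribió": "CREATE",
    "nació": "ORIGINATE",
}


def match_source_entities(focus, target_deps, source_deps):
    """Collect source-language entities whose verb has the same abstract
    type, and whose dependency has the same relation, as a target-language
    dependency on a verb directly attached to the question focus."""
    # Verbs sharing a direct dependency with the focus.
    focus_verbs = {d.verb for d in target_deps if d.entity == focus}
    matches = set()
    for t in target_deps:
        # Only consider non-focus entities attached to those verbs.
        if t.verb not in focus_verbs or t.entity == focus:
            continue
        t_type = AUVT.get(t.verb)
        if t_type is None:
            continue
        for s in source_deps:
            if AUVT.get(s.verb) == t_type and s.relation == t.relation:
                matches.add(s.entity)
    return matches


# Target-language question: "Who wrote Don Quixote?" (focus: "Who")
target = [Dep("wrote", "nsubj", "Who"), Dep("wrote", "obj", "Don Quixote")]
# Source-language text: "Cervantes escribió Don Quijote. Cervantes nació ..."
source = [
    Dep("escribió", "nsubj", "Cervantes"),
    Dep("escribió", "obj", "Don Quijote"),
    Dep("nació", "nsubj", "Cervantes"),
]
matched = match_source_entities("Who", target, source)
print(matched)
```

The returned set holds the source-language entities that passed the dependency check; how the set is judged "full" and turned into a target-language answer is not stated in the abstract and is left out of the sketch.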
20 Claims
1. A computer implemented method in a data processing system comprising a processor and a memory comprising instructions, which are executed by the processor to cause the processor to determine language independent candidate answers, the method comprising:
receiving a natural language question;
parsing the question using natural language processing techniques to extract one or more facts from the question;
using a cognitive system to convert the one or more facts from the question to one or more acyclic graphs, wherein nodes in the one or more acyclic graphs represent facts and connectors in the one or more acyclic graphs represent connections between two or more facts;
using the cognitive system to analyze the one or more acyclic graphs to determine at least one knowledge domain of the question;
querying candidate answers having knowledge domains similar to the at least one knowledge domain of the question, wherein the candidate answers are queried from a mixed-language corpora of data;
applying deep analysis of the natural language question and each candidate answer, wherein the deep analysis uses a plurality of reasoning algorithms generating dependency scores;
training a statistical model employed by the cognitive system;
applying the trained statistical model to determine a weight for each dependency score;
applying the weight to each dependency score to generate weighted dependency scores;
processing the weighted dependency scores with the statistical model to generate one or more confidence scores for each candidate answer; and
outputting the candidate answer with the highest confidence score.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer program product for language independent question answering supplementation, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
receive a natural language question;
parse the question using natural language processing techniques to extract one or more facts from the question;
use a cognitive system to convert the one or more facts from the question to one or more acyclic graphs, wherein nodes in the one or more acyclic graphs represent facts and connectors in the one or more acyclic graphs represent connections between two or more facts;
use the cognitive system to analyze the one or more acyclic graphs to determine at least one knowledge domain of the question;
determine candidate answers having knowledge domains similar to the at least one knowledge domain of the question, wherein the candidate answers are determined independent of the language of the question;
apply deep analysis of the natural language question and each candidate answer to score the candidate answers according to the likelihood that the candidate answer is a correct answer for the question, wherein the deep analysis uses a plurality of reasoning algorithms generating dependency scores;
train a statistical model employed by the cognitive system;
apply the trained statistical model to determine a weight for each dependency score;
apply the weight to each dependency score to generate weighted dependency scores;
process the weighted dependency scores with the statistical model to generate one or more confidence scores for each candidate answer; and
output the candidate answer with the highest confidence score.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A language independent question answering supplementation system, comprising:
a language independent question answering supplementation processor configured to:
receive a natural language question;
parse the question using natural language processing techniques to extract one or more facts from the question;
use a cognitive system to convert the one or more facts from the question to one or more acyclic graphs, wherein nodes in the one or more acyclic graphs represent facts and connectors in the one or more acyclic graphs represent connections between two or more facts;
use the cognitive system to analyze the one or more acyclic graphs to determine at least one knowledge domain of the question;
determine candidate answers having knowledge domains similar to the at least one knowledge domain of the question, wherein the candidate answers are determined independent of the language of the question;
apply deep analysis of the natural language text of the question and each candidate answer to determine candidate answers in the natural language of the question, wherein the deep analysis uses a plurality of reasoning algorithms generating dependency scores;
train a statistical model employed by the cognitive system;
apply the trained statistical model to determine a weight for each dependency score;
apply the weight to each dependency score to generate weighted dependency scores;
process the weighted dependency scores with the statistical model to generate one or more confidence scores for each candidate answer; and
output the candidate answer with the highest confidence score.
Dependent claims: 20
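The graph-related steps shared by the independent claims (facts as nodes, connections as connectors, an acyclic structure, and a knowledge domain derived from it) might be sketched as follows. The claims do not say how the cognitive system builds the graph or picks a domain, so the edge-list input, the Kahn-style acyclicity check, and the keyword table `DOMAIN_KEYWORDS` are assumptions made only for illustration.

```python
from collections import defaultdict, deque


def build_fact_graph(connections):
    """Build a directed graph whose nodes are facts and whose edges are
    the connections between facts."""
    graph = defaultdict(set)
    for src, dst in connections:
        graph[src].add(dst)
        graph[dst]  # touch the key so every fact appears as a node
    return dict(graph)


def is_acyclic(graph):
    """Kahn's algorithm: the graph is acyclic if every node can be removed
    in topological order."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for t in targets:
            indegree[t] += 1
    queue = deque(n for n, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for t in graph[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    return visited == len(graph)


# Hypothetical keyword table standing in for the domain analysis step.
DOMAIN_KEYWORDS = {
    "medicine": {"aspirin", "fever", "dose"},
    "geography": {"capital", "river", "border"},
}


def guess_domain(graph):
    """Pick the domain whose keywords occur in the most fact nodes."""
    votes = {
        domain: sum(1 for fact in graph if any(k in fact.lower() for k in kws))
        for domain, kws in DOMAIN_KEYWORDS.items()
    }
    return max(votes, key=votes.get)


# Hypothetical facts extracted from a question about aspirin dosage.
facts = build_fact_graph([
    ("aspirin is a drug", "aspirin reduces fever"),
    ("aspirin reduces fever", "question asks for dose"),
])
print(is_acyclic(facts), guess_domain(facts))
```

A real system would presumably use a far richer domain model than keyword counting; the sketch only shows where a domain decision would plug in once the fact graph exists.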
Specification