Corpus Quality Analysis
Abstract
A mechanism is provided in a data processing system for corpus quality analysis. The mechanism applies at least one filter to a candidate corpus to determine a degree to which the candidate corpus supplements existing corpora for performing a natural language processing (NLP) operation. Responsive to a determination to add the candidate corpus to the existing corpora based on a result of applying the at least one filter, the mechanism adds the candidate corpus to the existing corpora to form modified corpora. The mechanism performs the NLP operation using the modified corpora.
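The flow described in the abstract (filter, decide, add, then run the NLP operation) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the names `novelty_filter`, `maybe_add_corpus`, and `run_nlp_operation`, the vocabulary-novelty score, the averaging of filter scores, and the `threshold` value are all assumptions introduced here, since the text above does not specify which filters are used or how the determination is made.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical model: a corpus is a list of document strings.
Corpus = List[str]
# A filter scores how much a candidate supplements the existing corpora (0.0 to 1.0).
Filter = Callable[[Corpus, List[Corpus]], float]


def _vocab(corpus: Corpus) -> Set[str]:
    """Collect the lowercase whitespace-separated tokens of a corpus."""
    return {tok for doc in corpus for tok in doc.lower().split()}


def novelty_filter(candidate: Corpus, existing: List[Corpus]) -> float:
    """Assumed example filter: fraction of candidate vocabulary not yet covered."""
    cand = _vocab(candidate)
    if not cand:
        return 0.0
    known: Set[str] = set()
    for corpus in existing:
        known |= _vocab(corpus)
    return len(cand - known) / len(cand)


def maybe_add_corpus(
    candidate: Corpus,
    existing: List[Corpus],
    filters: List[Filter],
    threshold: float = 0.5,  # assumed cutoff; not taken from the source
) -> Tuple[List[Corpus], bool]:
    """Apply at least one filter, then add the candidate if the mean score passes."""
    if not filters:
        raise ValueError("at least one filter is required")
    scores = [f(candidate, existing) for f in filters]
    if sum(scores) / len(scores) >= threshold:
        return existing + [candidate], True  # modified corpora
    return existing, False


def run_nlp_operation(question: str, corpora: List[Corpus]) -> List[str]:
    """Toy stand-in for the NLP operation: return documents sharing a token."""
    q = set(question.lower().split())
    return [d for c in corpora for d in c if q & set(d.lower().split())]


existing = [["the cat sat"]]
modified, added = maybe_add_corpus(["dogs bark loudly"], existing, [novelty_filter])
answers = run_nlp_operation("dogs", modified)
```

In this sketch a candidate whose vocabulary is already covered by the existing corpora scores low and is left out, while a candidate with mostly new vocabulary is appended before the operation runs. A real system would presumably combine several filters with domain-specific scoring.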
55 Citations
20 Claims
1. A method, in a data processing system, for corpus quality analysis, the method comprising:
applying at least one filter to a candidate corpus to determine a degree to which the candidate corpus supplements existing corpora for performing a natural language processing (NLP) operation;
responsive to a determination to add the candidate corpus to the existing corpora based on a result of applying the at least one filter, adding the candidate corpus to the existing corpora to form modified corpora; and
performing the NLP operation using the modified corpora.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a question answering system, causes the question answering system to:
apply at least one filter to a candidate corpus to determine a degree to which the candidate corpus supplements existing corpora for performing a natural language processing (NLP) operation;
responsive to a determination to add the candidate corpus to the existing corpora based on a result of applying the at least one filter, add the candidate corpus to the existing corpora to form modified corpora; and
perform the NLP operation using the modified corpora.
Dependent claims: 16, 17, 18, 19
20. An apparatus comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
apply at least one filter to a candidate corpus to determine a degree to which the candidate corpus supplements existing corpora for performing a natural language processing (NLP) operation;
responsive to a determination to add the candidate corpus to the existing corpora based on a result of applying the at least one filter, add the candidate corpus to the existing corpora to form modified corpora; and
perform the NLP operation using the modified corpora.
Specification