PROCESS FOR IDENTIFYING COMPLETION OF DOMAIN ADAPTATION DICTIONARY ACTIVITIES
Abstract
An apparatus comprising a memory and a processor configured for semi-autonomous natural language processing domain adaptation related activities. The processor is coupled to the memory and configured to identify a corpus of documents of an evaluation domain, generate a first lexicon based on the corpus of documents of the evaluation domain, and determine a threshold that indicates a sufficiency of domain adaptation of the evaluation domain based at least in part on the first lexicon. The processor is further configured to identify a corpus of documents of a client domain, generate a second lexicon based on the corpus of documents of the client domain, determine a metric associated with the corpus of documents of the client domain and the second lexicon, and determine that domain adaptation of the client domain is complete when the metric exceeds the threshold.
20 Claims
1. An apparatus comprising:
a memory; and
a processor coupled to the memory and configured to:
identify a corpus of documents of an evaluation domain;
generate a first lexicon based on the corpus of documents of the evaluation domain;
determine a threshold that indicates a sufficiency of domain adaptation of the evaluation domain based at least in part on the first lexicon;
identify a corpus of documents of a client domain;
generate a second lexicon based on the corpus of documents of the client domain;
determine a metric associated with the corpus of documents of the client domain and the second lexicon; and
determine that domain adaptation of the client domain is complete when the metric exceeds the threshold.
Dependent claims: 2, 3, 4, 5, 6, 7
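The flow recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how a lexicon is generated or which metric is used, so everything specific here is an assumption made for illustration: the lexicon is taken to be the set of terms occurring at least `min_count` times, the metric is the fraction of corpus tokens covered by the lexicon, and the threshold is that same coverage measured on the evaluation domain. The names `build_lexicon`, `coverage`, and `adaptation_complete` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_lexicon(corpus, min_count=2):
    """Assumed lexicon rule: keep terms seen at least min_count times."""
    counts = Counter(tok for doc in corpus for tok in tokenize(doc))
    return {term for term, n in counts.items() if n >= min_count}


def coverage(corpus, lexicon):
    """Assumed metric: fraction of corpus tokens found in the lexicon."""
    tokens = [tok for doc in corpus for tok in tokenize(doc)]
    if not tokens:
        return 0.0
    return sum(tok in lexicon for tok in tokens) / len(tokens)


def adaptation_complete(eval_corpus, client_corpus, min_count=2):
    """Compare the client-domain metric with an evaluation-domain threshold."""
    # First lexicon and threshold come from the evaluation domain.
    first_lexicon = build_lexicon(eval_corpus, min_count)
    threshold = coverage(eval_corpus, first_lexicon)
    # Second lexicon and metric come from the client domain.
    second_lexicon = build_lexicon(client_corpus, min_count)
    metric = coverage(client_corpus, second_lexicon)
    # Adaptation is treated as complete when the metric exceeds the threshold.
    return metric > threshold, metric, threshold


done, metric, threshold = adaptation_complete(
    ["alpha beta", "alpha gamma"],
    ["loan loan rate", "loan rate"],
)
print(done, metric, threshold)
```

Only the final comparison, metric against threshold, is taken directly from the claim; any other lexicon rule or metric could be substituted without changing the overall structure.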
8. A computer-implemented method comprising:
identifying, by a processor, a corpus of documents from within a domain;
determining, by the processor, an evaluation question for use with a question answering system to determine an answer to the evaluation question based on content of the domain;
partitioning the corpus of documents into a plurality of sub-corpora;
generating a lexicon for each of the respective sub-corpora;
generating a plurality of test systems each corresponding uniquely to one of the plurality of sub-corpora;
evaluating the evaluation question using the plurality of test systems to determine a plurality of evaluation results each corresponding uniquely to one of the plurality of test systems; and
determining a threshold for sufficiency of domain adaptation based on at least one of the evaluation results.
Dependent claims: 9, 10, 11, 12, 13, 14
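The method steps of claim 8 can be sketched as follows. This is an illustrative toy, not the claimed question answering system: the "test system" is assumed to be a function that scores a question by the share of its terms found in a sub-corpus lexicon, the partition is a simple round-robin split, and the threshold is assumed to be the mean of the evaluation results. The names `partition`, `make_test_system`, and `sufficiency_threshold` are hypothetical.

```python
import re
from statistics import mean


def tokenize(text):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def partition(corpus, k):
    """Split the corpus into k sub-corpora (round-robin, an assumption)."""
    return [corpus[i::k] for i in range(k)]


def build_lexicon(sub_corpus):
    """Assumed lexicon rule: every distinct term in the sub-corpus."""
    return {tok for doc in sub_corpus for tok in tokenize(doc)}


def make_test_system(lexicon):
    """Toy stand-in for a test system built on one sub-corpus lexicon."""
    def evaluate(question):
        terms = tokenize(question)
        if not terms:
            return 0.0
        # Score: share of question terms the lexicon recognizes.
        return sum(t in lexicon for t in terms) / len(terms)
    return evaluate


def sufficiency_threshold(corpus, question, k=2):
    """One lexicon and one test system per sub-corpus, then a threshold."""
    sub_corpora = partition(corpus, k)
    lexicons = [build_lexicon(sub) for sub in sub_corpora]
    systems = [make_test_system(lex) for lex in lexicons]
    results = [system(question) for system in systems]
    # Assumed aggregation: the threshold is the mean evaluation result.
    return mean(results), results


corpus = ["interest rate rises", "loan default risk", "rate cut", "credit risk"]
threshold, results = sufficiency_threshold(corpus, "interest rate risk", k=2)
print(threshold, results)
```

The claim only requires that the threshold be based on at least one evaluation result, so the mean is one of several possible aggregation choices; a minimum or a single selected result would fit the same outline.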
15. A computer program product for performing domain adaptation of a domain, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
identify a corpus of documents from within a client domain;
divide the corpus of documents into a plurality of sub-corpora;
extract at least one domain term from each of the plurality of sub-corpora, wherein domain terms extracted from one of the plurality of sub-corpora form a lexicon for that respective sub-corpora of the plurality of sub-corpora;
determine a metric having a relationship to the lexicon for that respective sub-corpora of the plurality of sub-corpora; and
determine, based at least in part on the metric, that sufficient domain adaptation of the client domain has been performed.
Dependent claims: 16, 17, 18, 19, 20
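One way to picture the per-sub-corpus metric of claim 15 is a saturation check, sketched below under stated assumptions. The claim does not define the metric; here it is assumed to be the share of terms in each sub-corpus lexicon that were not seen in earlier sub-corpora, with adaptation treated as sufficient once that share drops to a cutoff. The `GENERAL_WORDS` filter, the `max_novelty` cutoff, and all function names are hypothetical.

```python
import re

# Hypothetical filter: words treated as general language, not domain terms.
GENERAL_WORDS = {"the", "a", "of", "and", "is", "to", "in"}


def extract_domain_terms(sub_corpus):
    """Form a sub-corpus lexicon from its non-general terms."""
    return {
        tok
        for doc in sub_corpus
        for tok in re.findall(r"[a-z]+", doc.lower())
        if tok not in GENERAL_WORDS
    }


def divide(corpus, size):
    """Divide the corpus into consecutive sub-corpora of `size` documents."""
    return [corpus[i:i + size] for i in range(0, len(corpus), size)]


def novelty_per_sub_corpus(corpus, size):
    """Assumed metric: share of each lexicon's terms not seen before."""
    seen = set()
    rates = []
    for sub in divide(corpus, size):
        lexicon = extract_domain_terms(sub)
        new_terms = lexicon - seen
        rates.append(len(new_terms) / len(lexicon) if lexicon else 0.0)
        seen |= lexicon
    return rates


def adaptation_sufficient(corpus, size=2, max_novelty=0.25):
    """Sufficient when the last sub-corpus adds few new domain terms."""
    rates = novelty_per_sub_corpus(corpus, size)
    return bool(rates) and rates[-1] <= max_novelty


docs = ["the loan rate", "rate of the loan", "loan rate", "the rate"]
print(novelty_per_sub_corpus(docs, 2), adaptation_sufficient(docs))
```

The intuition behind this choice is that once additional documents stop contributing new domain terms, further dictionary work is unlikely to change the lexicon much; other metrics tied to the lexicon would satisfy the claim language equally well.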
Specification