Real-time or frequent ingestion by running pipeline in order of effectiveness
First Claim
1. A method, in a data processing system, for partial ingestion of content, the method comprising:
identifying a set of features that contribute to generating candidate answers for input questions;
identifying a set of annotation engines in the ingestion pipeline that contribute to each of the set of features and at least one annotation engine on which the one or more annotation engines depend;
generating a sub-pipeline for each set of annotation engines to form a plurality of sub-pipelines of annotation engines;
receiving new content to be ingested into a corpus of information;
applying the plurality of sub-pipelines of annotation engines against the new content in order of effectiveness, wherein the plurality of sub-pipelines include all annotation engines of an ingestion pipeline and wherein each sub-pipeline within the plurality of sub-pipelines generates one or more intermediate output objects; and
providing access to the one or more intermediate output objects, wherein the one or more intermediate output objects represent the partially ingested new content.
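The sub-pipeline generation step of the claim can be illustrated with a minimal sketch. It assumes each feature maps to the annotation engines that feed it and that engine dependencies are known up front. The engine names, feature names, and helper functions below are hypothetical and do not come from the patent text.

```python
from graphlib import TopologicalSorter

# Hypothetical example data (not from the patent): which annotation engines
# contribute to each feature, and which engines each engine depends on.
FEATURE_ENGINES = {
    "entity_type": ["entity_tagger"],
    "relation": ["relation_extractor"],
}
ENGINE_DEPS = {
    "tokenizer": [],
    "parser": ["tokenizer"],
    "entity_tagger": ["tokenizer"],
    "relation_extractor": ["parser", "entity_tagger"],
}


def build_sub_pipeline(engines, deps):
    """Return the given engines plus all transitive dependencies, dependencies first."""
    needed = set()
    stack = list(engines)
    while stack:
        engine = stack.pop()
        if engine not in needed:
            needed.add(engine)
            stack.extend(deps.get(engine, []))
    # TopologicalSorter expects {node: predecessors}; static_order() yields
    # every predecessor before the nodes that depend on it.
    graph = {engine: list(deps.get(engine, [])) for engine in sorted(needed)}
    return list(TopologicalSorter(graph).static_order())


def build_sub_pipelines(feature_engines, deps):
    """Build one sub-pipeline per feature."""
    return {
        feature: build_sub_pipeline(engines, deps)
        for feature, engines in feature_engines.items()
    }


sub_pipelines = build_sub_pipelines(FEATURE_ENGINES, ENGINE_DEPS)
```

In this toy data the union of the two sub-pipelines covers every engine in `ENGINE_DEPS`, mirroring the claim's requirement that the sub-pipelines together include all annotation engines of the ingestion pipeline.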
Abstract
A mechanism is provided in a data processing system for partial ingestion of content. The mechanism receives new content to be ingested into a corpus of information. The mechanism applies a plurality of sub-pipelines of annotation engines against the new content in order of effectiveness. The plurality of sub-pipelines include all annotation engines of an ingestion pipeline. Each sub-pipeline within the plurality of sub-pipelines generates one or more intermediate output objects. The mechanism provides access to the one or more intermediate output objects.
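As a rough illustration of the mechanism described in the abstract, the sketch below applies sub-pipelines in descending order of an effectiveness score and exposes an intermediate output object after each one completes. The effectiveness scores, the engine signature, and the choice to skip engines already run by an earlier sub-pipeline are all assumptions made for this example; the text above does not specify them.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Annotations = Dict[str, object]
# Hypothetical engine signature: takes the content and the annotations so far,
# returns the new annotations it produced.
Engine = Callable[[str, Annotations], Annotations]


def apply_in_order(
    content: str,
    sub_pipelines: Dict[str, List[Tuple[str, Engine]]],
    effectiveness: Dict[str, float],
) -> Iterator[Tuple[str, Annotations]]:
    """Yield (feature, intermediate output) after each sub-pipeline finishes."""
    annotations: Annotations = {}
    already_run = set()
    ordered = sorted(sub_pipelines, key=lambda f: effectiveness[f], reverse=True)
    for feature in ordered:
        for name, engine in sub_pipelines[feature]:
            if name in already_run:
                continue  # assumption: shared engines run only once
            annotations.update(engine(content, annotations))
            already_run.add(name)
        # A snapshot of the annotations so far stands in for the
        # intermediate output object of the partially ingested content.
        yield feature, dict(annotations)


# Toy engines, purely illustrative.
def tokenize(content, annotations):
    return {"tokens": content.split()}


def count_tokens(content, annotations):
    return {"token_count": len(annotations["tokens"])}


def find_caps(content, annotations):
    return {"caps": [t for t in annotations["tokens"] if t.isupper()]}


SUB_PIPELINES = {
    "length": [("tokenize", tokenize), ("count_tokens", count_tokens)],
    "acronyms": [("tokenize", tokenize), ("find_caps", find_caps)],
}
EFFECTIVENESS = {"length": 0.2, "acronyms": 0.9}

results = list(apply_in_order("IBM built Watson", SUB_PIPELINES, EFFECTIVENESS))
```

Because `apply_in_order` is a generator, a caller can read the first intermediate output before the remaining sub-pipelines have run, which is one way to model access to partially ingested content.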
20 Claims
1. A method, in a data processing system, for partial ingestion of content, the method comprising:
identifying a set of features that contribute to generating candidate answers for input questions;
identifying a set of annotation engines in the ingestion pipeline that contribute to each of the set of features and at least one annotation engine on which the one or more annotation engines depend;
generating a sub-pipeline for each set of annotation engines to form a plurality of sub-pipelines of annotation engines;
receiving new content to be ingested into a corpus of information;
applying the plurality of sub-pipelines of annotation engines against the new content in order of effectiveness, wherein the plurality of sub-pipelines include all annotation engines of an ingestion pipeline and wherein each sub-pipeline within the plurality of sub-pipelines generates one or more intermediate output objects; and
providing access to the one or more intermediate output objects, wherein the one or more intermediate output objects represent the partially ingested new content.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

12. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
identify a set of features that contribute to generating candidate answers for input questions;
identify a set of annotation engines in the ingestion pipeline that contribute to each of the set of features and at least one annotation engine on which the one or more annotation engines depend;
generate a sub-pipeline for each set of annotation engines to form a plurality of sub-pipelines of annotation engines;
receive new content to be ingested into a corpus of information;
apply a plurality of sub-pipelines of annotation engines against the new content in order of effectiveness, wherein the plurality of sub-pipelines include all annotation engines of an ingestion pipeline and wherein each sub-pipeline within the plurality of sub-pipelines generates one or more intermediate output objects; and
provide access to the one or more intermediate output objects, wherein the one or more intermediate output objects represent the partially ingested new content.
Dependent claims: 13, 14, 15, 16

17. An apparatus comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
identify a set of features that contribute to generating candidate answers for input questions;
identify a set of annotation engines in the ingestion pipeline that contribute to each of the set of features and at least one annotation engine on which the one or more annotation engines depend;
generate a sub-pipeline for each set of annotation engines to form a plurality of sub-pipelines of annotation engines;
receive new content to be ingested into a corpus of information;
apply a plurality of sub-pipelines of annotation engines against the new content in order of effectiveness, wherein the plurality of sub-pipelines include all annotation engines of an ingestion pipeline and wherein each sub-pipeline within the plurality of sub-pipelines generates one or more intermediate output objects; and
provide access to the one or more intermediate output objects, wherein the one or more intermediate output objects represent the partially ingested new content.
Dependent claims: 18, 19, 20

Specification