Data pipeline architecture for cloud processing of structured and unstructured data
First Claim
1. A system comprising:
a data stream processing pipeline comprising sequential multiple processing stages, including:
an integration stage comprising a connector interface configurable to connect multiple incoming data streams from multiple different data stream input sources to multiple different ingestion processors in an ingestion stage; and
a storage stage comprising a memory hierarchy with multiple different memory bandwidths, the memory hierarchy comprising:
a database repository configured to support a batch target processing bandwidth of the multiple different memory bandwidths;
an in-memory data store configured to support an in-memory target processing bandwidth of the multiple different memory bandwidths, the in-memory target processing bandwidth being faster than the batch target processing bandwidth; and
pipeline configuration circuitry operable to:
read a pipeline configuration for data stream processing through the data stream processing pipeline; and
configure the multiple processing stages according to the pipeline configuration to accept and process an individual incoming data stream of the multiple incoming data streams for delivery to the memory hierarchy according to a selected target processing bandwidth of the multiple different memory bandwidths, the selected target processing bandwidth defined for the individual incoming data stream.
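The storage-tier selection in claim 1 can be sketched in a few lines. This is a hypothetical illustration, not the patent's implementation: the names `StreamConfig`, `select_store`, and the bandwidth figures are all assumptions introduced here.

```python
# Hypothetical sketch of claim 1's pipeline configuration circuitry routing a
# stream to the memory hierarchy by its selected target processing bandwidth.
from dataclasses import dataclass

# Two storage tiers of the memory hierarchy, keyed by the processing
# bandwidth they support (records/second; illustrative figures).
STORAGE_TIERS = {
    "database_repository": 10_000,      # batch target processing bandwidth
    "in_memory_data_store": 1_000_000,  # faster in-memory target bandwidth
}

@dataclass
class StreamConfig:
    source: str
    ingestion_processor: str
    target_bandwidth: int  # selected target bandwidth for this stream

def select_store(stream: StreamConfig) -> str:
    """Pick the cheapest (slowest) tier that still meets the target bandwidth."""
    eligible = {name: bw for name, bw in STORAGE_TIERS.items()
                if bw >= stream.target_bandwidth}
    if not eligible:
        raise ValueError("no storage tier supports the requested bandwidth")
    return min(eligible, key=eligible.get)

fast = StreamConfig("sensor-feed", "json-ingest", 500_000)
slow = StreamConfig("nightly-export", "csv-ingest", 5_000)
print(select_store(fast))  # in_memory_data_store
print(select_store(slow))  # database_repository
```

The design point the claim turns on is that the bandwidth target is defined per individual stream in the pipeline configuration, so the same pipeline can serve batch and near-real-time consumers side by side.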
Abstract
Scalable architectures provide resiliency and redundancy and are suitable for cloud deployment. The architectures support extreme data throughput requirements. In one implementation, the architectures provide a serving layer and an extremely high speed processing lane. With these and other features, the architectures support complex analytics, visualization, rule engines, and centralized pipeline configuration.
20 Claims
1. A system comprising:
a data stream processing pipeline comprising sequential multiple processing stages, including:
an integration stage comprising a connector interface configurable to connect multiple incoming data streams from multiple different data stream input sources to multiple different ingestion processors in an ingestion stage; and
a storage stage comprising a memory hierarchy with multiple different memory bandwidths, the memory hierarchy comprising:
a database repository configured to support a batch target processing bandwidth of the multiple different memory bandwidths;
an in-memory data store configured to support an in-memory target processing bandwidth of the multiple different memory bandwidths, the in-memory target processing bandwidth being faster than the batch target processing bandwidth; and
pipeline configuration circuitry operable to:
read a pipeline configuration for data stream processing through the data stream processing pipeline; and
configure the multiple processing stages according to the pipeline configuration to accept and process an individual incoming data stream of the multiple incoming data streams for delivery to the memory hierarchy according to a selected target processing bandwidth of the multiple different memory bandwidths, the selected target processing bandwidth defined for the individual incoming data stream.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
9. A method comprising:
in a data processing architecture:
reading a pipeline configuration for data stream processing through a multiple stage data stream processing pipeline including an integration stage and an ingestion stage;
configuring, responsive to the pipeline configuration, the integration stage to accept and process multiple incoming data streams from multiple endpoints;
at the ingestion stage, ingesting the multiple incoming streams to create multiple ingested data streams;
routing, responsive to the pipeline configuration, a selected ingested data stream of the multiple ingested data streams from the ingestion stage to a memory hierarchy defining data storage options for multiple different target processing bandwidths, the memory hierarchy comprising:
a database repository configured to support a batch target processing bandwidth of the multiple different target processing bandwidths;
an in-memory data store configured to support an in-memory target processing bandwidth of the multiple different target processing bandwidths, the in-memory target processing bandwidth being faster than the batch target processing bandwidth; and
selecting from among the data storage options responsive to a selected target processing bandwidth defined for the selected ingested data stream.
(Dependent claims: 10, 11, 12, 13, 14)
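The method of claim 9 is a sequence: read the configuration, accept streams at the integration stage, ingest them, then route each ingested stream to a storage option. A minimal sketch, assuming a dict-based configuration; the function and key names (`run_pipeline`, `"streams"`, `"endpoint"`) are illustrative and not defined by the patent.

```python
# Illustrative walk-through of the claim 9 method steps, in order:
# configure integration -> ingest -> route by target bandwidth.

def ingest(raw: str) -> str:
    """Ingestion stage: turn an incoming record into an ingested record."""
    return raw.strip().lower()

def run_pipeline(pipeline_config: dict, incoming: dict) -> dict:
    routed = {"in_memory_data_store": [], "database_repository": []}
    for stream_name, stream_cfg in pipeline_config["streams"].items():
        # Integration stage: accept the stream from its configured endpoint.
        records = incoming[stream_cfg["endpoint"]]
        # Ingestion stage: create the ingested data stream.
        ingested = [ingest(r) for r in records]
        # Routing: select the storage option for this stream's target bandwidth.
        store = ("in_memory_data_store"
                 if stream_cfg["target_bandwidth"] == "in_memory"
                 else "database_repository")
        routed[store].extend(ingested)
    return routed

config = {"streams": {
    "clicks": {"endpoint": "web", "target_bandwidth": "in_memory"},
    "invoices": {"endpoint": "erp", "target_bandwidth": "batch"},
}}
data = {"web": ["  Click-A ", "Click-B"], "erp": ["INV-1"]}
print(run_pipeline(config, data))
```

Note that every routing decision is driven by the configuration rather than by the stream contents, which matches the claim's "responsive to the pipeline configuration" language.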
15. A system comprising:
a data stream processing pipeline comprising sequential multiple processing stages, including:
an integration stage comprising a connector interface configurable to connect multiple different data stream input sources to multiple different ingestion processors;
an ingestion stage comprising the multiple different ingestion processors;
a storage stage comprising a memory hierarchy defining data stores with multiple different memory bandwidths, the memory hierarchy comprising:
a database repository configured to support a batch target processing bandwidth of the multiple different memory bandwidths;
an in-memory data store configured to support an in-memory target processing bandwidth of the multiple different memory bandwidths, the in-memory target processing bandwidth being faster than the batch target processing bandwidth;
a data stream processing stage comprising multiple different data processors;
a visualization stage comprising multiple different visualization processors; and
a service stage operable to connect the data stream processing stage to the visualization stage; and
pipeline configuration circuitry operable to:
read a pipeline configuration file for configuring data stream processing through the data stream processing pipeline; and
configure the multiple processing stages to accept and process an incoming data stream for delivery to the memory hierarchy according to a selected target processing bandwidth defined for the incoming data stream, through the data stream processing stage, and to the visualization stage.
(Dependent claims: 16, 17, 18, 19, 20)
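Claim 15 extends the pipeline past storage through processing, a service stage, and visualization. The stage chain can be sketched as below; the stage names follow the claim, but the callable-chain design and all function bodies are assumptions made for illustration.

```python
# Illustrative end-to-end stage chain of claim 15:
# integration -> ingestion -> storage -> processing -> service -> visualization.

def integration(record):  # connector interface accepting the incoming stream
    return {"raw": record}

def ingestion(item):      # one of the multiple ingestion processors
    item["ingested"] = item["raw"].upper()
    return item

def storage(item):        # storage stage: tag the chosen memory-hierarchy tier
    item["store"] = "in_memory_data_store"
    return item

def processing(item):     # data stream processing stage (one data processor)
    item["length"] = len(item["ingested"])
    return item

def service(item):        # service stage: expose processing results to visualization
    return {k: item[k] for k in ("ingested", "length")}

def visualization(item):  # visualization stage: render the served result
    return f"{item['ingested']} ({item['length']} chars)"

PIPELINE = [integration, ingestion, storage, processing, service, visualization]

def run(record):
    for stage in PIPELINE:
        record = stage(record)
    return record

print(run("hello"))  # HELLO (5 chars)
```

The service stage acts as the boundary between computation and presentation, which is why the claim names it separately rather than letting visualization read the processing stage directly.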
Specification