Reassembling streaming data across multiple packetized communication channels
Abstract
Mechanisms are provided for processing streaming data at high sustained data rates. These mechanisms receive a plurality of data elements over a plurality of non-sequential communication channels and write the plurality of data elements directly to the file system of the data processing system in an unassembled manner. The mechanisms further perform a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system and assemble the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements. In addition, the mechanisms release the assembled plurality of data streams for access via the file system.
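The flow described in the abstract (write unassembled, scrub, assemble, release) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, purely for illustration, that every data element carries a channel identifier, a sequence number, and the total element count for its channel, and that "release" means renaming a staging file into place. The function names, the JSON-lines log, and the file names are all hypothetical.

```python
import json
import os
import tempfile
from collections import defaultdict


def write_unassembled(log_path, elements):
    """Append elements in arrival order, with no validity or completeness checks."""
    with open(log_path, "a", encoding="utf-8") as log:
        for element in elements:
            log.write(json.dumps(element) + "\n")


def scrub(log_path):
    """Scan the log; return ({channel: {seq: payload}}, {channel: [missing seqs]})."""
    seen = defaultdict(dict)
    expected = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            element = json.loads(line)
            seen[element["channel"]][element["seq"]] = element["payload"]
            expected[element["channel"]] = element["total"]  # assumed field
    missing = {}
    for channel, total in expected.items():
        gaps = [seq for seq in range(total) if seq not in seen[channel]]
        if gaps:
            missing[channel] = gaps
    return seen, missing


def assemble_and_release(seen, out_dir):
    """Write one ordered stream per channel, then rename it into its final place."""
    released = {}
    for channel, parts in seen.items():
        staging = os.path.join(out_dir, f".{channel}.partial")
        final = os.path.join(out_dir, f"{channel}.stream")
        with open(staging, "w", encoding="utf-8") as out:
            for seq in sorted(parts):
                out.write(parts[seq])
        os.replace(staging, final)  # "release": the stream appears under its final name
        released[channel] = final
    return released


def process(elements, work_dir):
    """Run the whole flow; assemble only if scrubbing finds nothing missing."""
    log_path = os.path.join(work_dir, "unassembled.log")
    write_unassembled(log_path, elements)
    seen, missing = scrub(log_path)
    if missing:
        return None, missing
    return assemble_and_release(seen, work_dir), {}


if __name__ == "__main__":
    # Elements from channels "a" and "b" arrive intermixed and out of order.
    arrivals = [
        {"channel": "a", "seq": 1, "total": 2, "payload": "lo"},
        {"channel": "b", "seq": 0, "total": 1, "payload": "xyz"},
        {"channel": "a", "seq": 0, "total": 2, "payload": "hel"},
    ]
    with tempfile.TemporaryDirectory() as work_dir:
        released, missing = process(arrivals, work_dir)
        print(sorted(released), missing)
```

In this sketch all ordering and completeness work is deferred until after the write, so the ingest path is a plain append. That is one plausible reading of how writing without prior checks could support high sustained data rates.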
20 Claims
1. A method, in a data processing system comprising at least one processor, for processing streaming data at high sustained data rates, comprising:
receiving, in the data processing system, a plurality of data elements over a plurality of non-sequential communication channels;
writing the plurality of data elements directly to a file system of a storage system associated with the data processing system in an unassembled manner such that data elements received over at least one of the non-sequential communication channels are written in an intermixed manner with data elements received over at least one other non-sequential communication channel, in the file system of the storage system, wherein the data elements are written directly to the file system without validity or completeness checks being performed prior to writing the data elements directly to the file system;
performing, by the data processing system, a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system of the storage system;
assembling, by the data processing system, the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements; and
releasing, by the data processing system, the assembled plurality of data streams for access via the file system of the storage system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method, in a data processing system comprising at least one processor, for processing streaming data at high sustained data rates, comprising:
receiving, in the data processing system, a plurality of data elements over a plurality of non-sequential communication channels from a plurality of processors;
writing the plurality of data elements directly to a file system of a storage system associated with the data processing system in an unassembled manner;
performing, by the data processing system, a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system of the storage system;
assembling, by the data processing system, the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements; and
releasing, by the data processing system, the assembled plurality of data streams for access via the file system of the storage system,
wherein the plurality of processors are part of a distributed computing cluster operating on message passing interface (MPI) jobs, and wherein the plurality of data elements are transmitted by the plurality of processors in response to the processors encountering an MPI barrier operation.
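The barrier-triggered transmission recited in claim 10 can be pictured with a small simulation. The sketch below does not use MPI. As an assumption made only for illustration, it substitutes Python's `threading.Barrier` for an MPI barrier and a `queue.Queue` for the communication channels, and the worker function, its fields, and the placeholder computation are hypothetical.

```python
import queue
import threading


def run_workers(num_workers, steps):
    """Simulate workers that each transmit one data element on reaching a barrier."""
    channel = queue.Queue()                    # stand-in for the communication channels
    barrier = threading.Barrier(num_workers)   # stand-in for an MPI barrier

    def worker(rank):
        state = 0
        for step in range(steps):
            state += rank + step               # placeholder for real computation
            # Reaching the barrier is the trigger for sending the current state.
            channel.put({"rank": rank, "step": step, "state": state})
            barrier.wait()                     # nobody starts step+1 until all arrive

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    received = []
    while not channel.empty():
        received.append(channel.get())
    return received


if __name__ == "__main__":
    for element in run_workers(num_workers=3, steps=2):
        print(element)
```

Within one step the elements from different workers arrive in no fixed order, which loosely mirrors the intermixed arrival that the receiving side must later scrub and assemble.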
11. A computer program product comprising a computer recordable device having a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device comprising at least one processor, causes the computing device to:
receive, by the computing device, a plurality of data elements over a plurality of non-sequential communication channels;
write, by the computing device, the plurality of data elements directly to a file system of the data processing system in an unassembled manner such that data elements received over at least one of the non-sequential communication channels are written in an intermixed manner with data elements received over at least one other non-sequential communication channel, in the file system, wherein the data elements are written directly to the file system without validity or completeness checks being performed prior to writing the data elements directly to the file system;
perform, by the computing device, a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system;
assemble, by the computing device, the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements; and
release, by the computing device, the assembled plurality of data streams for access via the file system.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
19. A computer program product comprising a computer recordable device having a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device comprising at least one processor, causes the computing device to:
receive, by the computing device, a plurality of data elements over a plurality of non-sequential communication channels from a plurality of processors;
write, by the computing device, the plurality of data elements directly to a file system of the data processing system in an unassembled manner;
perform, by the computing device, a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system;
assemble, by the computing device, the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements; and
release, by the computing device, the assembled plurality of data streams for access via the file system,
wherein the plurality of processors are part of a distributed computing cluster operating on message passing interface (MPI) jobs, and wherein the plurality of data elements are transmitted by the plurality of processors in response to the processors encountering an MPI barrier operation.
20. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
receive a plurality of data elements over a plurality of non-sequential communication channels;
write the plurality of data elements directly to a file system of the data processing system in an unassembled manner such that data elements received over at least one of the non-sequential communication channels are written in an intermixed manner with data elements received over at least one other non-sequential communication channel, in the file system, wherein the data elements are written directly to the file system without validity or completeness checks being performed prior to writing the data elements directly to the file system;
perform a data scrubbing operation to determine if there are any missing data elements that are not present in the plurality of data elements written to the file system;
assemble the plurality of data elements into a plurality of data streams associated with the plurality of non-sequential communication channels in response to results of the data scrubbing indicating that there are no missing data elements; and
release the assembled plurality of data streams for access via the file system.
Specification