Distributed data processing platform for metagenomic monitoring and characterization
First Claim
1. A method comprising:
- configuring a first processing node for communication with one or more additional processing nodes and with one or more of a plurality of geographically-distributed metagenomics sequencing centers via one or more networks;
- processing metagenomics sequencing results obtained from one or more of the metagenomics sequencing centers in the first processing node; and
- providing surveillance functionality relating to at least one designated biological issue on behalf of one or more requesting clients based at least in part on the processing of metagenomics sequencing results performed by the first processing node and related processing performed by one or more of the additional processing nodes;
- wherein each of the metagenomics sequencing centers is configured to perform metagenomics sequencing on biological samples from respective sample sources in a corresponding data zone;
- wherein processing the metagenomics sequencing results further comprises generating a hit abundance score vector for a given one of the biological samples wherein the hit abundance score vector comprises a plurality of entries corresponding to respective occurrence frequencies of at least one read of the given biological sample in respective target genomic sequences;
- wherein providing surveillance functionality further comprises:
  - performing a preprocessing operation to reduce a biclustering sample space of a genomic comparison component;
  - generating a hit abundance score matrix for the genomic comparison component comprising a plurality of the hit abundance score vectors wherein one of rows and columns of the hit abundance score matrix correspond to respective different ones of the biological samples and the other of the rows and columns of the hit abundance score matrix correspond to respective different ones of the target genomic sequences; and
  - performing a biclustering operation on the hit abundance score matrix; and
- wherein the method is implemented by at least one processing device comprising a processor coupled to a memory.
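The hit abundance score vector, the hit abundance score matrix, the preprocessing operation, and the biclustering operation recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: reads are matched to targets by exact substring search (a real pipeline would likely use an aligner), the occurrence counts of all reads in a sample are summed per target, the preprocessing stand-in simply drops targets that no sample hits, and the biclustering stand-in groups samples that hit the same set of targets. All function names are hypothetical.

```python
from collections import defaultdict

def count_occurrences(read, sequence):
    """Count (possibly overlapping) exact occurrences of read in sequence."""
    count, start = 0, sequence.find(read)
    while start != -1:
        count += 1
        start = sequence.find(read, start + 1)
    return count

def hit_abundance_vector(reads, targets):
    """One entry per target: summed occurrences of the sample's reads (assumed scoring)."""
    return [sum(count_occurrences(r, seq) for r in reads) for seq in targets.values()]

def hit_abundance_matrix(samples, targets):
    """Rows correspond to samples, columns to target genomic sequences."""
    return {name: hit_abundance_vector(reads, targets) for name, reads in samples.items()}

def drop_empty_columns(matrix, target_names):
    """Preprocessing stand-in: remove targets that no sample hits."""
    keep = [j for j in range(len(target_names))
            if any(row[j] for row in matrix.values())]
    reduced = {s: [row[j] for j in keep] for s, row in matrix.items()}
    return reduced, [target_names[j] for j in keep]

def naive_biclusters(matrix, target_names):
    """Biclustering stand-in: group samples sharing the same set of hit targets."""
    groups = defaultdict(list)
    for sample, row in matrix.items():
        pattern = tuple(t for t, v in zip(target_names, row) if v > 0)
        if pattern:
            groups[pattern].append(sample)
    return [(sorted(members), list(pattern)) for pattern, members in groups.items()]

# Toy data (invented for this example).
targets = {"T1": "ACGTACGT", "T2": "TTTTGGGG", "T3": "CCCCCCCC"}
samples = {"s1": ["ACG"], "s2": ["CGT", "ACG"], "s3": ["TTG"]}

matrix = hit_abundance_matrix(samples, targets)
reduced, names = drop_empty_columns(matrix, list(targets))
biclusters = naive_biclusters(reduced, names)
print(biclusters)  # [(['s1', 's2'], ['T1']), (['s3'], ['T2'])]
```

In this toy run, target T3 receives no hits and is removed before biclustering, which shrinks the space the grouping step has to search. Samples s1 and s2 end up in one bicluster because both hit only T1.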
9 Assignments
0 Petitions
Abstract
A method comprises configuring a first processing node for communication with one or more additional processing nodes and with one or more of a plurality of geographically-distributed metagenomics sequencing centers via one or more networks, processing metagenomics sequencing results obtained from one or more of the metagenomics sequencing centers in the first processing node, and providing surveillance functionality relating to at least one designated biological issue on behalf of one or more requesting clients based at least in part on the processing of metagenomics sequencing results performed by the first processing node and related processing performed by one or more of the additional processing nodes. Each of the metagenomics sequencing centers is configured to perform metagenomics sequencing on biological samples from respective sample sources in a corresponding data zone.
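The node arrangement described in the abstract can be pictured with a small Python sketch: a first node processes results from its own data zone and answers a surveillance query by combining its local result with related results from additional nodes. This is only an illustration under stated assumptions. The `ProcessingNode` class, its methods, the zone names, and the representation of sequencing results as `(target_name, hit_count)` pairs are all hypothetical, and network communication is replaced by direct in-process calls.

```python
from collections import Counter

class ProcessingNode:
    """Hypothetical node handling sequencing results from one data zone."""

    def __init__(self, zone):
        self.zone = zone
        self.peers = []              # additional processing nodes
        self.local_hits = Counter()  # per-target hit counts for this zone

    def connect(self, peer):
        """Stand-in for configuring communication with another node."""
        self.peers.append(peer)

    def process(self, sequencing_results):
        """Accumulate (target_name, hit_count) pairs from a sequencing center."""
        for target, hits in sequencing_results:
            self.local_hits[target] += hits

    def summary(self):
        """Return a copy of the local summary; raw results stay in the zone."""
        return Counter(self.local_hits)

    def surveil(self, target):
        """Combine local and peer summaries for one target of interest."""
        per_zone = {self.zone: self.local_hits[target]}
        for peer in self.peers:
            per_zone[peer.zone] = peer.summary()[target]
        return {"target": target,
                "total": sum(per_zone.values()),
                "per_zone": per_zone}

# Toy usage (zones and counts invented for this example).
first = ProcessingNode("zone-a")
other = ProcessingNode("zone-b")
first.connect(other)
first.process([("T1", 3), ("T2", 1)])
other.process([("T1", 2)])
report = first.surveil("T1")
print(report)  # {'target': 'T1', 'total': 5, 'per_zone': {'zone-a': 3, 'zone-b': 2}}
```

The design choice shown here is that only per-target summaries cross zone boundaries, which is one plausible reading of processing results "in a corresponding data zone"; the source does not specify what is exchanged between nodes.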
87 Citations
20 Claims
1. A method comprising:
- configuring a first processing node for communication with one or more additional processing nodes and with one or more of a plurality of geographically-distributed metagenomics sequencing centers via one or more networks;
- processing metagenomics sequencing results obtained from one or more of the metagenomics sequencing centers in the first processing node; and
- providing surveillance functionality relating to at least one designated biological issue on behalf of one or more requesting clients based at least in part on the processing of metagenomics sequencing results performed by the first processing node and related processing performed by one or more of the additional processing nodes;
- wherein each of the metagenomics sequencing centers is configured to perform metagenomics sequencing on biological samples from respective sample sources in a corresponding data zone;
- wherein processing the metagenomics sequencing results further comprises generating a hit abundance score vector for a given one of the biological samples wherein the hit abundance score vector comprises a plurality of entries corresponding to respective occurrence frequencies of at least one read of the given biological sample in respective target genomic sequences;
- wherein providing surveillance functionality further comprises:
  - performing a preprocessing operation to reduce a biclustering sample space of a genomic comparison component;
  - generating a hit abundance score matrix for the genomic comparison component comprising a plurality of the hit abundance score vectors wherein one of rows and columns of the hit abundance score matrix correspond to respective different ones of the biological samples and the other of the rows and columns of the hit abundance score matrix correspond to respective different ones of the target genomic sequences; and
  - performing a biclustering operation on the hit abundance score matrix; and
- wherein the method is implemented by at least one processing device comprising a processor coupled to a memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer program product comprising a non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes said at least one processing device:
- to configure a first processing node for communication with one or more additional processing nodes and with one or more of a plurality of geographically-distributed metagenomics sequencing centers via one or more networks;
- to process metagenomics sequencing results obtained from one or more of the metagenomics sequencing centers in the first processing node; and
- to provide surveillance functionality relating to at least one designated biological issue on behalf of one or more requesting clients based at least in part on the processing of metagenomics sequencing results performed by the first processing node and related processing performed by one or more of the additional processing nodes;
- wherein each of the metagenomics sequencing centers is configured to perform metagenomics sequencing on biological samples from respective sample sources in a corresponding data zone;
- wherein processing the metagenomics sequencing results further comprises generating a hit abundance score vector for a given one of the biological samples wherein the hit abundance score vector comprises a plurality of entries corresponding to respective occurrence frequencies of at least one read of the given biological sample in respective target genomic sequences; and
- wherein providing surveillance functionality further comprises:
  - performing a preprocessing operation to reduce a biclustering sample space of a genomic comparison component;
  - generating a hit abundance score matrix for the genomic comparison component comprising a plurality of the hit abundance score vectors wherein one of rows and columns of the hit abundance score matrix correspond to respective different ones of the biological samples and the other of the rows and columns of the hit abundance score matrix correspond to respective different ones of the target genomic sequences; and
  - performing a biclustering operation on the hit abundance score matrix.
Dependent claims: 13, 14, 15
16. An apparatus comprising:
- a first processing node configured for communication with one or more additional processing nodes and with one or more of a plurality of geographically-distributed metagenomics sequencing centers via one or more networks;
- the first processing node being further configured:
  - to process metagenomics sequencing results obtained from one or more of the metagenomics sequencing centers; and
  - to provide surveillance functionality relating to at least one designated biological issue on behalf of one or more requesting clients based at least in part on the processing of metagenomics sequencing results performed by the first processing node and related processing performed by one or more of the additional processing nodes;
- wherein each of the metagenomics sequencing centers is configured to perform metagenomics sequencing on biological samples from respective sample sources in a corresponding data zone; and
- wherein the first processing node is implemented using at least one processing device comprising a processor coupled to a memory;
- wherein processing the metagenomics sequencing results further comprises generating a hit abundance score vector for a given one of the biological samples wherein the hit abundance score vector comprises a plurality of entries corresponding to respective occurrence frequencies of at least one read of the given biological sample in respective target genomic sequences; and
- wherein providing the surveillance functionality further comprises:
  - performing a preprocessing operation to reduce a biclustering sample space of a genomic comparison component;
  - generating a hit abundance score matrix for the genomic comparison component comprising a plurality of the hit abundance score vectors wherein one of rows and columns of the hit abundance score matrix correspond to respective different ones of the biological samples and the other of the rows and columns of the hit abundance score matrix correspond to respective different ones of the target genomic sequences; and
  - performing a biclustering operation on the hit abundance score matrix.
Dependent claims: 17, 18, 19, 20
Specification