Distributed data analytics
Abstract
An apparatus in one embodiment comprises a distributed data processing system in which multiple processing devices communicate with one another over at least one network. The distributed data processing system is configured to obtain reads of biological samples of respective microbiomes, with each of the biological samples containing genomic material from a plurality of distinct microorganisms of its corresponding one of the microbiomes, and to perform distributed data analytics to detect a disease, infection or contamination that involves genomic material from multiple ones of the distinct microorganisms in one or more of the microbiomes. Performing distributed data analytics illustratively comprises performing local analytics in respective ones of a plurality of data zones, and performing global analytics utilizing results of the local analytics performed in the respective data zones. Each of the data zones may comprise, for example, one or more sequencing centers utilized to generate a corresponding subset of the reads within that data zone.
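The two-stage flow described in the abstract (local analytics per data zone, then global analytics over the local results) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `DataZone` class, the marker sequences, and the choice of simple substring counting as the "local analytics" are assumptions made only to show the shape of the computation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DataZone:
    """Hypothetical data zone holding the reads its sequencing centers produced."""
    name: str
    reads: list[str]


def local_analytics(zone: DataZone, markers: set[str]) -> Counter:
    """Count, within one zone, how many reads contain each marker sequence."""
    hits: Counter = Counter()
    for read in zone.reads:
        for marker in markers:
            if marker in read:
                hits[marker] += 1
    return hits


def global_analytics(local_results: dict[str, Counter]) -> Counter:
    """Combine the per-zone summaries; in this sketch only summaries are passed on."""
    total: Counter = Counter()
    for counts in local_results.values():
        total.update(counts)
    return total


# Illustrative data only (assumed, not from the source).
zones = [
    DataZone("zone_a", ["ACGTTGCA", "GGGTTTAA"]),
    DataZone("zone_b", ["TTGCAACG", "CCCCCCCC"]),
]
markers = {"TTGCA", "GGGTT"}

local_results = {zone.name: local_analytics(zone, markers) for zone in zones}
summary = global_analytics(local_results)
print(dict(summary))
```

In this sketch the global step sees only the small per-zone counters, which is one way to read "utilizing results of the local analytics"; a real system would run the local step on separate processing devices and exchange results over a network.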
20 Claims
1. A method comprising:
obtaining reads of biological samples of respective microbiomes wherein each of the biological samples contains genomic material from a plurality of distinct microorganisms of its corresponding one of the microbiomes; and
performing distributed data analytics to detect a disease, infection or contamination that involves genomic material from multiple ones of the distinct microorganisms in one or more of the microbiomes;
wherein performing distributed data analytics comprises:
performing local analytics in respective ones of a plurality of data zones; and
performing global analytics utilizing results of the local analytics performed in the respective data zones;
wherein each of the data zones comprises one or more sequencing centers utilized to generate a corresponding subset of the reads within that data zone;
wherein the local analytics performed in a given one of the data zones utilize reads of one or more of the biological samples sequenced in the one or more sequencing centers of the given data zone;
wherein the local analytics performed in the given data zone comprise analyzing the reads of the one or more biological samples against a local set of known virulence factors; and
wherein the method is implemented by a distributed data processing system comprising a plurality of processing devices configured to communicate with one another over at least one network.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer program product comprising a non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by a distributed data processing system comprising a plurality of processing devices configured to communicate with one another over at least one network causes said distributed data processing system:
to obtain reads of biological samples of respective microbiomes wherein each of the biological samples contains genomic material from a plurality of distinct microorganisms of its corresponding one of the microbiomes; and
to perform distributed data analytics to detect a disease, infection or contamination that involves genomic material from multiple ones of the distinct microorganisms in one or more of the microbiomes;
wherein performing distributed data analytics comprises:
performing local analytics in respective ones of a plurality of data zones; and
performing global analytics utilizing results of the local analytics performed in the respective data zones;
wherein each of the data zones comprises one or more sequencing centers utilized to generate a corresponding subset of the reads within that data zone;
wherein the local analytics performed in a given one of the data zones utilize reads of one or more of the biological samples sequenced in the one or more sequencing centers of the given data zone; and
wherein the local analytics performed in the given data zone comprise analyzing the reads of the one or more biological samples against a local set of known virulence factors.
Dependent claims: 13
14. An apparatus comprising:
a distributed data processing system comprising a plurality of processing devices configured to communicate with one another over at least one network;
wherein said distributed data processing system is configured:
to obtain reads of biological samples of respective microbiomes wherein each of the biological samples contains genomic material from a plurality of distinct microorganisms of its corresponding one of the microbiomes; and
to perform distributed data analytics to detect a disease, infection or contamination that involves genomic material from multiple ones of the distinct microorganisms in one or more of the microbiomes;
wherein performing distributed data analytics comprises:
performing local analytics in respective ones of a plurality of data zones; and
performing global analytics utilizing results of the local analytics performed in the respective data zones;
wherein each of the data zones comprises one or more sequencing centers utilized to generate a corresponding subset of the reads within that data zone;
wherein the local analytics performed in a given one of the data zones utilize reads of one or more of the biological samples sequenced in the one or more sequencing centers of the given data zone; and
wherein the local analytics performed in the given data zone comprise analyzing the reads of the one or more biological samples against a local set of known virulence factors.
Dependent claims: 15, 16, 17, 18, 19, 20
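Each independent claim recites analyzing a sample's reads against a local set of known virulence factors. The claims do not say how that comparison is done, so the following is only one possible sketch, assuming k-mer overlap as the matching technique. The function names, the factor names and sequences, `k`, and the `min_coverage` threshold are all hypothetical.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def match_virulence_factors(
    reads: list[str],
    factors: dict[str, str],
    k: int = 4,
    min_coverage: float = 0.8,
) -> dict[str, float]:
    """Report factors whose k-mers are mostly present in the pooled reads.

    Coverage is the fraction of a factor's distinct k-mers that appear in
    at least one read of the sample (an assumed, simplified criterion).
    """
    read_kmers: set[str] = set()
    for read in reads:
        read_kmers |= kmers(read, k)

    found: dict[str, float] = {}
    for name, seq in factors.items():
        factor_kmers = kmers(seq, k)
        if not factor_kmers:
            continue  # factor sequence is shorter than k
        coverage = len(factor_kmers & read_kmers) / len(factor_kmers)
        if coverage >= min_coverage:
            found[name] = coverage
    return found


# Illustrative local factor set and sample reads (assumed, not from the source).
local_factors = {"vf_toxin": "ACGTACGGA", "vf_adhesin": "TTTTGGGGCC"}
sample_reads = ["GGACGTACG", "GTACGGATT", "CCCCAAAA"]

hits = match_virulence_factors(sample_reads, local_factors)
print(hits)
```

Here the two overlapping reads together cover every k-mer of the hypothetical `vf_toxin` sequence, while `vf_adhesin` is not covered at all, so only the former is reported. A per-zone result of this kind is the sort of local output that the global analytics step could then combine across data zones.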
Specification