Ultra-sensitive and Universal Species Detection Pipeline for Next Generation Sequencing Data
First Claim
1. An experimental pipeline for pan-domain species nucleotide extraction and next generation sequencing library preparation, comprising:
(a) preparing DNA/RNA genome libraries from collected metagenome samples at a single bioparticle sensitivity, wherein a radiation-based reagent-decontamination protocol is implemented to remove any traces of amplifiable nucleotides from reagents, wherein a sterilized hydrophobic or hydrophilic filter membrane is used to collect nano-gram-level amount of airborne biological particulates at the minimum in between 12 to 72 hours, wherein, with an amount range above 1 ng of nucleotides of airborne biological samples, extracting DNA and RNA concurrently from the collected samples of any nature by a chemistry/physics-based pipeline at a single bacterial particle sensitivity, wherein a non-PCR-based linear amplification technique is applied for DNA and RNA after converted to complementary DNA, and wherein, with limited RNAs, depleting most ribosome RNA with selective primers for preferential enrichment of non-ribosome RNA sequences; and
(b) using cocktail enzyme to fragmentize DNA/cDNA generated to minimize the loss of nucleotide materials.
Abstract
A presumption-free pipeline is provided that employs experimental and analytic modules to profile samples, including clinical samples, regardless of their complexity and abundance. The experimental and analytic modules can work independently of each other if the user so desires. The experimental module for ultra-sensitive DNA/RNA extraction and sequencing can be used to extract information from any sample to feed into analytical pipelines chosen by the user. Alternatively, the analytic module for universal species detection can be fed with data generated by other experimental pipelines and different sequencing platforms.
2 Claims
1. An experimental pipeline for pan-domain species nucleotide extraction and next generation sequencing library preparation, comprising:
(a) preparing DNA/RNA genome libraries from collected metagenome samples at a single bioparticle sensitivity, wherein a radiation-based reagent-decontamination protocol is implemented to remove any traces of amplifiable nucleotides from reagents, wherein a sterilized hydrophobic or hydrophilic filter membrane is used to collect nano-gram-level amount of airborne biological particulates at the minimum in between 12 to 72 hours, wherein, with an amount range above 1 ng of nucleotides of airborne biological samples, extracting DNA and RNA concurrently from the collected samples of any nature by a chemistry/physics-based pipeline at a single bacterial particle sensitivity, wherein a non-PCR-based linear amplification technique is applied for DNA and RNA after converted to complementary DNA, and wherein, with limited RNAs, depleting most ribosome RNA with selective primers for preferential enrichment of non-ribosome RNA sequences; and
(b) using cocktail enzyme to fragmentize DNA/cDNA generated to minimize the loss of nucleotide materials.
2. A pan-domain species detection bioinformatics pipeline for next generation sequencing data, comprising:
(a) building a non-redundant genome sequence database incorporating publicly available genome sequences and/or select organisms of interest;
(b) building a taxonomy database to store taxonomy information for every sequence in the database;
(c) creating de-duplicated sequencing reads in fastq format by removing exact paired-duplicated reads from raw sequencing reads;
(d) trimming the de-duplicated sequencing reads;
(e) mapping the trimmed de-duplicated sequencing reads to a human reference genome therewith creating trimmed de-duplicated non-human reads in fastq format;
(f) creating assembled contigs from the trimmed de-duplicated non-human reads in fastq format by a rigorous reads assembly method with specific parameters to minimize chimeric assembly;
(g) classifying the assembled contigs by searching against the non-redundant genome sequence database; and
(h) outputting the classified results;
wherein the pipeline steps (a)-(h) are executed by a computer or one or more computer processors.
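As an illustration of step (c) only, the following is a minimal Python sketch of removing exact paired-duplicated reads. The claim does not say how duplicates are identified, so the sketch assumes that a pair is a duplicate when the sequences of both mates match an earlier pair exactly, and that quality strings and read names are ignored. The function names `read_fastq` and `dedup_pairs` and the sample reads are hypothetical and are not taken from the patent.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        block = list(islice(handle, 4))
        if not block:
            return
        if len(block) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, _plus, qual = (line.rstrip("\n") for line in block)
        yield header, seq, qual


def dedup_pairs(r1_handle, r2_handle):
    """Keep the first occurrence of each (mate 1 sequence, mate 2 sequence) pair.

    Assumption: "exact paired-duplicated" means both mate sequences are
    identical to those of an earlier pair; qualities and names are ignored.
    """
    seen = set()
    kept = []
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = (rec1[1], rec2[1])
        if key in seen:
            continue  # exact duplicate of an earlier pair: drop it
        seen.add(key)
        kept.append((rec1, rec2))
    return kept


# Hypothetical input: pair "b" duplicates pair "a"; pair "c" differs in mate 2.
r1 = io.StringIO("@a/1\nACGT\n+\nIIII\n@b/1\nACGT\n+\nIIII\n@c/1\nACGT\n+\nIIII\n")
r2 = io.StringIO("@a/2\nTTGA\n+\nIIII\n@b/2\nTTGA\n+\nIIII\n@c/2\nGGCC\n+\nIIII\n")
pairs = dedup_pairs(r1, r2)
for mate1, mate2 in pairs:
    print(mate1[0], mate2[0])
```

Keying on the sequences of both mates means that a pair sharing only one mate with an earlier pair is retained, which is one reading of "paired" in the claim. Holding every key in memory is practical for small inputs; very large read sets would likely need hashing or external sorting instead.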
Specification