SYSTEMS AND METHODS FOR HYBRID ASSEMBLY OF NUCLEIC ACID SEQUENCES
Abstract
Systems and methods for assembling a nucleic acid sequence are disclosed. A plurality of single fragment sequence reads and a plurality of paired fragment sequence reads are received. Each paired fragment sequence read comprises at least two sequence reads separated by an insert. Single fragment sequence reads are assembled into a plurality of contigs, and the paired fragment sequence reads are mapped to the contigs. Further, gap regions comprising a portion of the partially assembled nucleic acid sequence for which the single fragment sequence reads do not map are identified, and hanging pairwise sequence reads of the mapped paired fragment sequence reads are used to fill in the gap region.
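The gap-filling step summarized in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the function names `find_hanging_mates` and `extend_with_overlaps`, the exact-substring "mapping", the greedy suffix/prefix extension, and the `min_overlap` parameter are all assumptions made for illustration. Both mates are also assumed to be given in the same orientation as the contigs, which real paired reads generally are not.

```python
def find_hanging_mates(pairs, contigs):
    """Return mates that do not map to any contig while their partner does.

    'Mapping' here is a plain exact-substring test (an illustrative assumption).
    """
    def maps(read):
        return any(read in contig for contig in contigs)

    hanging = []
    for left, right in pairs:
        if maps(left) and not maps(right):
            hanging.append(right)
        elif maps(right) and not maps(left):
            hanging.append(left)
    return hanging


def extend_with_overlaps(seed, reads, min_overlap=3):
    """Greedily extend `seed` to the right with reads whose prefix
    matches a suffix of `seed` by at least `min_overlap` bases."""
    remaining = list(reads)
    extended = True
    while extended:
        extended = False
        for read in remaining:
            # Try the longest overlap first; the read must add at least one base.
            for k in range(min(len(seed), len(read) - 1), min_overlap - 1, -1):
                if seed.endswith(read[:k]):
                    seed += read[k:]
                    remaining.remove(read)
                    extended = True
                    break
            if extended:
                break  # restart the scan with the longer seed
    return seed


# Toy data: two contigs separated by a 6-base gap ("CTTGAC") that no
# single fragment read covered.
contig_a = "ATGCCGTAAG"
contig_b = "CAGTTCGAGG"

# Paired reads as (mate_1, mate_2). In the first two pairs, mate_1 maps to
# contig_a while mate_2 hangs into the gap; in the third, both mates map.
pairs = [
    ("CCGTA", "TAAGCTTG"),
    ("GTAAG", "CTTGACCAG"),
    ("ATGCC", "GTTCG"),
]

hanging = find_hanging_mates(pairs, [contig_a, contig_b])
# Walk from the end of contig_a across the gap, then join contig_b.
bridged = extend_with_overlaps(contig_a, hanging)
filled = extend_with_overlaps(bridged, [contig_b])
print(filled)
```

In this toy case the two hanging mates bridge the gap, so `filled` contains both contigs joined by the recovered gap sequence. A realistic implementation would also have to handle read orientation, insert-size constraints, sequencing errors, and repeats, none of which this sketch attempts.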
20 Claims
1. A computer implemented method for assembling a nucleic acid sequence, comprising:
receiving, into a memory, a plurality of single fragment sequence reads and a plurality of paired fragment sequence reads, each paired fragment sequence read comprising at least two sequence reads separated by an insert;
assembling the single fragment sequence reads into a plurality of contigs;
mapping the paired fragment sequence reads to the contigs;
identifying a gap region comprising a portion of the partially assembled nucleic acid sequence for which the single fragment sequence reads do not map, and
utilizing hanging pairwise sequence reads of the mapped paired fragment sequence reads to fill in the gap region using a processor.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for assembling a nucleic acid sequence, comprising:
a computing device, including:
a contig assembly engine configured to assemble single fragment sequence reads into one or more contigs;
a mapping engine configured to map a plurality of paired fragment sequence reads to the assembled contigs, each paired fragment sequence read comprising at least two sequence reads separated by an insert;
a scaffolding engine configured to form a sequence scaffold from the mapped paired fragment sequence reads and contigs; and
a gap-filling engine configured to utilize hanging pairwise sequences of the mapped paired fragment sequence reads to fill in gap regions in the sequence scaffold.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A non-transitory computer readable media having a computer readable program code embodied therein, the computer readable program code adapted to be executed by a processor to implement a method for annotating called variants in a sample genome, comprising:
receiving a plurality of single fragment sequence reads and a plurality of paired fragment sequence reads, each paired fragment sequence read comprising at least two sequence reads separated by an insert;
assembling the single fragment sequence reads into a plurality of contigs;
mapping the paired fragment sequence reads to the contigs;
identifying a gap region comprising a portion of the partially assembled nucleic acid sequence for which the single fragment sequence reads do not map; and
utilizing hanging pairwise sequence of the mapped paired fragment sequence reads to fill in the gap region.
Dependent claims: 17, 18, 19, 20
Specification