Methods for accurate sequence data and modified base position determination
Abstract
Disclosed herein are methods of determining the sequence and/or positions of modified bases in a nucleic acid sample present in a circular molecule with a nucleic acid insert of known sequence comprising obtaining sequence data of at least two insert-sample units. In some embodiments, the methods comprise obtaining sequence data using circular pair-locked molecules. In some embodiments, the methods comprise calculating scores of sequences of the nucleic acid inserts by comparing the sequences to the known sequence of the nucleic acid insert, and accepting or rejecting repeats of the sequence of the nucleic acid sample according to the scores of one or both of the sequences of the inserts immediately upstream or downstream of the repeats of the sequence of the nucleic acid sample.
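The scoring and accept/reject steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The similarity measure (the ratio from difflib's SequenceMatcher), the 0.9 threshold, the function names, and the input layout (a list of insert reads with one sample repeat between each adjacent pair) are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def insert_score(observed: str, known: str) -> float:
    """Similarity in [0, 1] between an observed insert read and the known insert.

    Assumption: a generic string-similarity ratio stands in for whatever
    scoring scheme an actual system would use.
    """
    return SequenceMatcher(None, observed, known, autojunk=False).ratio()


def accept_repeats(inserts, samples, known, threshold=0.9, require_both=True):
    """Accept or reject each sample repeat based on its flanking insert scores.

    Hypothetical layout: samples[i] lies between inserts[i] (upstream) and
    inserts[i + 1] (downstream), so there is one more insert than samples.
    require_both=True demands that both flanking inserts pass the threshold;
    require_both=False accepts a repeat if either flanking insert passes.
    """
    if len(inserts) != len(samples) + 1:
        raise ValueError("expected exactly one more insert read than sample repeats")
    scores = [insert_score(read, known) for read in inserts]
    accepted = []
    for i, sample in enumerate(samples):
        flanks_pass = (scores[i] >= threshold, scores[i + 1] >= threshold)
        keep = all(flanks_pass) if require_both else any(flanks_pass)
        if keep:
            accepted.append(sample)
    return accepted


# Hypothetical data: the third insert read is badly corrupted.
KNOWN = "ACGTTGCAAGCT"
INSERTS = [KNOWN, KNOWN, "TTTTTTTTTTTT", KNOWN]
SAMPLES = ["GATTACA", "GATTACA", "GATCACA"]

strict = accept_repeats(INSERTS, SAMPLES, KNOWN, require_both=True)
lenient = accept_repeats(INSERTS, SAMPLES, KNOWN, require_both=False)
```

With require_both=True, only the repeat whose two flanking inserts both match the known sequence is kept. With require_both=False, a repeat is kept when at least one flanking insert passes, which corresponds to the "one or both" wording of the abstract.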
2 Claims
1. A system comprising a sequencing apparatus operably linked to a computing apparatus comprising a processor, non-transitory computer readable storage medium, bus system, and at least one user interface element, the non-transitory computer-readable storage medium being encoded with programming comprising an operating system, user interface software, and instructions that, when executed by the processor, optionally with user input, perform a method comprising:
a. obtaining sequence data from a circular nucleic acid molecule comprising at least one insert-sample unit comprising a nucleic acid insert and a nucleic acid sample, wherein:
(i) the insert has a known sequence,
(ii) the sequence data comprise sequences from at least two insert-sample units, including at least two repeats of the sequence of the nucleic acid sample, and
(iii) a nucleic acid molecule is produced that comprises at least two insert-sample units;
b. calculating scores of the sequences of at least two inserts of the sequence data of step (a) by comparing the sequences to the known sequence of the insert;
c. accepting or rejecting at least two of the repeats of the sequence of the nucleic acid sample of the sequence data of step (a) according to the scores of one or both of the sequences of the inserts immediately upstream and downstream of the repeat of the sequence of the nucleic acid sample;
d. compiling an accepted sequence set comprising at least one repeat of the sequence of the nucleic acid sample accepted in step (c); and
e. determining the sequence of the nucleic acid sample using the accepted sequence set,
wherein an output of the system is used to produce at least one of (i) a sequence of a nucleic acid sample or (ii) an indication that there is a modified base in at least one position in a nucleic acid sample.
2. A non-transitory computer readable storage medium encoded with programming comprising an operating system, user interface software, and instructions that, when executed by the processor on a system comprising a sequencing apparatus operably linked to a computing apparatus comprising a processor, non-transitory computer readable storage medium, bus system, and at least one user interface element, optionally with user input, perform a method comprising:
a. obtaining sequence data from a circular nucleic acid molecule comprising at least one insert-sample unit comprising a nucleic acid insert and a nucleic acid sample, wherein:
(i) the insert has a known sequence,
(ii) the sequence data comprise sequences from at least two insert-sample units, including at least two repeats of the sequence of the nucleic acid sample, and
(iii) a nucleic acid molecule is produced that comprises at least two insert-sample units;
b. calculating scores of the sequences of at least two inserts of the sequence data of step (a) by comparing the sequences to the known sequence of the insert;
c. accepting or rejecting at least two of the repeats of the sequence of the nucleic acid sample of the sequence data of step (a) according to the scores of one or both of the sequences of the inserts immediately upstream and downstream of the repeat of the sequence of the nucleic acid sample;
d. compiling an accepted sequence set comprising at least one repeat of the sequence of the nucleic acid sample accepted in step (c); and
e. determining the sequence of the nucleic acid sample using the accepted sequence set,
wherein the method results in output used to produce at least one of (i) a sequence of a nucleic acid sample or (ii) an indication that there is a modified base in at least one position in a nucleic acid sample.
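Steps (d) and (e) of the claims, compiling an accepted sequence set and determining the sample sequence from it, can be illustrated with a short Python sketch. The claims do not say how the sequence is determined from the accepted set, so the per-position majority vote here is an assumption, as are the function name and the requirement that the accepted repeats are already aligned to equal length.

```python
from collections import Counter


def consensus_sequence(accepted_set):
    """Derive one sequence from accepted repeats by per-position majority vote.

    Assumptions: the repeats are pre-aligned and of equal length (real reads
    would need an alignment step first), and ties are broken in favor of the
    base seen first at that position.
    """
    if not accepted_set:
        raise ValueError("the accepted sequence set is empty")
    length = len(accepted_set[0])
    if any(len(repeat) != length for repeat in accepted_set):
        raise ValueError("repeats must be aligned to the same length")
    # zip(*accepted_set) yields one column of bases per sequence position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*accepted_set)
    )


# Hypothetical accepted set: one repeat carries a single read error.
ACCEPTED = ["GATTACA", "GATTACA", "GATCACA"]
RESULT = consensus_sequence(ACCEPTED)
```

In this example the two agreeing repeats outvote the single discordant base at the fourth position. An accepted set holding only one repeat, which step (d) permits, is returned unchanged.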
Specification