Fragmentation-based methods and systems for sequence variation detection and discovery
First Claim
1. A method for determining the sequence of one or more sequence variations in a target nucleic acid relative to a reference sequence, comprising:
(a) generating mass signals for target nucleic acid fragments by mass spectrometry, wherein the target nucleic acid fragments result from a specific cleavage reaction of the target nucleic acid;
(b) generating or simulating mass signals for reference fragments, wherein the reference fragments result from cleavage or simulated cleavage of the reference sequence using the same specific cleavage reaction in (a);
(c) identifying mass signals in the target nucleic acid fragment spectrum that are different relative to the reference fragment spectrum, thereby identifying different target nucleic acid fragments;
(d) generating one or more compomer witnesses corresponding to each different target nucleic acid fragment identified in (c);
(e) selecting, from the set of all possible subsequences of the reference sequence, a subset of subsequences having at most k cleavage points for the specific cleavage reaction, wherein k is user-defined;
(f) generating for each compomer witness in (d) all possible sequence variations of one or more subsequences in the subset selected in (e) that would lead to the compomer witness, thereby identifying a reduced set of candidate sequence variations; and
(g) scoring the candidate sequence variations identified in (f) to determine the sequence of the one or more sequence variations in the target nucleic acid.
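Steps (a) through (c) can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the claim: the specific cleavage reaction cuts directly after every occurrence of one base, cleavage is complete, and each nucleotide contributes a fixed placeholder integer mass. The names `MASS`, `cleave`, `mass_signals`, and `different_signals` are hypothetical, and the target spectrum is simulated here rather than measured.

```python
# Placeholder integer masses per nucleotide (assumed for illustration only;
# a real analysis would use masses appropriate to the cleavage chemistry).
MASS = {"A": 329, "C": 305, "G": 345, "T": 306}


def cleave(seq, cut_base):
    """Simulate complete cleavage directly after every occurrence of cut_base."""
    fragments, start = [], 0
    for pos, base in enumerate(seq):
        if base == cut_base:
            fragments.append(seq[start:pos + 1])
            start = pos + 1
    if start < len(seq):
        # Trailing fragment that does not end in the cut base.
        fragments.append(seq[start:])
    return fragments


def mass_signals(seq, cut_base):
    """Return the set of fragment masses expected for seq (steps a and b)."""
    return {sum(MASS[b] for b in frag) for frag in cleave(seq, cut_base)}


def different_signals(target, reference, cut_base):
    """Return (additional, missing) signals of the target relative to the reference (step c)."""
    target_masses = mass_signals(target, cut_base)
    reference_masses = mass_signals(reference, cut_base)
    additional = sorted(target_masses - reference_masses)
    missing = sorted(reference_masses - target_masses)
    return additional, missing


additional, missing = different_signals("ACGTCAGCT", "ACGTTAGCT", "T")
print(additional, missing)
```

In this toy example the target differs from the reference by a single T-to-C substitution, which removes one cleavage site, so one reference signal disappears and one heavier signal appears.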
Abstract
Fragmentation-based methods and systems, particularly mass spectrometric methods and systems, for the analysis of sequence variations are provided.
62 Claims
1. A method for determining the sequence of one or more sequence variations in a target nucleic acid relative to a reference sequence (recited in full under First Claim above).
Dependent claims: 2-36.
37. A method for detecting a sequence variation in a target nucleic acid, comprising:
(a) generating mass signals for target nucleic acid fragments and reference nucleic acid fragments by mass spectrometry, wherein the target nucleic acid fragments and the reference nucleic acid fragments result from cleavage of the target nucleic acid and reference nucleic acid by two or more specific cleavage reactions;
(b) identifying, for at least two of the two or more specific cleavage reactions, mass signals in the target nucleic acid fragment spectrum that are different relative to the reference fragment spectrum, thereby identifying different target nucleic acid fragments;
(c) identifying different target nucleic acid fragments in each of the at least two specific cleavage reactions that are consistent with the sequence variation in the target nucleic acid, thereby identifying consistent different fragments;
(d) combining the consistent different fragments of (c) to obtain set of consistent different fragments;
(e) generating for each of the consistent different fragments of (d) one or more compomer witnesses;
(f) determining a reduced set of sequence variation candidates corresponding to the compomer witnesses; and
(g) scoring the sequence variation candidates of (f) to determine the presence or absence of the sequence variation in the target.
Dependent claims: 38-44.
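The compomer-witness step in (e) can be sketched as follows. The sketch assumes, which this claim text does not itself state, that a compomer witness is a base composition whose total mass lies within a tolerance of an observed mass signal. The placeholder integer masses and the names `MASS` and `compomer_witnesses` are hypothetical.

```python
# Placeholder integer masses per nucleotide (assumed for illustration only).
MASS = {"A": 329, "C": 305, "G": 345, "T": 306}


def compomer_witnesses(peak_mass, tolerance):
    """Enumerate base compositions whose mass is within tolerance of peak_mass."""
    bases = sorted(MASS)
    witnesses = []

    def extend(idx, counts, mass):
        if idx == len(bases):
            # Keep non-empty compositions that match the observed mass.
            if any(counts) and abs(mass - peak_mass) <= tolerance:
                witnesses.append(dict(zip(bases, counts)))
            return
        unit = MASS[bases[idx]]
        n = 0
        # Try every count of the current base that does not overshoot the peak.
        while mass + n * unit <= peak_mass + tolerance:
            extend(idx + 1, counts + [n], mass + n * unit)
            n += 1

    extend(0, [], 0)
    return witnesses


print(compomer_witnesses(1590, 0))
```

Widening the tolerance generally yields more witnesses for the same signal, which is why the later steps of the claim reduce and score the resulting candidates.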
45. A method for detecting one or more sequence variations in a target nucleic acid, comprising:
(a) providing reference sequence s for the target nucleic acid sequence, a description of cleavage reaction conditions, and maximal sequence variation order k;
(b) determining for reference sequence s all subsequences s[i,j] in set Ck, wherein s[i,j] represents a subsequence of reference sequence s beginning at position i and ending at position j, wherein Ck is described by Ck = {(c[i,j], b[i,j]) : 1 ≤ i ≤ j ≤ length of s, and ord[i,j] + #b[i,j] ≤ k}, wherein c[i,j] represents the compomer corresponding to s[i,j], wherein b[i,j] represents the boundary corresponding to s[i,j], wherein ord[i,j] is the number of times s[i,j] is cleaved under the cleavage reaction conditions, wherein #b[i,j] is the value of b[i,j], wherein: #b[i,j]=2 if s is neither cleaved directly before i nor after j, #b[i,j]=1 if s is cleaved either directly before i or directly after j, but s is not cleaved directly before i and directly after j, and #b[i,j]=0 if s is cleaved directly before i and directly after j;
(c) generating mass signals for target nucleic acid fragments by mass spectrometry, wherein the target nucleic acid fragments result from the cleavage reaction conditions in (a);
(d) generating or simulating mass signals for reference sequence fragments, wherein the reference sequence fragments result from reference sequence s using the cleavage reaction conditions in (a);
(e) identifying mass signals in the target nucleic acid fragment spectrum that are different relative to the reference fragment spectrum, thereby identifying different target nucleic acid fragments;
(f) generating one or more compomer witnesses c′ corresponding to each different target nucleic acid fragment identified in (e);
(g) for every compomer witness c′, identifying all c[i,j] in Ck such that D(c′,c,b) ≤ k, wherein D(c′,c,b) is the minimum number of nucleotide insertions, deletions, substitutions or modifications relative to the reference sequence needed to generate the compomer witness c′ from c[i,j];
(h) for every compomer c[i,j] identified in (g), determining all sequence variations using at most k-#b insertions, deletions, substitutions or modifications that transform c into c′, thereby identifying a reduced set of candidate sequence variations; and
(i) scoring the reduced set of candidate sequence variations to detect the one or more sequence variations in the target nucleic acid from the reduced set of candidate sequence variations.
Dependent claims: 46-52.
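The set Ck of step (b) can be computed directly from its definition. The Python sketch below is one possible reading and makes assumptions the claim does not state: cleavage occurs directly after every occurrence of a single cut base, and the two ends of s are treated as cleaved boundaries. The name `bounded_order_subsequences` is hypothetical.

```python
from collections import Counter


def bounded_order_subsequences(s, cut_base, k):
    """Return (i, j, compomer, nb) for every s[i,j] with ord[i,j] + #b[i,j] <= k.

    Positions are 1-based and inclusive, as in the claim.
    """
    n = len(s)
    # cut_after[p] is True when s is cleaved directly after position p.
    # Index 0 and index n model the sequence ends (an assumption).
    cut_after = [True] + [base == cut_base for base in s]
    cut_after[n] = True

    selected = []
    for i in range(1, n + 1):
        order = 0  # ord[i,j]: cleavage points strictly inside s[i,j]
        for j in range(i, n + 1):
            if j > i and cut_after[j - 1]:
                order += 1
            if order > k:
                break  # ord only grows as j increases
            # #b[i,j]: 0 if cleaved on both sides, 1 if on one side, 2 if on neither.
            nb = 2 - int(cut_after[i - 1]) - int(cut_after[j])
            if order + nb <= k:
                selected.append((i, j, Counter(s[i - 1:j]), nb))
    return selected


for i, j, compomer, nb in bounded_order_subsequences("ACGTTAGCT", "T", 0):
    print(i, j, dict(compomer), nb)
```

Under these assumptions, k = 0 selects exactly the fragments of complete cleavage, and larger k admits subsequences that span cleavage sites or start or end away from one.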
53. A method of determining a reduced set of sequence variation candidates in a target nucleic acid relative to a reference sequence, comprising:
(a) generating mass signals for target nucleic acid fragments by mass spectrometry, wherein the target nucleic acid fragments result from a specific cleavage reaction of the target nucleic acid;
(b) generating or simulating mass signals for reference fragments, wherein the reference fragments result from cleavage or simulated cleavage of the reference sequence using the same specific cleavage reaction in (a);
(c) identifying mass signals in the target nucleic acid fragment spectrum that are different relative to the reference fragment spectrum, thereby identifying different target nucleic acid fragments;
(d) generating one or more compomer witnesses corresponding to each different target nucleic acid fragment identified in (c);
(e) selecting, from the set of all possible subsequences of the reference sequence, a subset of subsequences having at most k cleavage points for the specific cleavage reaction; and
(f) generating for each compomer witness in (d) all possible sequence variations of one or more subsequences in the subset selected in (e) that would lead to the compomer witness, thereby identifying a reduced set of candidate sequence variations in the target nucleic acid.
Dependent claims: 54-62.
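Step (f) can be illustrated for the simplest case of a single edit. The Python sketch below enumerates single-base substitutions, deletions, and insertions of one reference subsequence whose resulting base composition equals a given compomer witness. It is a simplified sketch, not the claimed procedure: it handles only one edit per subsequence, assumes cleavage directly after a single cut base, and keeps an edited subsequence only if it would remain one uncleaved fragment. The name `single_edit_candidates` and the tuple layout are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def single_edit_candidates(subseq, start, witness, cut_base):
    """List single edits of subseq whose composition equals the witness.

    start is the 1-based reference position of subseq[0]. Each candidate is
    (kind, position, reference_base, new_base).
    """
    target = +Counter(witness)  # unary plus drops zero counts

    def matches(edited):
        # Composition must equal the witness, and the cut base may appear
        # only at the last position so the fragment stays in one piece.
        return Counter(edited) == target and cut_base not in edited[:-1]

    candidates = []
    for offset, ref_base in enumerate(subseq):
        pos = start + offset
        if matches(subseq[:offset] + subseq[offset + 1:]):
            candidates.append(("del", pos, ref_base, ""))
        for alt in BASES:
            if alt != ref_base and matches(subseq[:offset] + alt + subseq[offset + 1:]):
                candidates.append(("sub", pos, ref_base, alt))
    for offset in range(len(subseq) + 1):
        for alt in BASES:
            if matches(subseq[:offset] + alt + subseq[offset:]):
                candidates.append(("ins", start + offset, "", alt))
    return candidates


witness = {"A": 1, "C": 2, "G": 1, "T": 1}
print(single_edit_candidates("TAGCT", 5, witness, "T"))
```

Filtering on the cleavage pattern matters here: a substitution at the final T would give the same composition but leave an internal cleavage site, so it could not produce the observed fragment.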
Specification