Methods for genomic analysis
Abstract
The present invention relates to methods for identifying variations that occur in the human genome, relating these variations to one another, and, ultimately, relating these variations to the genetic bases of phenotype such as disease resistance, disease susceptibility or drug response. The methods allow for, once variants have been identified, determining variant haplotype blocks and patterns, and further, resolving ambiguities in the haplotype block and pattern data sets.
13 Claims
1. A method for constructing a set of SNP haplotype patterns for data analysis, comprising:
a) providing a data set of SNP haplotype sequences;
b) providing a pattern set of SNP haplotype patterns; and
c) comparing the SNP haplotype sequences from said data set to the SNP haplotype patterns from said pattern set, wherein if said SNP haplotype sequence being compared is not consistent with any of said SNP haplotype patterns, then said SNP haplotype sequence is added to said pattern set.
(Dependent claims: 2, 3, 4)
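The comparison loop in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent. It assumes that haplotypes are equal-length strings, that the character `N` marks an ambiguous SNP position, and that two haplotypes are "consistent" when they agree at every position where both are unambiguous. The names `AMBIGUOUS`, `is_consistent`, and `build_pattern_set` are hypothetical.

```python
AMBIGUOUS = "N"  # assumed placeholder for an unknown allele (not defined in the claims)


def is_consistent(sequence, pattern):
    """Assumed consistency test: no position holds two different unambiguous alleles."""
    if len(sequence) != len(pattern):
        return False
    return all(s == p or AMBIGUOUS in (s, p) for s, p in zip(sequence, pattern))


def build_pattern_set(data_set, pattern_set=None):
    """Add each sequence that is consistent with no existing pattern (step c)."""
    patterns = list(pattern_set or [])
    for sequence in data_set:
        if not any(is_consistent(sequence, p) for p in patterns):
            patterns.append(sequence)
    return patterns


data = ["ACGT", "ACNT", "TCGA", "NCGA", "ACGA"]
print(build_pattern_set(data))  # ['ACGT', 'TCGA', 'ACGA']
```

Here `ACNT` and `NCGA` are skipped because each is consistent with a pattern already in the set.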
5. A method for building a pattern set of SNP haplotype patterns comprising:
a) providing a data set of SNP haplotype sequences;
b) providing a pattern set of SNP haplotype patterns;
c) comparing said SNP haplotype sequences from said data set to said SNP haplotype patterns from said pattern set, wherein if said SNP haplotype sequence being compared is not consistent with any of said SNP haplotype patterns, then said SNP haplotype sequence is added to said pattern set, wherein if said SNP haplotype sequence being compared is consistent with more than one SNP haplotype pattern, then said SNP haplotype sequence is not added to said data set, and wherein if said SNP haplotype sequence being compared is consistent with one SNP haplotype pattern, then
d) determining a number of ambiguities in said SNP haplotype sequence that matches one SNP haplotype pattern;
e) determining a number of ambiguities in said consistent SNP haplotype pattern;
f) resolving said ambiguities of either said SNP haplotype sequence or said SNP haplotype pattern by using information from either said SNP haplotype pattern or said SNP haplotype sequence at each SNP location for which one of either said SNP haplotype pattern or said SNP haplotype sequence contains unambiguous information to create a resolved SNP haplotype sequence; and
g) adding said resolved SNP haplotype sequence into said pattern set.
(Dependent claim: 6)
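Steps d) through f) of claim 5 can be sketched as follows. This is an illustrative reading, not the patent's implementation. It assumes haplotypes are equal-length strings with `N` marking an ambiguous position, and that the two inputs have already been found consistent with each other. The function names are hypothetical.

```python
AMBIGUOUS = "N"  # assumed placeholder for an unknown allele


def count_ambiguities(haplotype):
    """Steps d) and e): number of ambiguous positions in a sequence or pattern."""
    return haplotype.count(AMBIGUOUS)


def resolve(sequence, pattern):
    """Step f): at each SNP location, take whichever side is unambiguous.

    Assumes the inputs are consistent, so they never disagree where both are known.
    A position that is ambiguous in both inputs stays ambiguous.
    """
    return "".join(p if s == AMBIGUOUS else s for s, p in zip(sequence, pattern))


print(count_ambiguities("ANGT"), count_ambiguities("ACNT"))  # 1 1
print(resolve("ANGT", "ACNT"))  # ACGT
```

Each input fills a gap in the other, so the resolved sequence has fewer ambiguities than either one.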
7. A computer readable medium capable of comparing SNP haplotype sequences from a data set to SNP haplotype patterns from a pattern set, wherein if said SNP haplotype sequence being compared is not consistent with any of said SNP haplotype patterns, then said SNP haplotype sequence is added to said pattern set, and wherein if said SNP haplotype sequence being compared is consistent with at least one SNP haplotype pattern, then said SNP haplotype sequence is not added to said pattern set.
8. A method for building a data set of SNP haplotype sequences and a pattern set of SNP haplotype patterns comprising:
a) providing a data set comprising a number N of SNP haplotype sequences;
b) providing a pattern set comprising SNP haplotype patterns;
c) comparing sequentially each SNP haplotype sequence from said data set to said SNP haplotype patterns from said pattern set, wherein if said SNP haplotype sequence being compared does not match any of said SNP haplotype patterns, then said SNP haplotype sequence is added to said pattern set and retained in said data set, wherein if said SNP haplotype sequence being compared matches more than one SNP haplotype pattern, then said SNP haplotype sequence is not added to said pattern set and is retained in said data set, and wherein if said SNP haplotype sequence being compared matches one SNP haplotype pattern, then
d) determining a number of ambiguities in said SNP haplotype sequence that matches one SNP haplotype pattern;
e) determining a number of ambiguities in said consistent SNP haplotype pattern;
f) resolving said ambiguities of either said SNP haplotype sequence or said SNP haplotype pattern by using information from either said SNP haplotype pattern or said SNP haplotype sequence at each SNP location for which one of either said SNP haplotype pattern or said SNP haplotype sequence contains unambiguous information to create a resolved SNP haplotype sequence;
h) adding said resolved sequence into said pattern set; and
i) performing said steps sequentially on each of said number N of SNP haplotype sequences.
(Dependent claims: 9, 10, 11)
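The sequential pass of claim 8 can be sketched as one loop over the data set with three outcomes per sequence. This is a sketch under stated assumptions, not the claimed method itself: haplotypes are equal-length strings, `N` marks an ambiguous position, "matches" means agreement wherever both sides are unambiguous, and the resolved sequence replaces the single matched pattern (one possible reading of "adding said resolved sequence into said pattern set"). All names are hypothetical.

```python
AMBIGUOUS = "N"  # assumed placeholder for an unknown allele


def is_consistent(a, b):
    """Assumed match test: equal length, no conflicting unambiguous alleles."""
    return len(a) == len(b) and all(x == y or AMBIGUOUS in (x, y) for x, y in zip(a, b))


def resolve(sequence, pattern):
    """Fill each ambiguous position of the sequence from the pattern."""
    return "".join(p if s == AMBIGUOUS else s for s, p in zip(sequence, pattern))


def process_data_set(data_set, pattern_set):
    """Run the three-way comparison on every sequence; the data set is left unmodified."""
    patterns = list(pattern_set)
    for sequence in data_set:
        matches = [i for i, p in enumerate(patterns) if is_consistent(sequence, p)]
        if not matches:
            # No match: the sequence becomes a new pattern.
            patterns.append(sequence)
        elif len(matches) == 1:
            # Exactly one match: merge unambiguous information (assumed to
            # replace the matched pattern rather than append a new entry).
            i = matches[0]
            patterns[i] = resolve(sequence, patterns[i])
        # More than one match: the pattern set is left unchanged.
    return patterns


data = ["ANGT", "ACNT", "TCGA", "NCGN"]
print(process_data_set(data, []))  # ['ACGT', 'TCGA']
```

In this run `ACNT` matches only `ANGT`, so the two merge into `ACGT`, while `NCGN` matches both final patterns and therefore changes nothing.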
12. A computer readable medium capable of
a) comparing sequentially each SNP haplotype sequence from said data set to said SNP haplotype patterns from said pattern set, wherein if said SNP haplotype sequence being compared is not consistent with any of said SNP haplotype patterns, then said SNP haplotype sequence is added to said pattern set and retained in said data set, wherein if said SNP haplotype sequence being compared is consistent with more than one SNP haplotype pattern, then said SNP haplotype sequence is not added to said pattern set and is retained in said data set, and wherein if said SNP haplotype sequence being compared is consistent with one SNP haplotype pattern, then
b) identifying ambiguities in said consistent SNP haplotype pattern;
c) identifying nonambiguous SNP locations in said first SNP haplotype sequence that is consistent with one SNP haplotype pattern;
d) resolving said ambiguities of said SNP haplotype pattern in said pattern set by using nonambiguous information from said first SNP haplotype sequence;
e) performing said steps sequentially on each of said number N of SNP haplotype sequences.
13. A method for building a final SNP haplotype block set of nonoverlapping SNP haplotype blocks that include all SNP positions in a set of overlapping SNP haplotype blocks, comprising:
a) analyze the informativeness of a first haplotype block;
b) if the informativeness of said first haplotype block is above a determined threshold level, add said first haplotype block to a candidate SNP block set;
c) repeat steps a) and b) for all potential haplotype blocks for a given SNP haplotype sequence to generate a set of candidate SNP blocks;
d) compare the informativeness of each of the candidate SNP blocks;
e) choose the candidate SNP haplotype block with the highest informativeness for inclusion in a final SNP haplotype block set;
f) remove said candidate SNP haplotype block with the highest informativeness from said candidate SNP haplotype block set;
g) remove all candidate SNP haplotype blocks that overlap with said candidate SNP haplotype block with the highest informativeness from said candidate SNP haplotype block set;
h) repeat steps e), f) and g) until there are no candidate SNP haplotype blocks left in the candidate SNP haplotype block set and a final SNP haplotype block set of nonoverlapping SNP haplotype blocks that include all SNP positions in a set of overlapping SNP haplotype blocks has been generated.
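The greedy selection in claim 13 can be sketched as follows. The claim does not define how informativeness is computed, so this sketch treats it as a precomputed score. The `Block` type, its inclusive `start` and `end` SNP indices, and the function names are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    start: int               # index of the first SNP in the block (inclusive)
    end: int                 # index of the last SNP in the block (inclusive)
    informativeness: float   # assumed to be computed elsewhere


def overlaps(a, b):
    """Two blocks overlap when they share at least one SNP position."""
    return a.start <= b.end and b.start <= a.end


def select_blocks(potential_blocks, threshold):
    # Steps a) to c): keep only blocks whose informativeness exceeds the threshold.
    candidates = [b for b in potential_blocks if b.informativeness > threshold]
    final = []
    while candidates:
        # Steps d) and e): pick the most informative remaining candidate.
        best = max(candidates, key=lambda b: b.informativeness)
        final.append(best)
        # Steps f) and g): drop the chosen block and everything overlapping it
        # (a block always overlaps itself, so `best` is removed too).
        candidates = [b for b in candidates if not overlaps(b, best)]
    return sorted(final, key=lambda b: b.start)


blocks = [Block(0, 3, 0.9), Block(2, 5, 0.7), Block(4, 6, 0.8),
          Block(0, 1, 0.2), Block(6, 6, 0.5)]
for block in select_blocks(blocks, threshold=0.3):
    print(block.start, block.end, block.informativeness)
# 0 3 0.9
# 4 6 0.8
```

The selected blocks cover every SNP position here only because the candidates happen to allow it; in general, coverage depends on which potential blocks pass the threshold.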
Specification