Methods for high throughput genotyping
First Claim
1. A method for calling the genotype of a sample at a selected polymorphism in a sample using a genotyping array, comprising:
(a) obtaining intensity measurements for allele A and for allele B for a plurality of polymorphisms in a plurality of training samples, wherein the genotype of each polymorphism in the plurality in each training sample is of known genotype, wherein the intensity measurements represent intensity of signal associated with one or more features on said genotyping array;
(b) making a genotype call for each of a said polymorphisms in each of the training samples using the intensity measurements for allele A and for allele B obtained in (a);
(c) comparing the genotype call with the known genotype to identify individuals where the correct genotype call was made;
(d) using the intensity measurements from the individuals identified in (c) to calculate a ratio of intensity measurement for allele A to intensity measurement for allele B, for the training samples for each sub-group of AA, AB and BB to obtain an AA reference ratio, an AB reference ratio and a BB reference ratio for each of said polymorphisms;
(e) hybridizing a test sample to the genotyping array to obtain hybridization intensity values for the A allele and for the B allele for each of said polymorphisms in the test sample;
(f) calculating a ratio of the intensity measurement for the A allele to the B allele for each of said polymorphisms in the test sample and comparing the ratio to the reference ratios for AA, AB and BB obtained for that polymorphism in (d) to determine the likelihood that the polymorphism is AB;
(g) identifying a subset of the polymorphisms in the test sample that are likely to be AB, wherein a polymorphism is identified as being likely to be AB if the likelihood that the polymorphism is AB is greater than a selected threshold;
(h) adjusting the intensity measurement of the B allele by the reference ratio for the AB group for that polymorphism from the training set to obtain an adjusted intensity measurement for the B allele, for each polymorphism in the subset of polymorphisms identified in (g); and
(i) generating a genotype call for each of the polymorphisms identified in (g) using the adjusted intensity measurement for the B allele.
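Steps (c) and (d) can be illustrated with a minimal sketch. It assumes each training record is a tuple of known genotype, called genotype, allele A intensity and allele B intensity, and it uses the median as the per-group summary; the claim does not say how the reference ratio is summarized, so the median, the function name `reference_ratios` and the sample values are all hypothetical.

```python
from statistics import median

def reference_ratios(training):
    """Return one A/B reference ratio per genotype group for a single polymorphism.

    training: iterable of (known_genotype, called_genotype, intensity_a, intensity_b).
    """
    ratios = {"AA": [], "AB": [], "BB": []}
    for known, called, a, b in training:
        # Step (c): keep only samples whose call matched the known genotype.
        if called == known:
            ratios[known].append(a / b)
    # Step (d): summarize each group; the median is an assumption, not from the claim.
    return {group: median(values) for group, values in ratios.items() if values}

# Hypothetical training data for one polymorphism.
training = [
    ("AA", "AA", 900.0, 100.0),
    ("AA", "AA", 800.0, 100.0),
    ("AB", "AB", 600.0, 400.0),
    ("AB", "AA", 700.0, 200.0),  # miscalled sample, excluded in step (c)
    ("AB", "AB", 500.0, 250.0),
    ("BB", "BB", 100.0, 500.0),
]
refs = reference_ratios(training)
```

With these made-up intensities the miscalled heterozygote does not contribute to the AB reference ratio, which is the point of step (c).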
Abstract
Methods for genotyping polymorphisms using allele specific probes are disclosed. A training set is used to generate a model for each polymorphism to be interrogated. The training set is used to obtain an estimate of the asymmetry between an intensity measurement for a first allele and an intensity measurement for a second allele of the same polymorphism. The intensity measurement obtained for a test sample is adjusted using the estimate of asymmetry prior to using the intensity measurements to make a genotyping call. In preferred embodiments the adjustment is applied to polymorphisms that have a likelihood of being heterozygous that is above a specified threshold.
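One way to read the asymmetry adjustment is as a rescaling of one allele's intensity. The notation below is an assumption introduced for illustration (the abstract defines no symbols): $I_A$ and $I_B$ are the two allele intensities for a polymorphism, $\hat{r}_{AB}$ is the asymmetry estimated from heterozygous training samples, and the median and the multiplicative form of the adjustment are likewise assumed.

```latex
\hat{r}_{AB} = \operatorname*{median}_{i \,\in\, \text{AB training samples}} \frac{I_A^{(i)}}{I_B^{(i)}},
\qquad
I_B' = \hat{r}_{AB}\, I_B,
\qquad
\frac{I_A}{I_B'} = \frac{1}{\hat{r}_{AB}} \cdot \frac{I_A}{I_B} \approx 1
\quad \text{for a heterozygous test sample.}
```

Under these assumptions, a heterozygous sample whose raw ratio sits near $\hat{r}_{AB}$ is brought close to a balanced ratio of 1 before the genotyping call is made.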
16 Claims
8. A system for calling the genotype of a sample comprising:
a computer comprising system memory with executable code stored thereon, wherein the executable code is enabled to perform a method, comprising;
(a) receiving intensity measurements for a plurality of probe features for a plurality of individuals in a training set and for at least one test individual, wherein the intensity measurements are a measure of the amount of a fluorescent signal associated with a feature;
(b) calculating summary ratios for each of a plurality of polymorphisms in a training set of samples of known genotype at each of said polymorphisms using said intensity measurements in (a), wherein for each polymorphism each individual in the training set is placed into one of three groups selected from homozygous for a first allele, homozygous for a second allele or heterozygous, wherein a summary ratio is calculated for each group for each polymorphism, and wherein said summary ratio is calculated from the ratio of the intensity measurement for the first allele to the intensity measurement for the second allele;
(c) calculating a ratio of the intensity measurement for the first allele to the intensity measurement for the second allele for each of the polymorphisms in said test individual;
(d) comparing the value obtained in (c) to the values obtained in (b) for each polymorphism to determine a likelihood that a given polymorphism is heterozygous in said test individual;
(e) adjusting the intensity measurement for the first allele in said test individual using the summary ratio for the heterozygous group obtained in (b) for those polymorphisms in (d) wherein the likelihood that the polymorphism is heterozygous in said test individual is greater than a threshold value, to obtain adjusted intensity measurements; and
(f) outputting a data file of adjusted and unadjusted intensity measurements on a computer readable medium.
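Steps (c) through (e) of claim 8 can be sketched as follows. The claim does not specify how the likelihood is computed or how the summary ratio is applied, so this sketch makes two labeled assumptions: the likelihood is a normalized Gaussian kernel in log-ratio space, and the first allele is adjusted by dividing by the heterozygous summary ratio. The function names, the `scale` parameter, the default threshold and the example values are hypothetical.

```python
import math

def het_likelihood(ratio, summary, scale=1.0):
    """Relative weight of the heterozygous group for one test ratio.

    summary maps "AA", "AB", "BB" to summary ratios. The Gaussian kernel in
    log-ratio space and `scale` are assumptions, not taken from the claim.
    """
    log_r = math.log(ratio)
    weights = {
        group: math.exp(-0.5 * ((log_r - math.log(s)) / scale) ** 2)
        for group, s in summary.items()
    }
    return weights["AB"] / sum(weights.values())

def adjust_intensities(rows, summaries, threshold=0.5):
    """rows: (snp_id, intensity_a, intensity_b) for one test individual.

    Returns (snp_id, a, b, adjusted_a, was_adjusted) so that both adjusted
    and unadjusted values remain available for output.
    """
    out = []
    for snp, a, b in rows:
        summary = summaries[snp]
        likely_het = het_likelihood(a / b, summary) > threshold
        # Assumed form of the adjustment: divide A by the heterozygous ratio.
        adjusted_a = a / summary["AB"] if likely_het else a
        out.append((snp, a, b, adjusted_a, likely_het))
    return out

# Hypothetical summary ratios and test intensities.
summaries = {"snp1": {"AA": 8.0, "AB": 2.0, "BB": 0.25}}
rows = [("snp1", 600.0, 300.0), ("snp1", 900.0, 100.0)]
result = adjust_intensities(rows, summaries)
```

In this example the first row has a ratio equal to the heterozygous summary ratio and is adjusted, while the second row sits near the homozygous AA ratio and is left unchanged. Writing `result` to a file, as in step (f), is omitted here.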
13. A method for calling the genotype of a sample at a selected polymorphism in a sample using a genotyping array, comprising:
(a) obtaining hybridization intensity values for said genotyping array for each of a set of training samples comprising a plurality of training samples of known genotype;
(b) making a genotype call for each of a plurality of single nucleotide polymorphisms (SNPs) in each of the training samples using the hybridization intensity values from individual probe quartets;
(c) comparing the genotype call with the known genotype for each probe quartet to identify a plurality of K best probe quartets for each SNP, where K is at least 1, wherein probe quartets are selected as best probe quartets if the genotype call made using said quartet has high concordance with the known genotype for that SNP;
(d) calculating a distribution of (intensity A)/(intensity B) for the training samples for each sub-group of AA, AB and BB to obtain an AA reference distribution, an AB reference distribution and a BB reference distribution;
(e) hybridizing a test sample to the genotyping array to obtain hybridization intensity values for said K best probe quartets;
(f) calculating (intensity A)/(intensity B) for each quartet and comparing with the reference distributions for AA, AB and BB to determine the likelihood that the polymorphism is AB;
(g) if the likelihood that the polymorphism is greater than a selected threshold, adjusting the intensity of intensity B by the (intensity A)/(intensity B) ratio from the AB group from the reference set to obtain an adjusted allele B intensity; and
(h) using the adjusted intensity B value to generate a genotype call using a selected algorithm.
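The quartet selection in step (c) of claim 13 can be sketched as a ranking by concordance. The claim only requires "high concordance", so taking the top K by fraction of matching calls is an assumption, as are the function name `k_best_quartets`, the data layout and the example calls.

```python
def k_best_quartets(calls, known, k=1):
    """Rank probe quartets for one SNP by concordance and keep the top k.

    calls: {quartet_id: [genotype call per training sample]}
    known: [known genotype per training sample], in the same sample order.
    """
    def concordance(quartet_calls):
        # Fraction of training samples where the quartet's call is correct.
        matches = sum(c == g for c, g in zip(quartet_calls, known))
        return matches / len(known)

    ranked = sorted(calls, key=lambda q: concordance(calls[q]), reverse=True)
    return ranked[:k]

# Hypothetical calls from three probe quartets on four training samples.
known = ["AA", "AB", "BB", "AB"]
calls = {
    "q1": ["AA", "AB", "BB", "AB"],  # 4 of 4 correct
    "q2": ["AA", "AA", "BB", "AB"],  # 3 of 4 correct
    "q3": ["AB", "AA", "BB", "BB"],  # 1 of 4 correct
}
best = k_best_quartets(calls, known, k=2)
```

Only the quartets returned here would then be read for the test sample in step (e).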
Specification