Methods for obtaining and using haplotype data
First Claim
1. A method of inferring a pair of haplotypes present in an individual for a locus having at least m polymorphic sites, comprising
(a) providing a database comprising (i) haplotypes of the locus for a set of m polymorphic sites present in a reference population and (ii) a frequency of occurrence in the reference population of each haplotype;
(b) constructing a list of all genotypes that could result from all possible pairs of the haplotypes in the database for the reference population;
(c) calculating a frequency for each genotype on the list, wherein the frequency of a given genotype is a function of a frequency for a pair of haplotypes from which the given genotype results, calculated assuming Hardy-Weinberg equilibrium;
(d) generating a set of all possible masks for the m polymorphic sites, wherein each mask blocks the identity of the nucleotides present at m-n of the polymorphic sites and admits the identity of nucleotides present at the other n polymorphic sites;
(e) for each mask, calculating the ambiguity resulting from genotyping only the n polymorphic sites at which nucleotide identity is admitted by the mask;
(f) from among those masks having zero ambiguity, selecting a mask which has the lowest value of n;
(g) determining the genotype of the individual at the n polymorphic sites at which nucleotide identity is admitted by the selected mask; and
(h) assigning to the individual a pair of m polymorphic site haplotypes in the database for the reference population by matching the individual's determined genotype to a genotype on the list of genotypes, at the n polymorphic sites at which nucleotide identity is admitted by the selected mask.
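Steps (b) through (f) can be sketched in Python as follows. This is an illustrative sketch only, not the claimed implementation. The claim does not define "ambiguity"; the sketch assumes it is the total Hardy-Weinberg frequency of haplotype pairs whose genotype at the admitted sites coincides with that of at least one other pair. All function names and the example haplotype frequencies are hypothetical, and the frequency of a genotype on the list would be the sum over the haplotype pairs that produce it.

```python
from collections import defaultdict
from itertools import combinations, combinations_with_replacement

def genotype(h1, h2):
    """Unphased genotype of a haplotype pair: one sorted allele pair per site."""
    return tuple(tuple(sorted(pair)) for pair in zip(h1, h2))

def pair_frequencies(hap_freq):
    """Steps (b)-(c): every haplotype pair with its Hardy-Weinberg frequency."""
    table = {}
    for h1, h2 in combinations_with_replacement(sorted(hap_freq), 2):
        p, q = hap_freq[h1], hap_freq[h2]
        table[(h1, h2)] = p * p if h1 == h2 else 2 * p * q
    return table

def ambiguity(pair_freq, mask):
    """Step (e), under the ASSUMED definition: total frequency of pairs whose
    genotype at the admitted sites is shared with at least one other pair."""
    groups = defaultdict(list)
    for (h1, h2), f in pair_freq.items():
        g = genotype(h1, h2)
        groups[tuple(g[i] for i in mask)].append(f)
    return sum(sum(fs) for fs in groups.values() if len(fs) > 1)

def smallest_unambiguous_mask(hap_freq):
    """Steps (d) and (f): try masks in order of increasing n and return the
    first with zero ambiguity, or None if no mask resolves every pair."""
    m = len(next(iter(hap_freq)))
    pair_freq = pair_frequencies(hap_freq)
    for n in range(1, m + 1):
        for mask in combinations(range(m), n):  # indices of admitted sites
            if ambiguity(pair_freq, mask) == 0:
                return mask
    return None

# Hypothetical reference database: three haplotypes over m = 3 sites.
haps = {"ACG": 0.5, "ATG": 0.3, "GCT": 0.2}
print(smallest_unambiguous_mask(haps))  # (0, 1)
```

In this made-up example no single site separates all six haplotype pairs, but sites 0 and 1 together do, so n = 2 suffices. A mask with zero ambiguity need not exist in general: two different haplotype pairs can yield the same unphased genotype even when all m sites are typed, in which case the sketch returns None.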
Abstract
Methods, computer program(s) and database(s) to analyze and make use of gene haplotype information. These include methods, program, and database to find and measure the frequency of haplotypes in the general population; methods, program, and database to find correlations between an individual's haplotypes or genotypes and a clinical outcome; methods, program, and database to predict an individual's haplotypes from the individual's genotype for a gene; and methods, program, and database to predict an individual's clinical response to a treatment based on the individual's genotype or haplotype.
4 Claims
1. A method of inferring a pair of haplotypes present in an individual for a locus having at least m polymorphic sites, comprising
(a) providing a database comprising (i) haplotypes of the locus for a set of m polymorphic sites present in a reference population and (ii) a frequency of occurrence in the reference population of each haplotype;
(b) constructing a list of all genotypes that could result from all possible pairs of the haplotypes in the database for the reference population;
(c) calculating a frequency for each genotype on the list, wherein the frequency of a given genotype is a function of a frequency for a pair of haplotypes from which the given genotype results, calculated assuming Hardy-Weinberg equilibrium;
(d) generating a set of all possible masks for the m polymorphic sites, wherein each mask blocks the identity of the nucleotides present at m-n of the polymorphic sites and admits the identity of nucleotides present at the other n polymorphic sites;
(e) for each mask, calculating the ambiguity resulting from genotyping only the n polymorphic sites at which nucleotide identity is admitted by the mask;
(f) from among those masks having zero ambiguity, selecting a mask which has the lowest value of n;
(g) determining the genotype of the individual at the n polymorphic sites at which nucleotide identity is admitted by the selected mask; and
(h) assigning to the individual a pair of m polymorphic site haplotypes in the database for the reference population by matching the individual's determined genotype to a genotype on the list of genotypes, at the n polymorphic sites at which nucleotide identity is admitted by the selected mask.
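The matching in steps (g) and (h) might look like the following Python sketch. It is a hedged illustration rather than the patented procedure: the function names, the representation of a mask as a tuple of admitted site indices, and the example haplotypes are all assumptions made for this example.

```python
from itertools import combinations_with_replacement

def masked_genotype(h1, h2, mask):
    """Unphased genotype of a haplotype pair at the admitted sites only."""
    return tuple(tuple(sorted((h1[i], h2[i]))) for i in mask)

def assign_pair(haplotypes, mask, observed):
    """Return every database haplotype pair whose genotype at the admitted
    sites equals the observed genotype. With a zero-ambiguity mask, at most
    one pair is expected to match."""
    want = tuple(tuple(sorted(site)) for site in observed)
    return [
        (h1, h2)
        for h1, h2 in combinations_with_replacement(sorted(haplotypes), 2)
        if masked_genotype(h1, h2, mask) == want
    ]

# Hypothetical database of three haplotypes; sites 0 and 1 are admitted.
# The individual is typed as A/G at site 0 and T/C at site 1.
print(assign_pair(["ACG", "ATG", "GCT"], (0, 1), [("A", "G"), ("T", "C")]))
# [('ATG', 'GCT')]
```

The assigned pair carries all m sites, so the allele at the untyped site 2 is inferred from the database rather than measured.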
2. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to infer a pair of haplotypes present in an individual for a locus having at least m polymorphic sites, the computer-readable program code comprising:
(a) computer-readable program code for causing a computer to access a database comprising (i) haplotypes of the locus for a set of m polymorphic sites present in a reference population and (ii) a frequency of occurrence in the reference population of each haplotype;
(b) computer-readable program code for causing a computer to construct a list of all genotypes that could result from all possible pairs of the haplotypes in the database for the reference population;
(c) computer-readable program code for causing a computer to calculate a frequency for each genotype on the list, wherein the frequency of a given genotype is a function of a frequency for a pair of haplotypes from which the given genotype results, calculated assuming Hardy-Weinberg equilibrium;
(d) computer-readable program code for causing a computer to generate a set of all possible masks for the m polymorphic sites, wherein each mask blocks the identity of the nucleotides present at m-n of the polymorphic sites and admits the identity of nucleotides present at the other n polymorphic sites;
(e) computer-readable program code for causing a computer to calculate, for each mask, the ambiguity resulting from genotyping with only the n polymorphic sites at which nucleotide identity is admitted by the mask;
(f) computer-readable program code for causing a computer to output or display on a display device the calculated ambiguity for one or more masks and permitting the operator to select a mask;
(g) computer-readable program code for causing a computer to accept as input an individual's genotype data at the n polymorphic sites of the selected mask; and
(h) computer-readable program code for causing a computer to assign to the individual a pair of m polymorphic site haplotypes in the database for the reference population by matching the individual's determined genotype to a genotype on the list of genotypes at the n polymorphic sites at which nucleotide identity is admitted by the selected mask.
Dependent claims: 3, 4
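Claim 2 differs from claim 1 at step (f): the calculated ambiguity is displayed and the operator chooses the mask. A minimal Python sketch of such a report follows. The ambiguity score used here (frequency mass of haplotype pairs that are not uniquely identified at the admitted sites) is an assumption, since the claim does not define it, and `mask_report`, its `max_rows` parameter, and the example frequencies are hypothetical.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

def mask_report(hap_freq, max_rows=5):
    """Return (ambiguity, n, mask) rows, lowest ambiguity then smallest n
    first, for an operator to choose from."""
    haps = sorted(hap_freq)
    m = len(haps[0])
    pairs = list(combinations_with_replacement(haps, 2))
    # Hardy-Weinberg frequency of each haplotype pair.
    freq = {(a, b): hap_freq[a] ** 2 if a == b else 2 * hap_freq[a] * hap_freq[b]
            for a, b in pairs}
    rows = []
    for n in range(1, m + 1):
        for mask in combinations(range(m), n):
            def key(a, b):
                # Unphased genotype at the admitted sites.
                return tuple(tuple(sorted((a[i], b[i]))) for i in mask)
            counts = Counter(key(a, b) for a, b in pairs)
            # ASSUMED score: frequency of pairs not uniquely identified.
            amb = sum((f for (a, b), f in freq.items() if counts[key(a, b)] > 1), 0.0)
            rows.append((round(amb, 6), n, mask))
    return sorted(rows)[:max_rows]

# Hypothetical reference database: three haplotypes over m = 3 sites.
for amb, n, mask in mask_report({"ACG": 0.5, "ATG": 0.3, "GCT": 0.2}):
    print(f"sites={mask} n={n} ambiguity={amb}")
```

Sorting by ambiguity and then by n puts the cheapest fully resolving masks at the top, while still letting the operator accept a smaller mask with nonzero ambiguity if fewer assays matter more than complete resolution.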
Specification