Method and system for determining haplotypes from a collection of polymorphisms
Abstract
Methods, computer programs and databases for determining haplotypes from a collection of polymorphisms are provided. These include methods, programs, and databases to find and measure the frequency of haplotypes in the general population; and methods, programs, and databases for predicting an individual's haplotypes from the individual's genotype for a gene.
80 Claims
1. A method for assigning haplotype pairs for a polymorphic genomic region to a plurality of individuals, comprising:
(a) obtaining a genotype for the polymorphic genomic region from each of the individuals;
(b) enumerating all possible haplotypes hi that are consistent with each genotype;
(c) assigning an evidence score si to each of the enumerated haplotypes hi;
(d) calculating an initial haplotype frequency fi for each haplotype among the possible haplotypes, wherein the initial haplotype frequency fi is a function of the evidence score si;
(e) determining for each genotype obtained in step (a) a pair score Fk for each pair of haplotypes that is consistent with that genotype, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(f) calculating, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct;
(g) generating a revised haplotype frequency fi for each haplotype, wherein the revised haplotype frequency fi is a function of the probability pk for each consistent haplotype pair which contains the haplotype; and
(h) repeating steps (e) through (g) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (e) is replaced by the revised frequency fi determined in step (g). - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 38)
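Steps (b) through (h) read like an expectation-maximization style iteration, and can be sketched as follows. This is a minimal illustration, not the patented implementation: the claim leaves the evidence score, the pair score function, the pair score criterion, and the end condition open, so the sketch assumes (1) each genotype splits one unit of evidence evenly over its consistent pairs, (2) a Hardy-Weinberg style pair score, (3) a criterion of Fk > `min_score`, and (4) an end condition of "largest frequency change below `tol`". The genotype encoding (one two-letter string per polymorphic site) and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def consistent_pairs(genotype):
    """Step (b): unordered haplotype pairs consistent with one genotype.

    `genotype` is a sequence with one two-allele entry per site, e.g. ["AA", "CT"].
    """
    found = set()
    for first in product(*genotype):
        # The partner haplotype carries the other allele at every site.
        second = [b if h == a else a for h, (a, b) in zip(first, genotype)]
        found.add(tuple(sorted(("".join(first), "".join(second)))))
    return sorted(found)


def pair_score(pair, freq):
    """Step (e), assumed form: Hardy-Weinberg proportion of the pair."""
    h1, h2 = pair
    return freq[h1] ** 2 if h1 == h2 else 2 * freq[h1] * freq[h2]


def estimate_frequencies(genotypes, min_score=0.0, tol=1e-9, max_iter=200):
    pairs_for = [consistent_pairs(g) for g in genotypes]

    # Steps (c)-(d), assumed form: evidence split evenly, then normalized.
    score = defaultdict(float)
    for pairs in pairs_for:
        for pair in pairs:
            for hap in pair:
                score[hap] += 1.0 / len(pairs)
    total = sum(score.values())
    freq = {hap: s / total for hap, s in score.items()}

    for _ in range(max_iter):
        revised = dict.fromkeys(freq, 0.0)
        for pairs in pairs_for:
            scored = [(p, pair_score(p, freq)) for p in pairs]     # step (e)
            kept = [(p, f) for p, f in scored if f > min_score]    # criterion
            norm = sum(f for _, f in kept)
            for pair, f in kept:
                p_k = f / norm                                     # step (f)
                for hap in pair:
                    revised[hap] += p_k / (2 * len(genotypes))     # step (g)
        change = max(abs(revised[h] - freq[h]) for h in freq)
        freq = revised                                             # proviso of step (h)
        if change < tol:                                           # assumed end condition
            break
    return freq


if __name__ == "__main__":
    sample = [["AA", "CC"]] * 3 + [["GG", "TT"], ["AG", "CT"]]
    for hap, f in sorted(estimate_frequencies(sample).items()):
        print(hap, round(f, 6))
```

In the sample, the one doubly heterozygous individual is ambiguous between two pairs, and the iteration shifts its probability toward the pair built from haplotypes already observed in the homozygous individuals.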
28. A method for predicting an individual's haplotype pair for a polymorphic genomic region, comprising:
(a) identifying a genotype for the individual;
(b) enumerating all possible haplotype pairs which are consistent with the genotype;
(c) determining a probability for each possible haplotype pair that the individual has that possible haplotype pair by accessing a database containing frequency data for reference haplotype pairs; and
(d) analyzing the determined probabilities to predict an individual's haplotype pair. - View Dependent Claims (29, 30)
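The four steps of claim 28 can be sketched as a lookup against reference data. The claim does not say how the database is organized or how the probabilities are analyzed, so this sketch assumes a plain dictionary that maps a sorted haplotype pair to its reference frequency, normalizes those frequencies over the pairs consistent with the genotype, and predicts the most probable pair. The names `predict_pair` and `reference_pair_freq` are hypothetical.

```python
from itertools import product


def consistent_pairs(genotype):
    """Step (b): unordered haplotype pairs consistent with the genotype."""
    found = set()
    for first in product(*genotype):
        second = [b if h == a else a for h, (a, b) in zip(first, genotype)]
        found.add(tuple(sorted(("".join(first), "".join(second)))))
    return sorted(found)


def predict_pair(genotype, reference_pair_freq):
    """Steps (c)-(d): score each candidate pair and return the likeliest one.

    `reference_pair_freq` stands in for the database of reference
    haplotype pair frequencies (an assumed, simplified representation).
    """
    candidates = consistent_pairs(genotype)
    weights = [reference_pair_freq.get(pair, 0.0) for pair in candidates]
    total = sum(weights)
    if total == 0:
        # No consistent pair occurs in the reference data.
        return None, {}
    probs = {pair: w / total for pair, w in zip(candidates, weights)}
    best = max(probs, key=probs.get)
    return best, probs


if __name__ == "__main__":
    reference = {("AC", "GT"): 0.18, ("AT", "GC"): 0.02}
    print(predict_pair(["AG", "CT"], reference))
```

Returning the full probability table alongside the best pair leaves room for a caller to apply a different analysis in step (d), for example a confidence threshold.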
31. A computer implemented method for generating haplotype pair and haplotype frequency screens for display on a display device, comprising the steps of:
(a) displaying in a first area a plurality of selectable items each corresponding to a polymorphic site for a predetermined gene;
(b) selecting one or more of said selectable items;
(c) displaying in a second area the haplotype pairs occurring in a reference population for the selected polymorphic sites;
(d) displaying in a third area data indicative of haplotype frequencies for a plurality of member groupings within the population.
32. A computer system for assigning haplotype pairs for a polymorphic genomic region to a plurality of individuals, comprising:
a database for storing genotyping information;
a processor connected to the database;
a computer program for controlling the processor connected to said database comprising instruction code to:
(a) accept input of a genotype for the polymorphic genomic region from each of the individuals and store said genotype within said database;
(b) enumerate all possible haplotypes hi consistent with each genotype and store said haplotypes hi within said database;
(c) calculate an evidence score si for each of said possible haplotypes hi and store said evidence score si within said database;
(d) calculate an initial haplotype frequency fi for each haplotype hi among the possible haplotypes, and store the haplotype frequency fi in said database, wherein the haplotype frequency fi is a function of the evidence score si;
(e) calculate for each genotype received in step (a) a pair score Fk for each pair of haplotypes that are consistent with that genotype, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(f) calculate, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct and store the probability pk in said database;
(g) calculate a revised haplotype frequency fi for each of the haplotypes, wherein the revised haplotype frequency fi is a function of the probability pk for each consistent haplotype pair which contains the haplotype, and store the revised frequency fi in said database; and
(h) repeat steps (e) through (g) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (e) is replaced by the revised frequency fi determined in step (g) and stored in the database. - View Dependent Claims (33, 34, 35, 36, 39)
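The database side of claim 32 can be illustrated with a small store that holds the evidence score and the current frequency per haplotype, and overwrites the frequency after each repetition. The claim does not specify a database engine or schema, so the use of SQLite, the table and column names, and the helper functions below are all assumptions made for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) tables used by the sketch."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS genotype (
            individual TEXT PRIMARY KEY,
            sites      TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS haplotype (
            hap       TEXT PRIMARY KEY,
            evidence  REAL,
            frequency REAL
        );
        """
    )
    return db


def store_haplotypes(db, evidence, freq):
    """Steps (b)-(d): store each haplotype with its score and initial frequency."""
    with db:  # commits on success
        db.executemany(
            "INSERT OR REPLACE INTO haplotype (hap, evidence, frequency) VALUES (?, ?, ?)",
            [(hap, evidence[hap], freq[hap]) for hap in freq],
        )


def update_frequencies(db, revised):
    """Step (g): replace the stored frequency with the revised one."""
    with db:
        db.executemany(
            "UPDATE haplotype SET frequency = ? WHERE hap = ?",
            [(f, hap) for hap, f in revised.items()],
        )


def load_frequencies(db):
    """Step (h) proviso: the next repetition reads the stored revised values."""
    return dict(db.execute("SELECT hap, frequency FROM haplotype"))


if __name__ == "__main__":
    db = open_store()
    store_haplotypes(db, {"AC": 6.5, "GT": 2.5}, {"AC": 0.65, "GT": 0.25})
    update_frequencies(db, {"AC": 0.7, "GT": 0.3})
    print(load_frequencies(db))
```

Keeping the evidence score and the frequency in separate columns means a revision touches only the frequency, so the initial scores remain available for inspection.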
37. A computer readable medium comprising instruction code to:
(a) accept input of a genotype for the polymorphic genomic region from each of the individuals and store said genotype within said database;
(b) enumerate all possible haplotypes hi consistent with each genotype and store said haplotypes hi within said database;
(c) calculate an evidence score si for each of said possible haplotypes hi and store said evidence score si within said database;
(d) calculate an initial haplotype frequency fi for each haplotype hi among the possible haplotypes, and store the haplotype frequency fi in said database, wherein the haplotype frequency fi is a function of the evidence score si;
(e) calculate for each genotype received in step (a) a pair score Fk for each pair of haplotypes that are consistent with that genotype, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(f) calculate, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct and store the probability pk in said database;
(g) calculate a revised haplotype frequency fi for each of the haplotypes, wherein the revised haplotype frequency fi is a function of the probability pk for each consistent haplotype pair which contains the haplotype, and store the revised frequency fi in said database; and
(h) repeat steps (e) through (g) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (e) is replaced by the revised frequency fi determined in step (g). - View Dependent Claims (40)
41. A method for assigning haplotype pairs for a polymorphic genomic region to a plurality of individuals, comprising:
(a) obtaining a genotype for the polymorphic genomic region from each of the individuals;
(b) grouping the genotypes obtained in step (a) into groups, wherein in each group g there are ng identical genotypes, and wherein any unique genotypes are regarded as groups having ng=1;
(c) enumerating all possible haplotypes hi that are consistent with each distinct genotype;
(d) assigning an evidence score si to each of the enumerated possible haplotypes hi;
(e) for each group g, calculating an initial haplotype frequency (fi) for each haplotype among the possible haplotypes, wherein the initial haplotype frequency fi is a function of the product (si)(ng);
(f) determining for each group g, a pair score Fk for each pair of haplotypes that is consistent with the genotype of that group, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(g) calculating, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct;
(h) generating a revised haplotype frequency fi for each haplotype, wherein the revised haplotype frequency fi is a function of the product (ng)(pk) for each consistent haplotype pair which contains the haplotype; and
(i) repeating steps (f) through (h) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (f) is replaced by the revised frequency fi determined in step (h). - View Dependent Claims (42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 78)
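Claim 41 differs from claim 1 mainly in step (b): identical genotypes are grouped, and each group's contribution is weighted by its size ng. A minimal sketch of that variant follows. As with any illustration of these claims, the evidence score (an even split over consistent pairs), the Hardy-Weinberg style pair score, the criterion Fk > 0, and the tolerance-based end condition are assumptions, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product


def consistent_pairs(genotype):
    """Step (c): unordered haplotype pairs consistent with a distinct genotype."""
    found = set()
    for first in product(*genotype):
        second = [b if h == a else a for h, (a, b) in zip(first, genotype)]
        found.add(tuple(sorted(("".join(first), "".join(second)))))
    return sorted(found)


def group_genotypes(genotypes):
    """Step (b): map each distinct genotype to its count n_g."""
    return Counter(tuple(g) for g in genotypes)


def estimate_grouped(genotypes, tol=1e-9, max_iter=200):
    groups = group_genotypes(genotypes)
    pairs_for = {g: consistent_pairs(g) for g in groups}
    n_chrom = 2 * sum(groups.values())

    # Steps (d)-(e): initial frequency from the product (s_i)(n_g).
    weight = defaultdict(float)
    for g, n_g in groups.items():
        s_i = 1.0 / len(pairs_for[g])          # assumed evidence score
        for pair in pairs_for[g]:
            for hap in pair:
                weight[hap] += s_i * n_g
    total = sum(weight.values())
    freq = {hap: w / total for hap, w in weight.items()}

    for _ in range(max_iter):
        revised = dict.fromkeys(freq, 0.0)
        for g, n_g in groups.items():
            # Step (f): assumed Hardy-Weinberg style pair score.
            scores = {
                (h1, h2): freq[h1] ** 2 if h1 == h2 else 2 * freq[h1] * freq[h2]
                for h1, h2 in pairs_for[g]
            }
            norm = sum(scores.values())
            for pair, f in scores.items():
                if f > 0:                       # assumed pair score criterion
                    p_k = f / norm              # step (g)
                    for hap in pair:
                        revised[hap] += n_g * p_k / n_chrom   # step (h): (n_g)(p_k)
        change = max(abs(revised[h] - freq[h]) for h in freq)
        freq = revised
        if change < tol:                        # assumed end condition, step (i)
            break
    return freq


if __name__ == "__main__":
    sample = [["AA", "CC"]] * 3 + [["GG", "TT"], ["AG", "CT"]]
    print(group_genotypes(sample))
    print(estimate_grouped(sample))
```

Grouping does not change the estimates relative to processing every individual separately; it only avoids recomputing pair scores for repeated genotypes.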
72. A computer system for assigning haplotype pairs for a polymorphic genomic region to a plurality of individuals, comprising:
a database for storing genotyping information;
a processor connected to the database; and
a computer program for controlling the processor connected to said database, comprising instruction code to:
(a) accept input of a genotype for the polymorphic genomic region from each of the individuals and store said genotype within said database;
(b) group the genotypes input in step (a) into groups, wherein in each group g there are ng identical genotypes, and wherein any unique genotypes are regarded as groups having ng=1;
(c) enumerate all possible haplotypes hi consistent with the genotype of each group g, and store said haplotypes hi within said database;
(d) calculate an evidence score si for each of said possible haplotypes hi and store said evidence score si within said database;
(e) for each group g, calculate an initial haplotype frequency fi for each haplotype hi among the possible haplotypes, and store the haplotype frequency fi in said database, wherein the initial haplotype frequency fi is a function of the product (si)(ng);
(f) calculate for each group obtained in step (b) a pair score Fk for each pair of haplotypes that are consistent with that group, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(g) calculate, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct and store the probability pk in said database;
(h) calculate a revised haplotype frequency fi for each of the haplotypes, wherein the revised haplotype frequency fi is a function of the product (ng)(pk) for each consistent haplotype pair which contains the haplotype, and store the revised frequency fi in said database; and
(i) repeat steps (f) through (h) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (f) is replaced by the revised frequency fi determined in step (h) and stored in the database. - View Dependent Claims (73, 74, 75, 76, 79)
77. A computer readable medium comprising instruction code to:
(a) accept input of a genotype for the polymorphic genomic region from each of the individuals and store said genotype within said database;
(b) group the genotypes input in step (a) into groups, wherein in each group g there are ng identical genotypes, and wherein any unique genotypes are regarded as groups having ng=1;
(c) enumerate all possible haplotypes hi consistent with the genotype of each group g, and store said haplotypes hi within said database;
(d) calculate an evidence score si for each of said possible haplotypes hi and store said evidence score si within said database;
(e) for each group g, calculate an initial haplotype frequency fi for each haplotype hi among the possible haplotypes, and store the haplotype frequency fi in said database, wherein the initial haplotype frequency fi is a function of the product (si)(ng);
(f) calculate for each group obtained in step (b) a pair score Fk for each pair of haplotypes that are consistent with that group, wherein Fk is a function of the frequency fi for each of the haplotypes in the pair;
(g) calculate, for each genotype and consistent haplotype pair whose pair score Fk meets a pair score criterion, a probability pk that assignment of that haplotype pair to the genotype would be correct and store the probability pk in said database;
(h) calculate a revised haplotype frequency fi for each of the haplotypes, wherein the revised haplotype frequency fi is a function of the product (ng)(pk) for each consistent haplotype pair which contains the haplotype, and store the revised frequency fi in said database; and
(i) repeat steps (f) through (h) until an end condition is reached, with the proviso that for each repetition the frequency fi employed in step (f) is replaced by the revised frequency fi determined in step (h) and stored in the database. - View Dependent Claims (80)
Specification