Methods for obtaining and using haplotype data
Abstract
Methods, computer program(s) and database(s) to analyze and make use of gene haplotype information. These include methods, program, and database to find and measure the frequency of haplotypes in the general population; methods, program, and database to find correlations between an individual's haplotypes or genotypes and a clinical outcome; methods, program, and database to predict an individual's haplotypes from the individual's genotype for a gene; and methods, program, and database to predict an individual's clinical response to a treatment based on the individual's genotype or haplotype.
183 Claims
-
1. A method of generating a haplotype database for a population, comprising data elements representative of the haplotypes for at least one locus from the individuals in the population, the method comprising:
-
(a) for each individual in the population, generating polymorphism and haplotype data elements representative of the individual's polymorphisms and haplotypes for the locus; and
(b) storing the polymorphism and haplotype data elements for the individuals in a computer-readable database, wherein the data elements are organized according to the spatial relationships between the polymorphisms and haplotypes and a reference nucleotide sequence for the locus. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
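Illustrative sketch (not part of the claims): one way the storage arrangement of claim 1 could look in memory, with polymorphic sites ordered by their position on a reference sequence and each individual's haplotypes stored as alleles at those sites. The class names `PolymorphicSite` and `LocusRecord`, the 1-based position convention, and the string encoding of haplotypes are all assumptions made for this example; the claim does not specify a schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PolymorphicSite:
    position: int            # assumed 1-based offset into the reference sequence
    reference_allele: str
    alternate_allele: str


@dataclass
class LocusRecord:
    reference_sequence: str
    sites: list = field(default_factory=list)
    haplotypes: dict = field(default_factory=dict)  # individual id -> (hap1, hap2)

    def add_site(self, site):
        # Tie every site to the reference sequence before storing it.
        if self.reference_sequence[site.position - 1] != site.reference_allele:
            raise ValueError("reference allele does not match the reference sequence")
        self.sites.append(site)
        self.sites.sort(key=lambda s: s.position)

    def add_individual(self, individual_id, hap1, hap2):
        # Each haplotype string holds one allele per stored site, in position order.
        if len(hap1) != len(self.sites) or len(hap2) != len(self.sites):
            raise ValueError("haplotype length must equal the number of sites")
        self.haplotypes[individual_id] = (hap1, hap2)

    def alleles_at(self, individual_id, position):
        index = [s.position for s in self.sites].index(position)
        hap1, hap2 = self.haplotypes[individual_id]
        return hap1[index], hap2[index]
```

In this sketch all sites are registered before any individual is added, so that haplotype strings stay aligned with the site order.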
-
-
9. A method of predicting the presence of a haplotype pair in an individual comprising:
-
(a) identifying a genotype for the individual;
(b) enumerating all possible haplotype pairs which are consistent with the genotype;
(c) accessing a database containing reference haplotype pair frequency data to determine a probability, for each of the possible haplotype pairs, that the individual has a possible haplotype pair; and
(d) analyzing the determined probabilities to predict haplotype pairs for the individual. - View Dependent Claims (10, 11, 12)
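Illustrative sketch (not part of the claims): steps (b) through (d) of claim 9 can be pictured as enumerating every phasing of an unphased genotype and ranking the resulting haplotype pairs by a reference frequency table. The genotype encoding (one two-letter string per site), the function names, and the normalisation of frequencies into probabilities are assumptions of this example.

```python
from itertools import product


def consistent_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: one 2-character string per polymorphic site, e.g. ["AG", "CC", "CT"].
    """
    pairs = set()
    # At each site, choose which of the two alleles sits on the first haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(site[c] for site, c in zip(genotype, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


def predict_haplotype_pair(genotype, pair_frequency):
    """Rank consistent pairs by reference frequency, normalised to probabilities."""
    candidates = consistent_haplotype_pairs(genotype)
    weights = {p: pair_frequency.get(p, 0.0) for p in candidates}
    total = sum(weights.values())
    if total == 0:
        return []  # no candidate pair was seen in the reference data
    ranked = sorted(((w / total, p) for p, w in weights.items()), reverse=True)
    return [(pair, prob) for prob, pair in ranked]
```

A genotype with k heterozygous sites yields 2^(k-1) distinct pairs (one pair when k is 0), so the enumeration grows quickly with the number of heterozygous sites.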
-
-
13. A method for identifying a correlation between a haplotype pair and a clinical response to a treatment, or other phenotype, comprising:
-
(a) accessing a database containing data on clinical responses to treatments, or other phenotypes, exhibited by a clinical population;
(b) selecting a candidate locus hypothesized to be associated with the clinical response or other phenotype, the locus comprising at least two polymorphic sites;
(c) providing haplotype data for each member of the clinical population, the haplotype data comprising information on a plurality of polymorphic sites present in the candidate locus;
(d) storing the haplotype data; and
(e) calculating the degree of correlation between haplotype pairs and the clinical response to a treatment, or other phenotype, by statistically analyzing the haplotype and clinical response data. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21)
-
-
22. A method for identifying a correlation between a haplotype pair and susceptibility to a condition or disease of interest, or other phenotype of interest, comprising the steps of:
-
(a) selecting a candidate locus hypothesized to be associated with the phenotype, condition or disease of interest, the locus comprising at least two polymorphic sites;
(b) providing haplotype data for the candidate locus for each member of a population having the phenotype, condition or disease of interest (“disease haplotype data”);
(c) organizing the disease haplotype data in a database;
(d) statistically analyzing the disease haplotype data to calculate haplotype pair frequencies;
(e) accessing a database containing haplotype data for the candidate locus for each member of a healthy reference population (“reference haplotype data”);
(f) statistically analyzing the reference haplotype data to calculate haplotype pair frequencies; and
(g) when a haplotype pair has a higher frequency in the population having the phenotype, condition or disease of interest than in the healthy reference population, identifying a correlation of the haplotype pair with susceptibility to the disease or condition of interest. - View Dependent Claims (23, 24, 25, 26, 27, 28, 29)
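Illustrative sketch (not part of the claims): steps (d), (f) and (g) of claim 22 amount to computing haplotype pair frequencies in two cohorts and flagging the pairs that are more frequent in the affected cohort. The function names and the tuple representation of a haplotype pair are assumptions; the sketch applies the bare frequency comparison stated in step (g) and does not add any significance test.

```python
from collections import Counter


def pair_frequencies(haplotype_pairs):
    """Relative frequency of each unordered haplotype pair in a cohort."""
    counts = Counter(tuple(sorted(pair)) for pair in haplotype_pairs)
    n = sum(counts.values())
    return {pair: count / n for pair, count in counts.items()}


def enriched_pairs(disease_pairs, reference_pairs):
    """Pairs whose frequency is higher in the disease cohort than in the reference.

    Returns {pair: (disease_frequency, reference_frequency)}.
    """
    disease = pair_frequencies(disease_pairs)
    reference = pair_frequencies(reference_pairs)
    return {
        pair: (freq, reference.get(pair, 0.0))
        for pair, freq in disease.items()
        if freq > reference.get(pair, 0.0)
    }
```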
-
-
30. A method of predicting an individual's response to a medical or pharmaceutical treatment, comprising:
-
(a) selecting at least one candidate gene for which a correlation between haplotype content and response to the treatment has been identified;
(b) determining the haplotype pair of the individual for the candidate gene or genes; and
(c) predicting that the individual's response will be the response associated haplotype pair with information on the correlation. - View Dependent Claims (31, 32, 33)
-
-
34. A computer implemented method for generating a gene structure screen for display on a display device, comprising the steps of:
-
(a) retrieving from a database and displaying in a first area data indicative of the frequencies of occurrence of a gene's haplotypes within predetermined member groupings of a reference population;
(b) retrieving from a database and displaying in a second area data indicative of the frequencies of occurrence of particular nucleotides for the member groupings;
(c) retrieving from a database data indicative of gene structure;
(d) displaying in a third area a graphical representation of gene structure that identifies polymorphic sites on the gene;
(e) selecting one of the polymorphic sites to cause the appropriate nucleotide frequencies to be displayed in the second area.
-
-
35. A computer implemented method for generating a haplotype pair frequency screen for display on a display device, comprising the steps of:
-
(a) displaying in a first area a plurality of selectable items each corresponding to a polymorphic site for a predetermined gene;
(b) selecting one or more of said selectable items;
(c) displaying in a second area the haplotype pairs occurring in a reference population for the selected polymorphic sites;
(d) displaying in a third area data indicative of haplotype frequencies for a plurality of member groupings within the population.
-
-
36. A computer implemented method for generating a linkage screen for display on a display device, comprising the steps of:
-
(a) displaying in a first area a graphical scale showing a reference for determining progressive degrees of linkage between polymorphic sites in a population;
(b) displaying in a second area a graphical matrix structure having a plurality of grids, where each axis of the structure represents polymorphic sites on a gene; and
where each grid graphically displays an indication of degree of linkage between polymorphic sites corresponding to that grid, in accordance with the reference shown in the first area. - View Dependent Claims (37)
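Illustrative sketch (not part of the claims): the values behind the linkage matrix of claim 36 could be computed as a pairwise linkage statistic between sites. The claim does not name a statistic; this example assumes the squared correlation r^2 between two biallelic sites, computed from phased haplotype strings, and the function names are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """r^2 linkage disequilibrium between biallelic sites i and j.

    haplotypes: list of equal-length strings, one character per site.
    """
    n = len(haplotypes)
    a = haplotypes[0][i]  # arbitrary allele tracked at site i
    b = haplotypes[0][j]  # arbitrary allele tracked at site j
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # at least one site is monomorphic in this sample
    d = p_ab - p_a * p_b
    return d * d / denom


def linkage_matrix(haplotypes):
    """Square matrix of r^2 values, one row and column per polymorphic site."""
    m = len(haplotypes[0])
    return [[r_squared(haplotypes, i, j) for j in range(m)] for i in range(m)]
```

Each cell value could then be mapped onto the graphical scale shown in the first display area.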
-
-
38. A computer implemented method for generating a phylogenetic tree screen for display on a display device, comprising the steps of:
-
(a) displaying in a first area a plurality of selectable items each corresponding to a polymorphic site for a predetermined gene;
(b) selecting one or more of said selectable items;
(c) displaying in a second area a phylogenetic tree structure having nodes for each haplotype in a population, where the distance between nodes is indicative of the number of nucleotides that would have to be flipped to change one haplotype into another. - View Dependent Claims (39, 40)
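Illustrative sketch (not part of the claims): the node distance in step (c) of claim 38 is the number of sites at which two haplotypes differ. The claim does not say how the tree is constructed; this example assumes a minimum spanning tree built with Prim's algorithm, and the function names are hypothetical.

```python
def hamming(h1, h2):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def minimum_spanning_tree(haplotypes):
    """Prim's algorithm over distinct haplotypes.

    Returns a list of (haplotype_a, haplotype_b, distance) edges.
    """
    nodes = sorted(set(haplotypes))
    if not nodes:
        return []
    in_tree = [nodes[0]]
    remaining = nodes[1:]
    edges = []
    while remaining:
        # Cheapest edge joining the current tree to a node outside it.
        dist, a, b = min(
            (hamming(a, b), a, b) for a in in_tree for b in remaining
        )
        edges.append((a, b, dist))
        in_tree.append(b)
        remaining.remove(b)
    return edges
```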
-
-
41. A computer implemented method for generating a genotype analysis screen for display on a display device, comprising the steps of:
-
(a) displaying a first plurality of selectable items each corresponding to a polymorphic site, and a plurality of second selectable items each corresponding to a polymorphic site;
(b) displaying a graphical scale showing a reference for determining progressive degrees of haplotype identification reliability using genotyping;
(c) displaying a graphical matrix structure having a plurality of grids, where each axis represents a haplotype indicated by the first selectable items; and
where each grid graphically displays an indication of degree of identification reliability for identifying the haplotype corresponding to that grid using genotyping specified by the second selectable items, in accordance with the reference. - View Dependent Claims (42)
-
-
43. A method of displaying clinical response values of a subject population as a function of haplotype pairs of the individuals in the population, comprising:
-
(a) receiving from a computer-readable storage device, data representing haplotype pairs and clinical response values for the subject population;
(b) graphically displaying a haplotype pair matrix each of whose cells contains a graphical representation of the clinical response values of individuals having the haplotype pair corresponding to that cell of the haplotype pair matrix. - View Dependent Claims (45, 46, 47, 48, 49, 50, 51)
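Illustrative sketch (not part of the claims): the data behind the haplotype pair matrix of claim 43 can be pictured as a symmetric table indexed by haplotype on both axes. Summarising each cell by the mean response, the record layout, and the function name are assumptions of this example; rendering is left out.

```python
from collections import defaultdict
from statistics import mean


def response_matrix(records):
    """Build a symmetric haplotype pair matrix of mean clinical response values.

    records: iterable of (haplotype_1, haplotype_2, response_value).
    Returns (haplotype_labels, matrix); cells with no individuals hold None.
    """
    cells = defaultdict(list)
    for h1, h2, value in records:
        cells[tuple(sorted((h1, h2)))].append(value)
    haps = sorted({h for pair in cells for h in pair})
    matrix = [[None] * len(haps) for _ in haps]
    for (h1, h2), values in cells.items():
        i, j = haps.index(h1), haps.index(h2)
        matrix[i][j] = matrix[j][i] = mean(values)
    return haps, matrix
```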
-
-
44. A method of displaying clinical response values of a subject population as a function of haplotype pairs of the individuals in the population, comprising:
-
(a) displaying one or more first selectable items representing polymorphic sites for a predetermined gene, which when selected, will generate haplotype pairs;
(b) displaying a second selectable item representing a clinical response measurement;
which, when selected in conjunction with the first selectable items will cause display of a haplotype pair matrix, each of whose cells contains a graphical representation of the clinical response values for the selected clinical measurement of individuals having the haplotype pair corresponding to that cell of the haplotype pair matrix.
-
-
52. A computer-implemented method for carrying out a genetic algorithm for finding an optimal set of weights to fit a function of polymorphic site data to a clinical response measurement comprising:
-
(a) displaying a variable controller for setting the number of genetic algorithm generations parameter;
(b) displaying a variable controller for setting the number of agents parameter;
(c) displaying a variable controller for setting the mutation rate parameter;
(d) displaying a variable controller for setting the crossover rate parameter;
(e) displaying one or more selectable items each corresponding to a polymorphic site of a predetermined gene; and
(f) displaying a selectable item for initiation of the genetic algorithm calculation;
wherein selection of one or more selectable items corresponding to a polymorphic site, and selection of the item for initiation of the genetic algorithm calculation, results in the execution of the genetic algorithm calculation with the parameters set by the variable controllers, and the display of the residual error of the model as a function of the number of genetic algorithm generations and a display of the results of the genetic algorithm calculation showing the optimal weights for each of the polymorphic sites.
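Illustrative sketch (not part of the claims): a small genetic algorithm driven by the four parameters named in claim 52 (generations, agents, mutation rate, crossover rate). The linear model, the 0/1/2 allele-count encoding of sites, the elitist selection scheme, and all function names are assumptions of this example; the claim does not fix any of them. The returned history is the residual error per generation, which is the quantity the claim says is displayed.

```python
import random


def residual_error(weights, sites, responses):
    """Sum of squared differences between the weighted site sum and the response."""
    return sum(
        (sum(w * x for w, x in zip(weights, row)) - y) ** 2
        for row, y in zip(sites, responses)
    )


def run_genetic_algorithm(sites, responses, generations=50, agents=30,
                          mutation_rate=0.1, crossover_rate=0.7, seed=0):
    """Return (best_weights, best_error_per_generation). Requires agents >= 2."""
    rng = random.Random(seed)
    n = len(sites[0])
    population = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(agents)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: residual_error(w, sites, responses))
        history.append(residual_error(population[0], sites, responses))
        parents = population[: max(2, agents // 2)]
        children = [population[0][:]]  # elitism: carry the best agent over unchanged
        while len(children) < agents:
            mother, father = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                child = [rng.choice(pair) for pair in zip(mother, father)]
            else:
                child = mother[:]
            # Perturb each weight with probability mutation_rate.
            child = [
                w + rng.gauss(0, 0.2) if rng.random() < mutation_rate else w
                for w in child
            ]
            children.append(child)
        population = children
    best = min(population, key=lambda w: residual_error(w, sites, responses))
    return best, history
```

Because the best agent is always carried forward, the error history in this sketch never increases from one generation to the next.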
-
-
53. A computer-implemented method for displaying correlations between clinical outcome values for a selected population, comprising:
-
(a) displaying a first plurality of selectable items corresponding to the clinical outcome variables;
(b) displaying a second plurality of selectable items corresponding to the clinical outcome variables; and
(c) displaying a scatter plot of data points corresponding to the individuals in the selected population;
wherein selection of a first item from the first plurality of selectable items causes each data point to be plotted on the x axis of the scatter plot according to the value of the corresponding clinical outcome value for the individual associated with the data point, and wherein selection of a second item from the second plurality of selectable items causes each data point to be plotted on the y axis of the scatter plot according to the value of the corresponding clinical outcome value for the individual associated with the data point.
-
-
54. A method for conducting a clinical trial of a treatment protocol for a medical condition of interest, comprising:
-
(a) selecting one or more genes (or other loci) known or expected to be involved in a particular disease or drug response;
(b) defining a reference population of healthy individuals with a broad and representative genetic background;
(c) sequencing DNA from each member of the reference population;
(d) determining the haplotypes for each of the selected genes (or other loci) for each member of the reference population;
(e) determining the frequencies, population distributions and statistical measures, including confidence limits, for each of the determined haplotypes;
(f) recruiting a trial population of individuals who have the medical condition of interest;
(g) treating individuals in the trial population according to the treatment protocol, and measuring their response to treatment;
(h) determining the haplotypes for each of the selected genes (or other loci) for each member of the trial population;
(i) determining the correlations between individual responses to the treatment and individual haplotype content for each of the selected genes (or other loci); and
(j) from these correlations, constructing a model that predicts the response of an individual to the treatment, given the individual's haplotype content. - View Dependent Claims (55)
-
-
56. A method of inferring genotypes of individual subjects for a selected gene having at least m polymorphic sites, comprising:
(a) providing a database of m-site haplotypes of the selected gene from a representative cohort of individuals;
(b) tabulating the frequency of occurrence for each of the haplotypes;
(c) constructing a list of all genotypes that could result from all possible pairs of observed haplotypes;
(d) calculating the expected frequency of these genotypes assuming the Hardy-Weinberg equilibrium;
(e) generating a complete set of all possible masks of the same length m as the haplotypes, wherein each mask blocks the identity of the nucleotides at m-n polymorphic sites and admits the identity of nucleotides at the other n sites;
(f) for each mask, calculating how much ambiguity results from genotyping with only the n polymorphic sites whose identity is admitted by the mask;
(g) from among those masks having an acceptable level of ambiguity, selecting a mask which has the lowest value of n;
(h) genotyping the subjects by measuring only the n polymorphic sites that are admitted by the selected mask; and
(i) assigning to each subject having a particular n-site haplotype, the full m-site haplotype of a member of the initial cohort having the same n-site haplotype. - View Dependent Claims (57, 58)
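Illustrative sketch (not part of the claims): steps (c) through (g) of claim 56 can be pictured as scoring every subset of sites by how much genotype ambiguity remains when only those sites are measured. The claim does not define the ambiguity measure; this example assumes it is the Hardy-Weinberg probability mass that does not belong to the most probable haplotype pair for each reduced genotype. Function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def hwe_pair_frequencies(hap_freq):
    """Expected frequency of each unordered haplotype pair under Hardy-Weinberg."""
    pairs = {}
    for h1, h2 in combinations_with_replacement(sorted(hap_freq), 2):
        f = hap_freq[h1] * hap_freq[h2]
        pairs[(h1, h2)] = f if h1 == h2 else 2 * f
    return pairs


def masked_genotype(pair, kept):
    """Unphased genotype of a haplotype pair at the site indices in `kept`."""
    h1, h2 = pair
    return tuple("".join(sorted((h1[i], h2[i]))) for i in kept)


def ambiguity(pair_freq, kept):
    """Assumed measure: probability mass not explained by the likeliest pair."""
    groups = {}
    for pair, f in pair_freq.items():
        groups.setdefault(masked_genotype(pair, kept), []).append(f)
    return sum(sum(fs) - max(fs) for fs in groups.values())


def smallest_acceptable_mask(hap_freq, max_ambiguity):
    """Return (kept_site_indices, ambiguity) for the fewest sites that suffice."""
    m = len(next(iter(hap_freq)))
    pair_freq = hwe_pair_frequencies(hap_freq)
    for n in range(1, m + 1):
        score, kept = min(
            (ambiguity(pair_freq, kept), kept)
            for kept in combinations(range(m), n)
        )
        if score <= max_ambiguity:
            return kept, score
    return None
```

With haplotype frequencies `{"ACT": 0.5, "GCT": 0.3, "GTT": 0.2}`, the third site is monomorphic, so measuring the first two sites already resolves every pair in this sketch.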
-
-
59. A method of determining polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, comprising:
-
(a) providing haplotype information, and clinical response or outcome data (clinical outcome values) from a cohort of subjects;
(b) statistically analyzing each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome values, and generating a numerical measure of the degree of correlation;
(c) saving for further processing those individual SNPs whose numerical measure of the degree of correlation with the clinical outcome values exceeds a first cut-off value;
(d) generating all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site sub-haplotypes where n=2;
(e) statistically analyzing each newly generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values and calculating a numerical measure of the degree of correlation;
(f) saving for further processing those n-site sub-haplotypes whose numerical measure of the degree of correlation with the clinical outcome values exceeds the first cut-off value;
(g) generating all possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new subhaplotypes with increased values of n;
(h) repeating steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected limit can be generated. - View Dependent Claims (60, 61, 62, 63)
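Illustrative sketch (not part of the claims): the bottom-up search of claim 59 can be pictured as growing sets of site indices, keeping only those whose correlation score exceeds the cut-off. The `score` callable stands in for whatever statistical measure is used and is supplied by the caller; it, the function name, and the set-of-indices representation are assumptions of this example.

```python
from itertools import combinations


def grow_sub_haplotypes(n_sites, score, cutoff, max_sites):
    """Build sub-haplotypes (frozensets of site indices) from well-scoring SNPs.

    score: hypothetical callable mapping a frozenset of site indices to a
    numerical measure of correlation with the clinical outcome values.
    """
    singles = [frozenset([i]) for i in range(n_sites)]
    saved = {s for s in singles if score(s) > cutoff}
    tested = set(singles)
    while True:
        # Pair-wise combinations among saved SNPs and saved sub-haplotypes.
        candidates = {a | b for a, b in combinations(saved, 2)} - tested
        candidates = {c for c in candidates if len(c) <= max_sites}
        if not candidates:
            return saved  # nothing new can be generated within the size limit
        tested |= candidates
        saved |= {c for c in candidates if score(c) > cutoff}
```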
-
-
64. A method of determining polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, comprising:
-
(a) providing single gene haplotype information for one or more genes, and clinical response or outcome data, from a cohort of subjects;
(b) statistically analyzing each single gene haplotype for the degree to which it correlates with the clinical response or outcome of interest, and calculating a numerical measure of the degree of correlation;
(c) saving for further processing those haplotypes whose numerical measure of the degree of correlation with the clinical response or outcome of interest exceeds a first selected value;
(d) for each haplotype composed of m polymorphic sites, generating all possible sub-haplotypes having a single site masked, so as to provide a set of sub-haplotypes having (m−n) sites, where n=1;
(e) statistically analyzing each newly generated sub-haplotype for the degree to which it correlates with the clinical response or outcome of interest, and calculating a numerical measure of the degree of correlation;
(f) saving for further processing those sub-haplotypes whose numerical measure of the degree of correlation with the clinical response or outcome of interest exceeds the first selected value;
(g) from the saved sub-haplotypes, generating all possible sub-haplotypes having one additional site masked;
(h) repeating steps (e) through (g) until either (i) no new sub-haplotypes have a degree of correlation which exceeds the first selected value, or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. - View Dependent Claims (65, 66, 67, 68)
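Illustrative sketch (not part of the claims): the top-down search of claim 64 can be pictured as masking one more site at a time and keeping the sub-haplotypes that still score above the selected value. Representing a masked site as `*`, the caller-supplied `score` callable, the `min_unmasked` parameter, and the function names are assumptions of this example.

```python
def mask_one_more(sub_haplotype):
    """All variants of a sub-haplotype with one additional site masked as '*'."""
    return {
        sub_haplotype[:i] + "*" + sub_haplotype[i + 1:]
        for i, allele in enumerate(sub_haplotype)
        if allele != "*"
    }


def prune_haplotypes(haplotypes, score, cutoff, min_unmasked):
    """Keep haplotypes and masked sub-haplotypes whose score exceeds the cutoff.

    score: hypothetical callable mapping a (sub-)haplotype string to a
    numerical measure of correlation with the clinical response.
    """
    saved = {h for h in haplotypes if score(h) > cutoff}
    current = set(saved)
    while current:
        next_round = set()
        for h in current:
            for sub in mask_one_more(h):
                unmasked = len(sub) - sub.count("*")
                if (unmasked >= min_unmasked and sub not in saved
                        and score(sub) > cutoff):
                    next_round.add(sub)
        saved |= next_round
        current = next_round
    return saved
```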
-
-
69. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to adjust observed haplotype pair frequencies within a population group, said haplotype pair frequencies being stored in a computer-readable database of haplotype information for a gene or gene feature of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access said database and generate all possible haplotype pairs consistent with the stored genotypes;
(b) computer-readable program code for causing a computer to calculate the expected frequency of the generated haplotypes and haplotype pairs according to the Hardy-Weinberg equilibrium, based upon the observed distribution of haplotypes or haplotype pairs in the population; and
(c) computer-readable program code for causing a computer to select the most probable haplotype pair for the individual based on the observed. - View Dependent Claims (70, 71, 72)
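Illustrative sketch (not part of the claims): the Hardy-Weinberg calculation named in step (b) of claim 69 derives expected haplotype pair frequencies from observed haplotype frequencies, using p^2 for a homozygous pair and 2pq for a heterozygous pair. The function name and the input format (a list of observed haplotype pairs) are assumptions of this example.

```python
from collections import Counter


def hardy_weinberg_expected(observed_pairs):
    """Return (haplotype_frequencies, expected_pair_frequencies).

    observed_pairs: list of (haplotype_1, haplotype_2) tuples, one per individual.
    """
    hap_counts = Counter(h for pair in observed_pairs for h in pair)
    total = sum(hap_counts.values())
    freq = {h: count / total for h, count in hap_counts.items()}
    expected = {}
    haps = sorted(freq)
    for i, h1 in enumerate(haps):
        for h2 in haps[i:]:
            if h1 == h2:
                expected[(h1, h2)] = freq[h1] ** 2           # homozygous: p^2
            else:
                expected[(h1, h2)] = 2 * freq[h1] * freq[h2]  # heterozygous: 2pq
    return freq, expected
```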
-
-
73. A computer-usable medium having computer-readable program code stored thereon, for causing haplotype pair assignments to be made to an individual member of a population whose genotype information for a gene or gene feature of interest is stored in a computer-readable form, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to generate all possible haplotype pairs consistent with the stored genotype;
(b) computer-readable program code for causing a computer to access a database containing reference haplotype pair frequency data and to determine from the frequency data the probability, for each of the possible haplotype pairs, that the individual has the possible haplotype pair; and
(c) computer-readable program code for causing a computer to select the most probable haplotype pair for the individual.
-
-
74. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to identify a correlation between a clinical response to a treatment or other phenotype and a haplotype or haplotype pair present at a candidate locus hypothesized to be associated with the clinical response or other phenotype, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database containing data on clinical responses to treatments, or other phenotypes, exhibited by individuals in a clinical population;
(b) computer-readable program code for causing a computer to access a database containing haplotype data for each individual of the clinical population, the haplotype data comprising information on a plurality of polymorphic sites present at the candidate locus; and
(c) computer-readable program code for causing a computer to calculate the degree of correlation between haplotype pairs and the clinical response to the treatment or other phenotype, by statistical analysis of the haplotype and clinical response data. - View Dependent Claims (75, 76, 77, 78)
-
-
79. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to identify a correlation between an individual's susceptibility to a condition or disease of interest, or other phenotype, and a haplotype or haplotype pair present at a candidate locus hypothesized to be associated with susceptibility to the condition or disease of interest, or with a phenotype of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access haplotype data for the candidate locus for each member of a population having the phenotype or condition or disease of interest (“disease haplotype data”);
(b) computer-readable program code for causing a computer to statistically analyze the disease haplotype data to calculate haplotype or haplotype pair frequencies;
(c) computer-readable program code for causing a computer to access a database containing haplotype data for the candidate locus for each member of a healthy reference population (“reference haplotype data”);
(d) computer-readable program code for causing a computer to statistically analyze the reference haplotype data to calculate haplotype or haplotype pair frequencies; and
(e) computer-readable program code for causing a computer to identify a correlation of a haplotype or haplotype pair with susceptibility to the disease or condition of interest, or with the phenotype of interest, when the haplotype or haplotype pair has a higher frequency in the population having the phenotype, condition or disease of interest than in the reference population. - View Dependent Claims (80, 81, 82)
-
-
83. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to predict an individual's response to a medical or pharmaceutical treatment based on one or more selected haplotypes or haplotype pairs of the individual, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database of correlations between haplotypes or haplotype pairs and responses to the medical or pharmaceutical treatment in a reference population;
(b) computer-readable program code for causing a computer to locate haplotypes or haplotype pairs in the database that match the selected haplotype pairs of the individual; and
(c) computer-readable program code for causing a computer to predict that the individual's response will be the response or responses associated in the database with the selected haplotype or haplotype pair. - View Dependent Claims (84)
-
-
85. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display a gene's structure and gene features on a display device, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to retrieve from a database, and display in a first area of the display device, data indicative of the frequencies of occurrence of a gene's haplotypes within predetermined member groupings of a reference population;
(b) computer-readable program code for causing a computer to retrieve from a database data indicative of the gene's structure and gene features;
(c) computer-readable program code for causing a computer to display in a second area of the display device a graphical representation of the gene's structure, user-selectable items indicating the location of gene features, and graphical indicators of the location of polymorphic sites on the gene;
(d) computer-readable program code for causing a computer to display in a third area of the display device, in response to a user's selection of an item indicating a gene feature, a graphical representation of the structure of the gene feature having user-selectable items indicating the position of polymorphic sites; and
(e) computer-readable program code for causing a computer to retrieve from a database, and display in a third area of the display device, in response to a user's selection of an item indicating the position of a polymorphic site, data indicative of the frequencies within the member groupings of the occurrence of particular nucleotides at the polymorphic site.
-
-
86. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display on a display device haplotype pair frequency data within a population of individuals, for a selected gene or gene feature, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display on the display device a plurality of selectable items, each item corresponding to a polymorphic site in the gene or gene feature;
(c) computer-readable program code for causing a computer to retrieve from a database and display on the display device, in response to a user's selection of one or more items indicating polymorphic sites, individual haplotype pairs in the database that differ at one or more of the selected polymorphic sites; and
(d) computer-readable program code for causing a computer to display on the display device data indicative of the frequencies of the displayed haplotype pairs within one or more member groupings within the population.
-
-
87. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display on a display device polymorphic site linkage data for a gene or gene structure of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display on the display device one or more matrix structures, wherein the axes of each matrix structure represent the polymorphic sites in the gene or gene feature of interest, and wherein each matrix structure corresponds to a different population or population group; and
(b) computer-readable program code for causing a computer to display on the display device, in each cell of a matrix structure, a graphical indication of degree of linkage between the two polymorphic sites corresponding to the coordinates of the cell in the matrix. - View Dependent Claims (88)
-
-
89. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display on a display device a phylogenetic tree, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display a plurality of selectable items, each corresponding to a polymorphic site in the gene or gene feature of interest; and
(b) computer-readable program code for causing a computer to display a phylogenetic tree structure having a node for each haplotype in a population, where the distance between nodes is proportional to the minimum number of nucleotides that would have to be changed to interconvert the corresponding haplotypes. - View Dependent Claims (90, 91)
-
-
92. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display a genotype analysis screen on a display device, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display a first plurality of selectable items, each corresponding to a polymorphic site, and a second plurality of selectable items, each corresponding to a polymorphic site;
(b) computer-readable program code for causing a computer to display on the display device a matrix structure, wherein the axes of the matrix structure represent haplotypes in the gene or gene feature of interest that vary at the polymorphic sites selected from the first plurality of selectable items; and
(c) computer-readable program code for causing a computer to display on the display device, in each cell of the matrix structure, a graphical indication of the reliability of the assignment to an individual of the haplotype pair corresponding to the coordinates of the cell in the matrix, when the individual is genotyped only at the polymorphic sites selected from the second plurality of selectable items. - View Dependent Claims (93)
-
-
94. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display clinical response values, or other phenotype data, of a subject population as a function of haplotype pairs of the individuals in the population, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to retrieve from a computer-readable storage device, data representing haplotype pairs and clinical response values, or other phenotype data, for the subject population; and
(b) computer-readable program code for causing a computer to graphically display a haplotype pair matrix structure, each of whose cells contains a graphical representation of the clinical response values or other phenotype data of individuals having the haplotype pair corresponding to the coordinates of that cell in the haplotype pair matrix. - View Dependent Claims (96, 97, 98, 99, 100, 101, 102)
-
-
95. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display on a display device clinical response values, or other phenotypic data, of a subject population as a function of the haplotype pairs of the individuals in the population for a gene or gene feature of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display one or more first selectable items representing polymorphic sites of the gene or gene feature;
(b) computer-readable program code for causing a computer to display one or more second selectable items representing clinical measurements or phenotypes; and
(c) computer-readable program code for causing a computer to display on the display device, in response to the selection by the user of at least one first and second selectable items, a haplotype pair matrix structure, wherein the axes of the matrix structure represent haplotypes in the gene or gene feature of interest that vary at the polymorphic sites corresponding to the first selected item or items, and wherein each of the cells of the matrix contains a graphical representation of the mean clinical response value, or other phenotype data, for the clinical measurement represented by the selected second item, of individuals having the haplotype pair corresponding to the coordinates of the cell in the haplotype pair matrix.
-
-
103. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to carry out a genetic algorithm for finding an optimal set of weights to fit a function of polymorphic site data for a gene or gene feature of interest to a clinical response measurement, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display a variable controller for setting the number of genetic algorithm generations parameter;
(b) computer-readable program code for causing a computer to display a variable controller for setting the number of agents parameter;
(c) computer-readable program code for causing a computer to display a variable controller for setting the mutation rate parameter;
(d) computer-readable program code for causing a computer to display a variable controller for setting the crossover rate parameter;
(e) computer-readable program code for causing a computer to display one or more selectable items each corresponding to a polymorphic site of the gene or gene feature of interest; and
(f) computer-readable program code for causing a computer to display a selectable item for initiation of the genetic algorithm calculation; and
(g) computer-readable program code for causing a computer, in response to the selection by the user of one or more selectable items corresponding to a polymorphic site, and selection by the user of the item for initiation of the genetic algorithm calculation, to execute the genetic algorithm calculation with the parameters set by the variable controllers, and to display on a display device (i) the residual error of the model as a function of the number of genetic algorithm generations, and (ii) the results of the genetic algorithm calculation showing the optimal weights for each of the polymorphic sites.
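Illustrative sketch (not part of the claim): the four parameters named in elements (a) through (d) map naturally onto the arguments of a small genetic algorithm. The claim does not state the form of the fitted function, so this example assumes a linear model (response approximated by a weighted sum of per-site allele counts) and a sum-of-squares residual; the elitism step, the function names, and the default values are likewise assumptions.

```python
import random


def residual_error(weights, site_data, responses):
    """Sum of squared residuals for an assumed linear model."""
    total = 0.0
    for row, observed in zip(site_data, responses):
        predicted = sum(w * x for w, x in zip(weights, row))
        total += (observed - predicted) ** 2
    return total


def run_genetic_algorithm(site_data, responses, generations=50, agents=30,
                          mutation_rate=0.1, crossover_rate=0.7, seed=0):
    """Return (best_weights, history); history holds the best error per generation."""
    rng = random.Random(seed)
    n_sites = len(site_data[0])
    population = [[rng.uniform(-1, 1) for _ in range(n_sites)]
                  for _ in range(agents)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: residual_error(w, site_data, responses))
        history.append(residual_error(population[0], site_data, responses))
        survivors = population[: max(2, agents // 2)]
        children = [list(population[0])]  # elitism: carry the best agent over
        while len(children) < agents:
            mother, father = rng.sample(survivors, 2)
            if rng.random() < crossover_rate:
                # Uniform crossover: each weight comes from either parent.
                child = [rng.choice(pair) for pair in zip(mother, father)]
            else:
                child = list(mother)
            # Mutation: perturb each weight with probability mutation_rate.
            child = [w + rng.gauss(0, 0.5) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = children
    best = min(population, key=lambda w: residual_error(w, site_data, responses))
    history.append(residual_error(best, site_data, responses))
    return best, history
```

The returned `history` corresponds to display item (i), the residual error as a function of generation, and `best` to display item (ii), one weight per selected polymorphic site.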
-
-
104. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to display on a display device correlations between clinical outcome values obtained from selected clinical outcome measures for a selected population, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to display a first plurality of selectable items corresponding to clinical outcome measurements;
(b) computer-readable program code for causing a computer to display a second plurality of selectable items corresponding to clinical outcome measurements; and
(c) computer-readable program code for causing a computer to display a scatter plot of data points, each data point corresponding to an individual in the selected population;
(d) computer-readable program code for causing a computer, in response to selection by the user of an item from among the first plurality of selectable items, to locate each data point along the x axis of the scatter plot according to the clinical outcome value for the associated individual from the clinical measurement represented by the selected item; and
(e) computer-readable program code for causing the computer, in response to selection by the user of an item from among the second plurality of selectable items, to locate each data point along the y axis of the scatter plot according to the clinical outcome value for the associated individual from the clinical measurement represented by the selected item.
-
-
105. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to provide information of use in conducting a clinical trial of a treatment protocol for a medical condition of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database of DNA sequence data for selected genes or other loci in a reference population of individuals, and to access a database of (or accept as input) DNA sequence data for selected genes or other loci in a clinical trial population of individuals;
(b) computer-readable program code for causing a computer to assign to each member of the reference population haplotypes for each of the selected genes or other loci;
(c) computer-readable program code for causing a computer to calculate the frequencies, population distributions and statistical measures, including confidence limits, for each of the assigned haplotypes in the reference population;
(d) computer-readable program code for causing a computer to assign to each member of a trial population haplotypes for each of the selected genes or other loci, based upon the frequencies, population distributions and statistical measures calculated in the reference population;
(e) computer-readable program code for causing a computer to determine the correlations between individual responses to the treatment and individual haplotypes, for each of the selected genes or other loci;
(f) computer-readable program code for causing a computer to accept as input an individual's DNA sequence data or haplotypes for one or more of the selected genes or other loci; and
(g) computer-readable program code for causing a computer to display or output the expected response of the individual to the treatment, based on the determined correlations between individual responses to the treatment and individual haplotypes. - View Dependent Claims (106)
-
-
107. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to infer genotypes of individual subjects for a selected gene having at least m polymorphic sites, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database of m-site haplotypes of the selected gene from a representative cohort of individuals;
(b) computer-readable program code for causing a computer to tabulate the frequency of occurrence for each of the haplotypes;
(c) computer-readable program code for causing a computer to construct a list of all genotypes that could result from all possible pairs of observed haplotypes;
(d) computer-readable program code for causing a computer to calculate the expected frequency of these genotypes assuming the Hardy-Weinberg equilibrium;
(e) computer-readable program code for causing a computer to generate a complete set of all possible masks of the same length m as the haplotypes, wherein each mask blocks the identity of the nucleotides at m-n polymorphic sites and admits the identity of nucleotides at the other n sites;
(f) computer-readable program code for causing a computer to calculate, for each mask, how much ambiguity results from genotyping with only the n polymorphic sites whose identity is admitted by the mask;
(g) computer-readable program code for causing a computer to output or display on a display device the calculated ambiguity for one or more masks. - View Dependent Claims (108, 109)
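Illustrative sketch (not part of the claim): elements (b) through (f) can be followed step by step on a toy haplotype list. The claim does not define how ambiguity is measured; this example assumes it is the expected fraction of individuals whose full genotype would be mis-inferred if, for each masked genotype, the most frequent compatible full genotype were chosen. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations_with_replacement, product


def genotype_frequencies(haplotypes):
    """Steps (b)-(d): haplotype frequencies, then Hardy-Weinberg genotype frequencies."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    freq = {hap: count / total for hap, count in counts.items()}
    genotypes = defaultdict(float)
    for h1, h2 in combinations_with_replacement(sorted(freq), 2):
        # A genotype is the unordered pair of alleles at each site.
        genotype = tuple(tuple(sorted(pair)) for pair in zip(h1, h2))
        weight = 1 if h1 == h2 else 2  # p^2 for homozygotes, 2pq otherwise
        genotypes[genotype] += weight * freq[h1] * freq[h2]
    return dict(genotypes)


def mask_ambiguity(genotypes, mask):
    """Step (f), under the assumed ambiguity measure described above."""
    groups = defaultdict(list)
    for genotype, frequency in genotypes.items():
        visible = tuple(site for site, keep in zip(genotype, mask) if keep)
        groups[visible].append(frequency)
    return 1.0 - sum(max(freqs) for freqs in groups.values())


def ambiguity_by_mask(haplotypes):
    """Step (e): enumerate every mask of length m (True = site is genotyped)."""
    m = len(haplotypes[0])
    genotypes = genotype_frequencies(haplotypes)
    return {mask: mask_ambiguity(genotypes, mask)
            for mask in product((True, False), repeat=m)}


ambiguity = ambiguity_by_mask(["AC", "AC", "GT", "GT"])
```

In this toy cohort the two sites are perfectly linked, so genotyping either site alone leaves no ambiguity under the assumed measure, while genotyping neither site leaves half of the population mis-inferred.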
-
-
110. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome values) or other phenotype data, from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and to generate a numerical measure of the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those individual SNPs whose numerical measure of the degree of correlation with the clinical outcome values or other phenotype data exceeds a first cut-off value;
(d) computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site sub-haplotypes where n=2;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate a numerical measure of the degree of correlation;
(f) computer-readable program code for causing a computer to store for further processing those n-site sub-haplotypes whose numerical measure of the degree of correlation exceeds the first cut-off value;
(g) computer-readable program code for causing a computer to generate all possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new subhaplotypes with increased values of n;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected or user-selected limit can be generated. - View Dependent Claims (111, 113, 114)
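Illustrative sketch (not part of the claim): the build-up procedure of elements (b) through (h) starts from single SNPs that pass the cut-off and repeatedly merges saved items into larger sub-haplotypes. The claim leaves the correlation measure open; this example assumes the absolute Pearson correlation between a 0/1 carrier indicator and the outcome value. The record layout `(haplotype_string, outcome)` and all names are hypothetical.

```python
from itertools import combinations


def correlation(sub_hap, records):
    """|Pearson r| between carrying sub_hap (0/1) and the outcome (assumed measure)."""
    xs = [1.0 if all(hap[site] == allele for site, allele in sub_hap) else 0.0
          for hap, _ in records]
    ys = [outcome for _, outcome in records]
    n = len(records)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable correlation
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return abs(sxy) / (sxx * syy) ** 0.5


def build_sub_haplotypes(records, cutoff, max_sites):
    """Return the saved SNPs and sub-haplotypes as frozensets of (site, allele)."""
    n_sites = len(records[0][0])
    snps = {(site, hap[site]) for hap, _ in records for site in range(n_sites)}
    # Steps (b)-(c): keep single SNPs whose correlation exceeds the cut-off.
    saved = {frozenset([snp]) for snp in snps
             if correlation(frozenset([snp]), records) > cutoff}
    frontier = set(saved)
    while frontier:
        candidates = set()
        # Steps (d) and (g): pair-wise combinations among all saved items.
        for a, b in combinations(saved, 2):
            merged = a | b
            sites = [site for site, _ in merged]
            if merged in saved or len(sites) != len(set(sites)) or len(sites) > max_sites:
                continue  # already saved, conflicting alleles, or too many sites
            candidates.add(merged)
        # Steps (e)-(f): keep the new sub-haplotypes that pass the cut-off.
        frontier = {c for c in candidates if correlation(c, records) > cutoff}
        saved |= frontier
    return saved
```

The loop ends when a round produces no new sub-haplotype above the cut-off, which covers stopping condition (i); `max_sites` plays the role of the pre-selected limit in condition (ii).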
-
-
112. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome values) or other phenotype data, from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate the p-value for the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those individual SNPs whose p-value for the degree of correlation does not exceed a first cut-off value;
(d) computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site sub-haplotypes where n=2;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate the p-value for the degree of correlation;
(f) computer-readable program code for causing a computer to store for further processing those n-site sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off value;
(g) computer-readable program code for causing a computer to generate all possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new subhaplotypes with increased values of n;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected or user-selected limit can be generated.
-
-
115. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database containing single gene haplotype information for one or more genes, and clinical response, outcome data, or other phenotype data from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each single gene haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to generate a numerical measure of the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those haplotypes whose numerical measure of the degree of correlation exceeds a first cut-off value;
(d) computer-readable program code for causing a computer to generate, for each haplotype composed of m polymorphic sites, all possible sub-haplotypes having a single site masked, so as to provide a set of m-n site sub-haplotypes where n=1;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated sub-haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to calculate a numerical measure of the degree of correlation;
(f) computer-readable program code for causing a computer to save for further processing those sub-haplotypes whose numerical measure of the degree of correlation exceeds the first cut-off value;
(g) computer-readable program code for causing a computer to generate, from the saved sub-haplotypes, all possible sub-haplotypes having one additional site masked;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes have a degree of correlation which exceeds the first cut-off value, or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. - View Dependent Claims (116, 119)
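Illustrative sketch (not part of the claim): the pare-down procedure of elements (b) through (h) runs in the opposite direction, starting from full haplotypes and masking one more site per round. Sub-haplotypes are written here as strings with `*` at masked sites. The score used (absolute difference in mean outcome between carriers and non-carriers) is an assumed stand-in for the unspecified correlation measure, and all names are hypothetical.

```python
def matches(pattern, haplotype):
    """True if the haplotype agrees with the pattern at every unmasked site."""
    return all(p == "*" or p == h for p, h in zip(pattern, haplotype))


def mean_difference(pattern, records):
    """Assumed score: |mean outcome of carriers - mean outcome of non-carriers|."""
    carriers = [y for hap, y in records if matches(pattern, hap)]
    others = [y for hap, y in records if not matches(pattern, hap)]
    if not carriers or not others:
        return 0.0
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def mask_down(records, cutoff, unmasked_limit=0, score=mean_difference):
    """Return every haplotype or sub-haplotype whose score exceeds the cut-off."""
    # Steps (b)-(c): score the full haplotypes and keep those above the cut-off.
    frontier = {hap for hap, _ in records if score(hap, records) > cutoff}
    results = set(frontier)
    while frontier:
        candidates = set()
        # Steps (d) and (g): mask one additional site of each saved pattern.
        for pattern in frontier:
            for i, ch in enumerate(pattern):
                if ch == "*":
                    continue
                masked = pattern[:i] + "*" + pattern[i + 1:]
                if sum(c != "*" for c in masked) > unmasked_limit:
                    candidates.add(masked)
        # Steps (e)-(f): keep the new patterns that pass the cut-off.
        frontier = {c for c in candidates - results if score(c, records) > cutoff}
        results |= frontier
    return results
```

With `unmasked_limit=0` the search never produces a fully masked pattern; raising it stops the search earlier, in the spirit of stopping condition (ii).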
-
-
117. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype of interest, the computer-readable program code comprising:
-
(a) computer-readable program code for causing a computer to access a database containing single gene haplotype information for one or more genes, and clinical response, outcome data, or other phenotype data from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each single gene haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to calculate the p-value for the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those haplotypes whose p-value for the degree of correlation does not exceed a first cut-off value;
(d) computer-readable program code for causing a computer to generate, for each haplotype composed of m polymorphic sites, all possible sub-haplotypes having a single site masked, so as to provide a set of m-n site sub-haplotypes where n=1;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated sub-haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to calculate the p-value for the degree of correlation;
(f) computer-readable program code for causing a computer to save for further processing those sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off value;
(g) computer-readable program code for causing a computer to generate, from the saved sub-haplotypes, all possible sub-haplotypes having one additional site masked;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes have a p-value which does not exceed the first cut-off value, or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. - View Dependent Claims (118)
-
-
120. A computer programmed to cause haplotype pair assignments to be made to an individual member of a population whose genotype information for a gene or gene feature of interest is stored in a computer-readable form, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
computer-readable program code for causing a computer to generate all possible haplotype pairs consistent with the stored genotype;
computer-readable program code for causing a computer to calculate the frequency of the haplotypes and haplotype pairs according to the Hardy-Weinberg equilibrium, based upon the observed distribution of haplotypes or haplotype pairs in the population; and
computer-readable program code for causing a computer to select the most probable haplotype pair for the individual. - View Dependent Claims (121, 122, 123)
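Illustrative sketch (not part of the claim): the three program-code elements can be traced on a two-site example. This version assumes the candidate pairs are drawn from haplotypes already observed in the population, and that `haplotype_freqs` maps each observed haplotype string to its population frequency; both the names and that restriction are assumptions of the example.

```python
from itertools import combinations_with_replacement


def genotype_of(h1, h2):
    """Unphased genotype: the unordered allele pair at each site."""
    return tuple(tuple(sorted(pair)) for pair in zip(h1, h2))


def most_probable_pair(genotype, haplotype_freqs):
    """Return (pair, probability) for the likeliest pair consistent with genotype."""
    best_pair, best_prob = None, 0.0
    for h1, h2 in combinations_with_replacement(sorted(haplotype_freqs), 2):
        if genotype_of(h1, h2) != genotype:
            continue  # this pair cannot produce the stored genotype
        # Hardy-Weinberg: p^2 for identical haplotypes, 2pq for distinct ones.
        prob = haplotype_freqs[h1] * haplotype_freqs[h2]
        if h1 != h2:
            prob *= 2
        if prob > best_prob:
            best_pair, best_prob = (h1, h2), prob
    return best_pair, best_prob


freqs = {"AC": 0.6, "GT": 0.3, "AT": 0.05, "GC": 0.05}
double_het = (("A", "G"), ("C", "T"))
pair, prob = most_probable_pair(double_het, freqs)
```

A doubly heterozygous genotype is the smallest ambiguous case: both AC/GT and AT/GC are consistent with it, and the frequency weighting favours AC/GT here.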
-
-
124. A computer programmed to cause haplotype pair assignments to be made to an individual member of a population whose genotype information for a gene or gene feature of interest is stored in a computer-readable form, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
computer-readable program code for causing a computer to generate all possible haplotype pairs consistent with the stored genotype;
computer-readable program code for causing a computer to access a database containing reference haplotype pair frequency data and to determine from the frequency data the probability, for each of the possible haplotype pairs, that the individual has the possible haplotype pair; and
computer-readable program code for causing a computer to select the most probable haplotype pair for the individual.
-
-
125. A computer programmed to identify a correlation between a clinical response to a treatment or other phenotype and a haplotype or haplotype pair present at a candidate locus hypothesized to be associated with the clinical response or other phenotype, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database containing data on clinical responses to treatments, or other phenotypes, exhibited by individuals in a clinical population;
(b) computer-readable program code for causing a computer to access a database containing haplotype data for each individual of the clinical population, the haplotype data comprising information on a plurality of polymorphic sites present at the candidate locus; and
(c) computer-readable program code for causing a computer to calculate the degree of correlation between haplotypes or haplotype pairs and the clinical response to the treatment or other phenotype, by statistical analysis of the haplotype and clinical response data. - View Dependent Claims (126, 127, 128, 129)
-
-
130. A computer programmed to identify a correlation between an individual's susceptibility to a condition or disease of interest, or other phenotype, and a haplotype or haplotype pair present at a candidate locus hypothesized to be associated with susceptibility to the condition or disease of interest, or with a phenotype of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access haplotype data for the candidate locus for each member of a population having the phenotype or condition or disease of interest (“disease haplotype data”);
(b) computer-readable program code for causing a computer to statistically analyze the disease haplotype data to calculate haplotype or haplotype pair frequencies;
(c) computer-readable program code for causing a computer to access a database containing haplotype data for the candidate locus for each member of a healthy reference population (“reference haplotype data”);
(d) computer-readable program code for causing a computer to statistically analyze the reference haplotype data to calculate haplotype or haplotype pair frequencies; and
(e) computer-readable program code for causing a computer to identify a correlation of a haplotype or haplotype pair with susceptibility to the disease or condition of interest, or with the phenotype of interest, when the haplotype or haplotype pair has a higher frequency in the population having the phenotype, condition or disease of interest than in the reference population. - View Dependent Claims (131, 132, 133)
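Illustrative sketch (not part of the claim): elements (b), (d) and (e) amount to tabulating haplotype frequencies in the two populations and flagging those that are more frequent among affected individuals. The function names and the flat list-of-haplotypes input are assumptions; the comparison below is a raw frequency comparison, with no significance test applied.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype in a list of observed haplotypes."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: count / total for hap, count in counts.items()}


def enriched_haplotypes(disease_haps, reference_haps):
    """Map each haplotype that is more frequent in the disease population
    to its (disease frequency, reference frequency) pair."""
    disease = haplotype_frequencies(disease_haps)
    reference = haplotype_frequencies(reference_haps)
    return {hap: (freq, reference.get(hap, 0.0))
            for hap, freq in disease.items()
            if freq > reference.get(hap, 0.0)}


flagged = enriched_haplotypes(["AC"] * 6 + ["GT"] * 4, ["AC"] * 3 + ["GT"] * 7)
```

In practice a difference this size in samples this small could easily arise by chance, so a real analysis would pair the comparison with a statistical test.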
-
-
134. A computer programmed to predict an individual's response to a medical or pharmaceutical treatment based on one or more selected haplotypes or haplotype pairs of the individual, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database of correlations between haplotypes or haplotype pairs and responses to the medical or pharmaceutical treatment in a reference population;
(b) computer-readable program code for causing a computer to locate haplotypes or haplotype pairs in the database that match the selected haplotypes or haplotype pairs of the individual; and
(c) computer-readable program code for causing a computer to predict that the individual's response will be the response or responses associated in the database with the selected haplotype or haplotype pair. - View Dependent Claims (135)
-
-
136. A computer programmed to display a gene's structure and gene features on a display device, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to retrieve from a database, and display in a first area of the display device, data indicative of the frequencies of occurrence of a gene's haplotypes within predetermined member groupings of a reference population;
(b) computer-readable program code for causing a computer to retrieve from a database data indicative of the gene's structure and gene features;
(c) computer-readable program code for causing a computer to display in a second area of the display device a graphical representation of the gene's structure, user-selectable items indicating the location of gene features, and graphical indicators of the location of polymorphic sites on the gene;
(d) computer-readable program code for causing a computer to display in a third area of the display device, in response to a user's selection of an item indicating a gene feature, a graphical representation of the structure of the gene feature having user-selectable items indicating the position of polymorphic sites; and
(e) computer-readable program code for causing a computer to retrieve from a database, and display in a third area of the display device, in response to a user's selection of an item indicating the position of a polymorphic site, data indicative of the frequencies within the member groupings of the occurrence of particular nucleotides at the polymorphic site.
-
-
137. A computer programmed to display on a display device haplotype pair frequency data within a population of individuals, for a selected gene or gene feature, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display on the display device a plurality of selectable items, each item corresponding to a polymorphic site in the gene or gene feature;
(c) computer-readable program code for causing a computer to retrieve from a database and display on the display device, in response to a user's selection of one or more items indicating polymorphic sites, individual haplotype pairs in the database that differ at one or more of the selected polymorphic sites; and
(d) computer-readable program code for causing a computer to display on the display device data indicative of the frequencies of the displayed haplotype pairs within one or more member groupings within the population.
-
-
138. A computer programmed to display on a display device polymorphic site linkage data for a gene or gene structure of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display on the display device one or more matrix structures, wherein the axes of each matrix structure represent the polymorphic sites in the gene or gene feature of interest, and wherein each matrix structure corresponds to a different population or population group; and
(b) computer-readable program code for causing a computer to display on the display device, in each cell of a matrix structure, a graphical indication of degree of linkage between the two polymorphic sites corresponding to the coordinates of the cell in the matrix. - View Dependent Claims (139)
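Illustrative sketch (not part of the claim): the values behind such a linkage matrix could be computed from phased haplotypes as below. The claim does not name a linkage statistic; this example assumes the squared allelic correlation r² and biallelic sites, and the function names are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """r^2 between sites i and j, assuming both sites are biallelic."""
    n = len(haplotypes)
    # Use the alleles of the first haplotype as arbitrary reference alleles.
    a, b = haplotypes[0][i], haplotypes[0][j]
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic site carries no linkage information
    return (p_ab - p_a * p_b) ** 2 / denom


def linkage_matrix(haplotypes):
    """Square matrix whose cell (i, j) holds r^2 between sites i and j."""
    m = len(haplotypes[0])
    return [[r_squared(haplotypes, i, j) for j in range(m)] for i in range(m)]


linked = linkage_matrix(["AC", "AC", "GT", "GT"])
unlinked = linkage_matrix(["AC", "AT", "GC", "GT"])
```

One such matrix would be computed per population or population group, matching element (a), and each cell value mapped to a colour or shade for display.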
-
-
140. A computer programmed to display on a display device a phylogenetic tree, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display a plurality of selectable items, each corresponding to a polymorphic site in the gene or gene feature of interest; and
(b) computer-readable program code for causing a computer to display a phylogenetic tree structure having a node for each haplotype in a population, where the distance between nodes is proportional to the minimum number of nucleotides that would have to be changed to interconvert the corresponding haplotypes. - View Dependent Claims (141, 142)
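Illustrative sketch (not part of the claim): the node distance described in element (b), the minimum number of nucleotide changes needed to convert one haplotype into another, is the Hamming distance between equal-length haplotype strings. The claim does not say how the tree is constructed; this example assumes a minimum spanning tree built with Prim's algorithm, and the function names are hypothetical.

```python
def hamming(h1, h2):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def minimum_spanning_tree(haplotypes):
    """Return tree edges as (haplotype_u, haplotype_v, distance) tuples."""
    nodes = sorted(set(haplotypes))
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Pick the shortest edge leaving the tree; ties break alphabetically.
        u, v = min(((u, v) for u in in_tree for v in nodes if v not in in_tree),
                   key=lambda edge: (hamming(*edge), edge))
        edges.append((u, v, hamming(u, v)))
        in_tree.add(v)
    return edges


tree = minimum_spanning_tree(["AAA", "AAT", "ATT", "GTT"])
```

A drawing routine would then lay out each edge with a length proportional to its stored distance.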
-
-
143. A computer programmed to display a genotype analysis screen on a display device, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display a first plurality of selectable items, each corresponding to a polymorphic site, and a second plurality of selectable items, each corresponding to a polymorphic site;
(b) computer-readable program code for causing a computer to display on the display device a matrix structure, wherein the axes of the matrix structure represent haplotypes in the gene or gene feature of interest that vary at the polymorphic sites selected from the first plurality of selectable items; and
(c) computer-readable program code for causing a computer to display on the display device, in each cell of the matrix structure, a graphical indication of the reliability of the assignment to an individual of the haplotype pair corresponding to the coordinates of the cell in the matrix, when the individual is genotyped only at the polymorphic sites selected from the second plurality of selectable items. - View Dependent Claims (144)
-
-
145. A computer programmed to display clinical response values, or other phenotype data, of a subject population as a function of haplotype pairs of the individuals in the population, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to retrieve from a computer-readable storage device, data representing haplotype pairs and clinical response values, or other phenotype data, for the subject population; and
(b) computer-readable program code for causing a computer to graphically display a haplotype pair matrix structure, each of whose cells contains a graphical representation of the clinical response values or other phenotype data of individuals having the haplotype pair corresponding to the coordinates of that cell in the haplotype pair matrix. - View Dependent Claims (147, 148, 149, 150, 151, 152, 153)
-
-
146. A computer programmed to display on a display device clinical response values, or other phenotypic data, of a subject population as a function of the haplotype pairs of the individuals in the population for a gene or gene feature of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display one or more first selectable items representing polymorphic sites of the gene or gene feature;
(b) computer-readable program code for causing a computer to display one or more second selectable items representing clinical measurements or phenotypes; and
(c) computer-readable program code for causing a computer to display on the display device, in response to the selection by the user of at least one first and second selectable items, a haplotype pair matrix structure, wherein the axes of the matrix structure represent haplotypes in the gene or gene feature of interest that vary at the polymorphic sites corresponding to the first selected item or items, and wherein each of the cells of the matrix contains a graphical representation of the mean clinical response value, or other phenotype data, for the clinical measurement represented by the selected second item, of individuals having the haplotype pair corresponding to the coordinates of the cell in the haplotype pair matrix.
-
-
154. A computer programmed to carry out a genetic algorithm for finding an optimal set of weights to fit a function of polymorphic site data for a gene or gene feature of interest to a clinical response measurement, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display a variable controller for setting the number of genetic algorithm generations parameter;
(b) computer-readable program code for causing a computer to display a variable controller for setting the number of agents parameter;
(c) computer-readable program code for causing a computer to display a variable controller for setting the mutation rate parameter;
(d) computer-readable program code for causing a computer to display a variable controller for setting the crossover rate parameter;
(e) computer-readable program code for causing a computer to display one or more selectable items each corresponding to a polymorphic site of the gene or gene feature of interest; and
(f) computer-readable program code for causing a computer to display a selectable item for initiation of the genetic algorithm calculation; and
(g) computer-readable program code for causing a computer, in response to the selection by the user of one or more selectable items corresponding to a polymorphic site, and selection by the user of the item for initiation of the genetic algorithm calculation, to execute the genetic algorithm calculation with the parameters set by the variable controllers, and to display on a display device (i) the residual error of the model as a function of the number of genetic algorithm generations, and (ii) the results of the genetic algorithm calculation showing the optimal weights for each of the polymorphic sites.
-
-
155. A computer programmed to display on a display device correlations between clinical outcome values obtained from selected clinical outcome measures for a selected population, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to display a first plurality of selectable items corresponding to clinical outcome measurements;
(b) computer-readable program code for causing a computer to display a second plurality of selectable items corresponding to clinical outcome measurements; and
(c) computer-readable program code for causing a computer to display a scatter plot of data points, each data point corresponding to an individual in the selected population;
(d) computer-readable program code for causing a computer, in response to selection by the user of an item from among the first plurality of selectable items, to locate each data point along the x axis of the scatter plot according to the clinical outcome value for the associated individual from the clinical measurement represented by the selected item; and
(e) computer-readable program code for causing the computer, in response to selection by the user of an item from among the second plurality of selectable items, to locate each data point along the y axis of the scatter plot according to the clinical outcome value for the associated individual from the clinical measurement represented by the selected item.
-
-
156. A computer programmed to provide information of use in conducting a clinical trial of a treatment protocol for a medical condition of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database of DNA sequence data for selected genes or other loci in a reference population of individuals, and to access a database of (or accept as input) DNA sequence data for selected genes or other loci in a clinical trial population of individuals;
(b) computer-readable program code for causing a computer to assign to each member of the reference population haplotypes for each of the selected genes or other loci;
(c) computer-readable program code for causing a computer to calculate the frequencies, population distributions and statistical measures, including confidence limits, for each of the assigned haplotypes in the reference population;
(d) computer-readable program code for causing a computer to assign to each member of a trial population haplotypes for each of the selected genes or other loci, based upon the frequencies, population distributions and statistical measures calculated in the reference population;
(e) computer-readable program code for causing a computer to determine the correlations between individual responses to the treatment and individual haplotypes, for each of the selected genes or other loci;
(f) computer-readable program code for causing a computer to accept as input an individual's DNA sequence data or haplotypes for one or more of the selected genes or other loci; and
(g) computer-readable program code for causing a computer to display or output the expected response of the individual to the treatment, based on the determined correlations between individual responses to the treatment and individual haplotypes. - View Dependent Claims (157)
-
-
158. A computer programmed to infer genotypes of individual subjects for a selected gene having at least m polymorphic sites, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database of m-site haplotypes of the selected gene from a representative cohort of individuals;
(b) computer-readable program code for causing a computer to tabulate the frequency of occurrence for each of the haplotypes;
(c) computer-readable program code for causing a computer to construct a list of all genotypes that could result from all possible pairs of observed haplotypes;
(d) computer-readable program code for causing a computer to calculate the expected frequency of these genotypes assuming the Hardy-Weinberg equilibrium;
(e) computer-readable program code for causing a computer to generate a complete set of all possible masks of the same length m as the haplotypes, wherein each mask blocks the identity of the nucleotides at m−n polymorphic sites and admits the identity of nucleotides at the other n sites;
(f) computer-readable program code for causing a computer to calculate, for each mask, how much ambiguity results from genotyping with only the n polymorphic sites whose identity is admitted by the mask;
(g) computer-readable program code for causing a computer to output or display on a display device the calculated ambiguity for one or more masks. - View Dependent Claims (159, 160)
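Elements (b) through (f) of claim 158 can be sketched as follows. The Hardy-Weinberg expectation (p² for a homozygous pair, 2pq for a heterozygous pair) is standard, but the claim does not define how "ambiguity" is measured. The measure used here, the total expected frequency of haplotype pairs that share their masked genotype with at least one other pair, is an assumption, as are all function names and the representation of a mask as a tuple of admitted site indices.

```python
from collections import Counter, defaultdict
from itertools import combinations, combinations_with_replacement


def haplotype_frequencies(haplotypes):
    """Tabulate the relative frequency of each observed haplotype string."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}


def pair_frequencies(freqs):
    """Expected haplotype-pair frequencies under Hardy-Weinberg equilibrium."""
    pairs = {}
    for h1, h2 in combinations_with_replacement(sorted(freqs), 2):
        pairs[(h1, h2)] = freqs[h1] ** 2 if h1 == h2 else 2 * freqs[h1] * freqs[h2]
    return pairs


def all_masks(m):
    """Every non-empty set of admitted site indices for haplotypes of length m."""
    return [c for n in range(1, m + 1) for c in combinations(range(m), n)]


def masked_genotype(pair, mask):
    """Unphased genotype seen when only the sites in `mask` are typed."""
    h1, h2 = pair
    return tuple(tuple(sorted((h1[i], h2[i]))) for i in mask)


def ambiguity(pairs, mask):
    """Assumed measure: expected frequency of pairs not uniquely resolved."""
    groups = defaultdict(list)
    for pair, freq in pairs.items():
        groups[masked_genotype(pair, mask)].append(freq)
    return sum(sum(fs) for fs in groups.values() if len(fs) > 1)
```

For a cohort with haplotypes `AC`, `AC`, `GT`, `AT`, typing both sites leaves no pair ambiguous under this measure, while typing only one site leaves most of the expected population unresolved. Ranking masks by this value is one way to compare reduced genotyping panels.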
-
-
161. A computer programmed to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome values) or other phenotype data, from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and generate a numerical measure of the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those individual SNPs whose numerical measure of the degree of correlation with the clinical outcome values or other phenotype data exceeds a first cut-off value;
(d) computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site sub-haplotypes where n=2;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate a numerical measure of the degree of correlation;
(f) computer-readable program code for causing a computer to store for further processing those n-site sub-haplotypes whose numerical measure of the degree of correlation exceeds the first cut-off value;
(g) computer-readable program code for causing a computer to generate all possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new sub-haplotypes with increased values of n;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected or user-selected limit can be generated. - View Dependent Claims (162, 164, 165)
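The build-up procedure of claim 161 (score single SNPs, keep those above the cut-off, then repeatedly combine the saved items into larger sub-haplotypes) can be sketched as below. Several simplifications are assumptions rather than part of the claim: each subject is treated as carrying a single haplotype string, a sub-haplotype is a set of (site index, allele) pairs, and the "numerical measure of the degree of correlation" is a placeholder statistic, the absolute difference in mean outcome between carriers and non-carriers. The names `score` and `build_up` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def score(sub, haplotypes, outcomes):
    """Placeholder statistic: |mean outcome of carriers - mean of the rest|."""
    carries = [all(h[i] == a for i, a in sub) for h in haplotypes]
    inside = [y for c, y in zip(carries, outcomes) if c]
    outside = [y for c, y in zip(carries, outcomes) if not c]
    if not inside or not outside:
        return 0.0
    return abs(mean(inside) - mean(outside))


def build_up(haplotypes, outcomes, cutoff, max_sites):
    """Grow sub-haplotypes from single SNPs, keeping those above the cut-off."""
    m = len(haplotypes[0])
    # Steps (b)-(c): score every observed single-site allele.
    singles = {frozenset([(i, h[i])]) for h in haplotypes for i in range(m)}
    tested = set(singles)
    saved = {s for s in singles if score(s, haplotypes, outcomes) > cutoff}
    frontier = set(saved)
    while frontier:  # step (h): stop when a round produces nothing new
        candidates = set()
        # Steps (d) and (g): pair-wise combinations of everything saved so far.
        for a, b in combinations(saved, 2):
            merged = a | b
            sites = {i for i, _ in merged}
            # Skip repeats, oversized sets, and two alleles at the same site.
            if merged in tested or len(merged) > max_sites or len(sites) < len(merged):
                continue
            candidates.add(merged)
        tested |= candidates
        # Steps (e)-(f): score the new sub-haplotypes and keep the strong ones.
        frontier = {c for c in candidates if score(c, haplotypes, outcomes) > cutoff}
        saved |= frontier
    return saved
```

The variant in claim 163 follows the same loop with a p-value in place of the score, so the comparison would flip to keep items whose p-value does not exceed the cut-off.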
-
-
163. A computer programmed to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome values) or other phenotype data, from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate the p-value for the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those individual SNPs whose p-value for the degree of correlation does not exceed a first cut-off value;
(d) computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site sub-haplotypes where n=2;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and calculate the p-value for the degree of correlation;
(f) computer-readable program code for causing a computer to store for further processing those n-site sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off value;
(g) computer-readable program code for causing a computer to generate all possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new sub-haplotypes with increased values of n;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected or user-selected limit can be generated.
-
-
166. A computer programmed to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database containing single gene haplotype information for one or more genes, and clinical response, outcome data, or other phenotype data from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each single gene haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to generate a numerical measure of the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those haplotypes whose numerical measure of the degree of correlation exceeds a first cut-off value;
(d) computer-readable program code for causing a computer to generate, for each haplotype composed of m polymorphic sites, all possible sub-haplotypes having a single site masked, so as to provide a set of m−n site sub-haplotypes where n=1;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated sub-haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and calculate a numerical measure of the degree of correlation;
(f) computer-readable program code for causing a computer to save for further processing those sub-haplotypes whose numerical measure of the degree of correlation exceeds the first cut-off value;
(g) computer-readable program code for causing a computer to generate, from the saved sub-haplotypes, all possible sub-haplotypes having one additional site masked;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes have a degree of correlation which exceeds the first cut-off value, or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. - View Dependent Claims (167, 170)
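Claim 166 runs in the opposite direction from claim 161: it starts from full haplotypes and masks one site at a time. A minimal sketch under stated assumptions follows. Masked sites are written as `*`, each subject is treated as carrying one haplotype string, and the correlation measure is again a placeholder (absolute difference in mean outcome between matching and non-matching subjects). The names `matches`, `score`, `mask_one_site`, and `prune_down`, and the `min_unmasked` parameter standing in for the pre-selected limit, are hypothetical.

```python
from statistics import mean


def matches(pattern, haplotype):
    """True when the haplotype agrees with the pattern at every unmasked site."""
    return all(p == "*" or p == h for p, h in zip(pattern, haplotype))


def score(pattern, haplotypes, outcomes):
    """Placeholder statistic: |mean outcome of matches - mean of the rest|."""
    inside = [y for h, y in zip(haplotypes, outcomes) if matches(pattern, h)]
    outside = [y for h, y in zip(haplotypes, outcomes) if not matches(pattern, h)]
    if not inside or not outside:
        return 0.0
    return abs(mean(inside) - mean(outside))


def mask_one_site(pattern):
    """All patterns obtained by masking exactly one more site."""
    return {pattern[:i] + "*" + pattern[i + 1:]
            for i, p in enumerate(pattern) if p != "*"}


def prune_down(haplotypes, outcomes, cutoff, min_unmasked=1):
    """Mask sites one at a time, keeping sub-haplotypes above the cut-off."""
    # Steps (b)-(c): keep full haplotypes that score above the cut-off.
    saved = {h for h in set(haplotypes) if score(h, haplotypes, outcomes) > cutoff}
    frontier = set(saved)
    while frontier:  # step (h): stop when no new sub-haplotype passes
        candidates = set()
        for pattern in frontier:
            # Respect the limit on how few unmasked sites may remain.
            if sum(c != "*" for c in pattern) > min_unmasked:
                candidates |= mask_one_site(pattern)  # steps (d) and (g)
        # Steps (e)-(f): score new sub-haplotypes and keep the strong ones.
        frontier = {c for c in candidates - saved
                    if score(c, haplotypes, outcomes) > cutoff}
        saved |= frontier
    return saved
```

With the cohort `ACG`, `ACT`, `ACG`, `GTG`, `GTT`, `GCT` and outcomes 9, 9, 9, 1, 1, 1, a cut-off of 5 keeps `ACG` and reduces it down to `A**` and `*C*`, while `**G` is dropped. This suggests that the first two sites, not the third, carry the association in this toy data.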
-
-
168. A computer programmed to determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other phenotype of interest, the computer comprising a memory having at least one region for storing computer executable program code and a processor for executing the program code stored in memory, wherein the program code includes:
-
(a) computer-readable program code for causing a computer to access a database containing single gene haplotype information for one or more genes, and clinical response, outcome data, or other phenotype data from a cohort of subjects;
(b) computer-readable program code for causing a computer to statistically analyze each single gene haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and to calculate the p-value for the degree of correlation;
(c) computer-readable program code for causing a computer to store for further processing those haplotypes whose p-value for the degree of correlation does not exceed a first cut-off value;
(d) computer-readable program code for causing a computer to generate, for each haplotype composed of m polymorphic sites, all possible sub-haplotypes having a single site masked, so as to provide a set of m−n site sub-haplotypes where n=1;
(e) computer-readable program code for causing a computer to statistically analyze each newly generated sub-haplotype for the degree to which it correlates with the clinical response, outcome, or phenotype of interest, and calculate the p-value for the degree of correlation;
(f) computer-readable program code for causing a computer to save for further processing those sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off value;
(g) computer-readable program code for causing a computer to generate, from the saved sub-haplotypes, all possible sub-haplotypes having one additional site masked;
(h) computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes have a p-value which does not exceed the first cut-off value, or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. - View Dependent Claims (169)
-
- 171. A data structure for storing and organizing biological information, stored on a computer-readable medium and accessible by a processor, which comprises a single parent table which is adapted for storing, organizing, and retrieving a plurality of genetic features by the relative positional relationships between the genetic features.
-
176. A method for storing and organizing biological information, which comprises
(a) providing a data structure comprising a single parent table which is adapted for storing, organizing, and retrieving a plurality of genetic features by the relative positional relationships between the genetic features; and
(b) positioning a first genetic feature onto a second genetic feature.
-
183. A data structure for storing and organizing biological information, stored on a computer-readable medium and accessible by a processor, which comprises at least two different fields, one of which includes a plurality of genetic features, and the other of which includes relative positional relationships between the genetic features.
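One way to picture the data structure of claims 171, 176, and 183 is a single table in which every row holds a genetic feature together with its position relative to another feature. The sketch below is an assumption about how such a table could look, not the layout described in the specification: the `Feature` fields, the `FeatureTable` class, and the `position` and `absolute_start` methods are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Feature:
    name: str
    kind: str               # e.g. "reference", "gene", "exon", "snp"
    length: int
    parent: Optional[str]   # feature this one is positioned onto, if any
    offset: int             # start position relative to the parent's start


class FeatureTable:
    """Single table holding all features and their relative positions."""

    def __init__(self) -> None:
        self.rows: Dict[str, Feature] = {}

    def position(self, name: str, kind: str, length: int,
                 onto: Optional[str] = None, offset: int = 0) -> Feature:
        """Store a feature, optionally positioned onto an existing feature."""
        if onto is not None and onto not in self.rows:
            raise KeyError(f"unknown parent feature: {onto}")
        row = Feature(name, kind, length, onto, offset)
        self.rows[name] = row
        return row

    def absolute_start(self, name: str) -> int:
        """Resolve a start coordinate by summing offsets up the parent chain."""
        row = self.rows[name]
        total = row.offset
        while row.parent is not None:
            row = self.rows[row.parent]
            total += row.offset
        return total
```

Storing only relative offsets means that repositioning a gene on the reference sequence changes one row, and every exon and polymorphism positioned onto that gene follows automatically when coordinates are resolved.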
Specification