System and method for predicting chromosomal regions that control phenotypic traits
Abstract
A method of associating a phenotype with one or more candidate chromosomal regions in a genome of an organism includes the step of deriving a phenotypic data structure that represents differences in phenotypes between different strains of the organism. Further, a genotypic data structure is established. The genotypic data structure corresponds to a locus selected from a plurality of loci in the genome of the organism. The genotypic data structure represents variations of at least one component of the locus between different strains of the organism. The phenotypic data structure is compared to the genotypic data structure to form a correlation value. The process of establishing a genotypic data structure and comparing it to the phenotypic data structure is repeated for each locus in the plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other compared genotypic data structures. The loci that correspond to the one or more genotypic data structures having a high correlation value represent the one or more candidate chromosomal regions.
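The comparison loop summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation. The phenotypic data structure is assumed to be the absolute phenotype difference for every strain pair. The genotypic data structure for a locus is assumed to be a 0/1 allele mismatch over the same pairs. The correlation value is assumed to be a Pearson coefficient, and "high relative to all others" is approximated by a z-score cutoff. The function names, the `z_cutoff` parameter, and the toy strain data are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def pairwise_phenotype(values):
    """Assumed phenotypic data structure: |difference| for every strain pair."""
    return [abs(a - b) for a, b in combinations(values, 2)]


def pairwise_genotype(alleles):
    """Assumed genotypic data structure for one locus: 1 if a pair differs, else 0."""
    return [0 if a == b else 1 for a, b in combinations(alleles, 2)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def scan_loci(phenotype, genotypes, z_cutoff=1.0):
    """Correlate the phenotype with every locus and flag the outlying loci."""
    p = pairwise_phenotype(phenotype)
    scores = {
        locus: pearson(p, pairwise_genotype(alleles))
        for locus, alleles in genotypes.items()
    }
    values = list(scores.values())
    mu, sd = mean(values), pstdev(values)
    # Assumption: "high relative to all others" means z-score above the cutoff.
    candidates = [
        locus for locus, r in scores.items() if sd > 0 and (r - mu) / sd > z_cutoff
    ]
    return scores, candidates


# Toy data: one phenotype value per strain, one allele per strain per locus.
phenotype = [10.0, 10.5, 20.0, 21.0]
genotypes = {
    "locus_a": ["A", "A", "G", "G"],
    "locus_b": ["A", "G", "A", "G"],
    "locus_c": ["C", "C", "C", "T"],
    "locus_d": ["T", "T", "T", "T"],
}
scores, candidates = scan_loci(phenotype, genotypes)
print(candidates)
```

In this toy input only `locus_a` splits the strains the same way the phenotype does, so it is the only locus flagged as a candidate. A locus with no allelic variation, such as `locus_d`, scores zero by construction.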
71 Claims
1. A method of associating a phenotype with one or more candidate chromosomal regions in a genome of an organism using a phenotypic data structure that represents a difference in a phenotype between different strains of said organism, said genome including a plurality of loci, said method comprising:
establishing a genotypic data structure, said genotypic data structure corresponding to a locus selected from said plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism;
comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
repeating said establishing and comparing steps for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure during said comparing step;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
26. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
a genotypic database for storing variations in genomic sequences of a plurality of strains of an organism;
a phenotypic data structure that represents a difference in a phenotype between different strains of said organism; and
a program module for associating a phenotype with one or more candidate chromosomal regions in a genome of said organism, said genome including a plurality of loci, said program module comprising;
instructions for establishing a genotypic data structure, said genotypic data structure corresponding to a locus selected from a plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism stored in said genotypic database;
instructions for comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for establishing and instructions for comparing for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure by said instructions for comparing;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions.
51. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
a genotypic database for storing variations in genomic sequences of a plurality of strains of an organism;
a phenotypic data structure, each element in said phenotypic data structure representing a difference in said phenotype between different strains of said organism; and
a program module for associating a phenotype with one or more candidate chromosomal regions in a genome of said organism, said genome including a plurality of loci, said program module comprising;
instructions for identifying a genotypic data structure, said genotypic data structure corresponding to a locus selected from said plurality of loci, each element in said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism;
instructions for comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for identifying and said instructions for comparing, for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure by said instructions for comparing;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions.
52. A computer system for associating a phenotype with one or more candidate chromosomal regions in a genome of an organism, said genome including a plurality of loci, the computer system comprising:
a central processing unit;
a memory, coupled to the central processing unit, the memory storing;
a genotypic database for storing variations in genomic sequences of a plurality of strains of said organism;
a phenotypic data structure that represents a difference in a phenotype between different strains of said organism; and
a program module, said program module comprising;
instructions for establishing a genotypic data structure, said genotypic data structure corresponding to a locus selected from a plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism stored in said genotypic database;
instructions for comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for establishing and said instructions for comparing, for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure by said instructions for comparing;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions.
Dependent claims: 53, 54, 55, 56, 57, 58, 59, 60, 61
62. A method of associating a phenotype with one or more candidate chromosomal regions in a genome of an organism using a phenotypic data structure that represents alterations in phenotypes between different strains in a plurality of strains of said organism, said phenotypic data structure including a description of each said alteration and individual elements of said phenotypic data structure including an amount of alteration between different strains of said organism selected from said plurality of strains of said organism, said genome including a plurality of loci, each said loci representing one or more positions within said genome, said method comprising:
establishing a unique individual variation matrix for each said one or more positions represented by said loci, wherein an element within each said unique individual variation matrix represents an allelic comparison between different strains of said organism that are selected from said plurality of strains of said organism;
summing corresponding elements in each said unique individual matrix to form a genotypic data structure;
comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
repeating said establishing, summing and comparing steps, for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure during said comparing step;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions associated with said phenotype.
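The establishing and summing steps recited in claim 62 can be illustrated with a short Python sketch. It assumes, purely for illustration, that an "allelic comparison" is encoded as 1 when two strains carry different alleles at a position and 0 otherwise. The claim does not fix that encoding. The function names and the three-strain toy locus are hypothetical.

```python
def variation_matrix(alleles):
    """Strain-by-strain matrix for one position: 1 if alleles differ, else 0.

    Assumption: a binary mismatch stands in for the "allelic comparison".
    """
    n = len(alleles)
    return [
        [0 if alleles[i] == alleles[j] else 1 for j in range(n)] for i in range(n)
    ]


def sum_matrices(matrices):
    """Element-wise sum of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)
    ]


def locus_genotype_matrix(positions):
    """Genotypic data structure for a locus: summed per-position matrices."""
    return sum_matrices([variation_matrix(alleles) for alleles in positions])


# Toy locus with three positions; each inner list holds one allele per strain.
locus_positions = [
    ["A", "A", "G"],
    ["C", "T", "T"],
    ["G", "G", "A"],
]
genotype = locus_genotype_matrix(locus_positions)
print(genotype)  # [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
```

Each off-diagonal entry then counts the positions within the locus at which the two strains differ, so strains 1 and 3 in the toy data differ at all three positions.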
63. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
a genotypic database for storing variations in genomic sequences of a plurality of strains of an organism;
a phenotypic data structure that represents alterations in phenotypes between different strains of said organism selected from said plurality of strains of said organism, said phenotypic data structure including a description of each said alteration and individual elements of said phenotypic data structure including an amount of alteration between different strains in said plurality of strains of said organism; and
a program module for associating a phenotype with one or more candidate chromosomal regions in a genome of said organism, said genome including a plurality of loci, each said loci representing one or more positions within said genome, said program module comprising;
instructions for establishing a unique individual variation matrix for each said one or more positions represented by said loci, wherein an element within each said unique individual variation matrix represents an allelic comparison of values stored in said genotypic database between different strains of said organism that are selected from said plurality of strains of said organism;
instructions for summing corresponding elements in each said unique individual matrix to form a genotypic data structure;
instructions for comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for establishing, summing and comparing, for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure during said comparing step;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions associated with said phenotype.
64. A computer system for associating a phenotype with one or more candidate chromosomal regions in a genome of an organism, said genome including a plurality of loci, each said loci representing one or more positions within said genome, said program module comprising:
a central processing unit;
a memory, coupled to the central processing unit, the memory storing;
a genotypic database for storing variations in genomic sequences of a plurality of strains of said organism;
a phenotypic data structure that represents alterations in phenotypes between different strains in said plurality of strains of said organism, said phenotypic data structure including a description of each said alteration and individual elements of said phenotypic data structure including an amount of alteration between different strains in said plurality of strains of said organism; and
a program module, said program module comprising;
instructions for establishing a unique individual variation matrix for each said one or more positions represented by said loci, wherein an element within each said unique individual variation matrix represents an allelic comparison of values stored in said genotypic database between different strains of said organism that are selected from said plurality of strains of said organism;
instructions for summing corresponding elements in each said unique individual matrix to form a genotypic data structure;
instructions for comparing said phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for establishing, summing and comparing, for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure during said comparing step;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation represent said one or more candidate chromosomal regions associated with said phenotype.
65. A method of determining a portion of a genome of an organism that is responsive to a perturbation, the method comprising:
producing a first phenotypic data structure that represents a difference in a first phenotype between different strains of said organism, said genome including a plurality of loci, wherein said first phenotype is measured for each said different strain of said organism when each said different strain is in a first state;
establishing a genotypic data structure, said genotypic data structure corresponding to a locus selected from said plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism;
comparing said first phenotypic data structure to said genotypic data structure to form a correlation value;
repeating said establishing and comparing steps for each locus in said plurality of loci, thereby identifying a first set of genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said first phenotypic data structure during said comparing step;
computing a second phenotypic data structure that represents a difference in a second phenotype between different strains of said organism, wherein said second phenotype is measured for each said different strain of said organism when each said different strain is in a second state that is produced by exposing each said different strain of said organism to a perturbation;
correlating said second phenotypic data structure to said genotypic data structure to form a correlation value;
repeating said computing and correlating steps for each locus in said plurality of loci, thereby identifying a second set of genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said second phenotypic data structure during said correlating step; and
resolving a dissimilarity in said first set of genotypic data structures and said second set of genotypic structures, thereby determining said portion of said genome of said organism that is responsive to said perturbation.
Dependent claims: 66, 67, 69, 70
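The final resolving step of claim 65 can be sketched as a set comparison. The sketch assumes that per-locus correlation values have already been computed for the first (unperturbed) and second (perturbed) states. It also assumes that each "set of genotypic data structures that form a high correlation value" is simply the top-ranked loci, and that "resolving a dissimilarity" amounts to taking the loci found in one set but not the other. None of these choices is dictated by the claim. The function names, the `top_n` parameter, and the correlation values are hypothetical, made up for illustration.

```python
def high_correlation_set(scores, top_n=2):
    """Assumed selection rule: the top_n loci ranked by correlation value."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:top_n])


def resolve_dissimilarity(first_set, second_set):
    """Loci that are highly correlated in one state but not in the other."""
    return {
        "only_first_state": first_set - second_set,
        "only_second_state": second_set - first_set,
    }


# Hypothetical per-locus correlation values for each state (illustrative only).
first_state = {"locus_1": 0.91, "locus_2": 0.85, "locus_3": 0.10, "locus_4": 0.05}
second_state = {"locus_1": 0.90, "locus_2": 0.12, "locus_3": 0.08, "locus_4": 0.88}

first_set = high_correlation_set(first_state)
second_set = high_correlation_set(second_state)
responsive = resolve_dissimilarity(first_set, second_set)
print(responsive)
```

Under these assumptions `locus_1` is highly correlated in both states and drops out, while `locus_2` and `locus_4` are reported as the portion of the genome whose association changes with the perturbation.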
68. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
a program module for determining a portion of a genome of an organism that is responsive to a perturbation, the method comprising;
instructions for producing a first phenotypic data structure that represents a difference in a first phenotype between different strains of said organism, said genome including a plurality of loci, wherein said first phenotype is measured for each said different strain of said organism when each said different strain is in a first state;
instructions for establishing a genotypic data structure, said genotypic data structure corresponding to a locus selected from said plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism;
instructions for comparing said first phenotypic data structure to said genotypic data structure to form a correlation value;
instructions for repeating said instructions for establishing and said instructions for comparing for each locus in said plurality of loci, thereby identifying a first set of genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said first phenotypic data structure during said comparing step;
instructions for computing a second phenotypic data structure that represents a difference in a second phenotype between different strains of said organism, wherein said second phenotype is measured for each said different strain of said organism when each said different strain is in a second state that is produced by exposing each said different strain of said organism to a perturbation;
instructions for correlating said second phenotypic data structure to said genotypic data structure to form a correlation value;
instructions for repeating said computing and correlating steps for each locus in said plurality of loci, thereby identifying a second set of genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said second phenotypic data structure during said correlating step; and
instructions for resolving a dissimilarity in said first set of genotypic data structures and said second set of genotypic structures, thereby determining said portion of said genome of said organism that is responsive to said perturbation.
71. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising:
a program module for associating a phenotype with one or more candidate chromosomal regions in a genome of said organism, said genome including a plurality of loci, said program module comprising;
instructions for accessing a genotypic data structure, said genotypic data structure corresponding to a locus selected from a plurality of loci, said genotypic data structure representing a variation of at least one component of said locus between different strains of said organism stored in a genotypic database;
instructions for comparing a phenotypic data structure to said genotypic data structure to form a correlation value; and
instructions for repeating said instructions for establishing and instructions for comparing for each locus in said plurality of loci, thereby identifying one or more genotypic data structures that form a high correlation value relative to all other genotypic data structures that are compared to said phenotypic data structure by said instructions for comparing;
wherein the loci that correspond to said one or more genotypic data structures that form a high correlation value represent said one or more candidate chromosomal regions.
Specification