METHOD OF GENERATING AN OPTIMIZED, DIVERSE POPULATION OF VARIANTS
Abstract
The present disclosure relates to methods of rapidly and efficiently searching biologically-related data space to identify a population set that is maximally diverse and optimized for sets of desired properties. More specifically, the disclosure provides methods of identifying diverse, evolutionarily separated bio-molecules with desired properties from complex bio-molecule libraries. The disclosure additionally provides digital systems and software for performing these methods.
31 Claims
1. A method of selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, the method comprising:
(a) selecting one or more objectives for optimization of a plurality of molecular variants, wherein each molecular variant in the plurality is described by two or more objective data;
(b) determining a first Pareto front membership for each of the plurality of molecular variants based on the objectives of optimization;
(c) setting optimization parameters, wherein the optimization parameters comprise:
    (i) number nvar of molecular variants to create;
    (ii) a molecular variant population size popSize;
    (iii) a crossover rate crossrate;
    (iv) a mutation rate mutrate;
    (v) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness function is based on niche counting and the overall fitness function is based on the location of the molecular variant in a descending Pareto front divided by the shared number of molecular variants within each front; and
    (vi) specific molecular variants to be included in the nvar set of molecular variants;
(d) identifying a search space of acceptable molecular variants;
(e) generating a random population of genomes from the search space of acceptable molecular variants using a selection operator; and
(f) selecting a first set of nvar molecular variants by applying a crossover operator, a mutation operator, a repair operator, and a fitness operator to the random population of genomes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
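For illustration only, and not as part of the claims, step (b) of claim 1 can be sketched as a simple non-dominated sort. The sketch assumes that every objective is to be maximized and that each variant is given as a tuple of objective values; the function names `dominates` and `pareto_front_membership` are hypothetical and do not come from the disclosure.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Assumption: all objectives are maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front_membership(objectives: Sequence[Sequence[float]]) -> List[int]:
    """Return the front number of each variant (1 = non-dominated front)."""
    remaining = set(range(len(objectives)))
    front_of = [0] * len(objectives)
    front = 1
    while remaining:
        # Variants not dominated by any other remaining variant form the next front.
        current = {
            i
            for i in remaining
            if not any(
                dominates(objectives[j], objectives[i]) for j in remaining if j != i
            )
        }
        for i in current:
            front_of[i] = front
        remaining -= current
        front += 1
    return front_of


# Two objectives per variant, for example activity and stability (made-up values).
data = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (1.0, 3.0), (2.0, 2.0), (0.5, 0.5)]
fronts = pareto_front_membership(data)
```

In this example the first three variants are mutually non-dominated and share front 1, the next two fall in front 2, and the last variant is in front 3. The pairwise comparison is quadratic per front, which is adequate for a sketch but may be too slow for large libraries.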
9. A method of selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, the method comprising:
(a) selecting one or more objectives for optimization of a plurality of molecular variants, wherein each molecular variant in the plurality is described by two or more objective data;
(b) determining a first Pareto front membership for each of the plurality of molecular variants based on the objectives of optimization;
(c) setting optimization parameters, wherein the optimization parameters comprise:
    (i) number nvar of molecular variants to create;
    (ii) a molecular variant population size popSize;
    (iii) a crossover rate crossrate;
    (iv) number of generations to create nGen;
    (v) a mutation rate mutrate;
    (vi) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness function is based on niche counting and the overall fitness function is based on the location of the molecular variant in a descending Pareto front divided by the shared number of molecular variants within each front; and
    (vii) specific molecular variants to be included in the nvar set of molecular variants;
(d) identifying a search space of acceptable molecular variants;
(e) generating a random population of genomes from the search space of acceptable molecular variants;
(f) selecting a first set of genomes of size popSize from the random population, wherein each genome consisting of nvar molecular variants is created by applying a selection operator, a crossover operator, a mutation operator, a repair operator, and a fitness operator to the random population of genomes; and
(g) returning the genome with the highest fitness as the final, optimized, diverse nvar set of molecular variants.
Dependent claims: 10, 11, 12, 13, 14, 15.
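The loop recited in steps (e) through (g) of claim 9 might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: a genome is taken to be a list of nvar variant identifiers, selection is a binary tournament, crossover is single-point, and the repair operator restores uniqueness and any required variants. None of these specific operator choices is stated in the claim, and the names `repair` and `evolve` are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Genome = List[int]  # assumption: a genome is a list of nvar variant identifiers


def repair(genome: Genome, nvar: int, space: Sequence[int],
           required: Sequence[int], rng: random.Random) -> Genome:
    """Force required variants in, drop duplicates, and refill to exactly nvar."""
    fixed = list(dict.fromkeys(list(required) + genome))[:nvar]
    pool = [v for v in space if v not in fixed]
    rng.shuffle(pool)
    fixed.extend(pool[: nvar - len(fixed)])
    return fixed


def evolve(search_space: Sequence[int], fitness: Callable[[Genome], float],
           nvar: int, pop_size: int, n_gen: int, crossrate: float,
           mutrate: float, required: Sequence[int] = (), seed: int = 0) -> Genome:
    """Return the fittest genome seen over n_gen generations."""
    rng = random.Random(seed)
    space = list(search_space)
    # Random initial population drawn from the search space, then repaired.
    pop = [repair(rng.sample(space, nvar), nvar, space, required, rng)
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(n_gen):
        nxt: List[Genome] = []
        while len(nxt) < pop_size:
            # Selection operator (assumed: binary tournament).
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            child = list(p1)
            # Crossover operator (assumed: single-point), applied at rate crossrate.
            if nvar > 1 and rng.random() < crossrate:
                cut = rng.randrange(1, nvar)
                child = p1[:cut] + p2[cut:]
            # Mutation operator: each position is resampled at rate mutrate.
            child = [rng.choice(space) if rng.random() < mutrate else g
                     for g in child]
            nxt.append(repair(child, nvar, space, required, rng))
        pop = nxt
        best = max(pop + [best], key=fitness)  # fitness operator plus elitism
    return best


space = list(range(20))
best = evolve(space, fitness=sum, nvar=4, pop_size=10, n_gen=5,
              crossrate=0.8, mutrate=0.1, required=(0,), seed=1)
```

The toy fitness here is simply the sum of the identifiers; in practice it would be replaced by a set-level score built from Pareto rank and niche counts. The sketch requires `pop_size >= 2` and `len(required) <= nvar <= len(search_space)`.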
16. A computer program product comprising a machine readable medium having program instructions for selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, the program instructions comprising:
(a) code for receiving an objective data set representing two or more properties of each molecular variant in a plurality of molecular variants;
(b) code for generating a first Pareto front membership of all molecular variants based on the one or more objectives for optimization;
(c) code for setting optimization parameters, wherein the optimization parameters comprise:
    (i) number nvar of molecular variants to create;
    (ii) a molecular variant population size popSize;
    (iii) a crossover rate crossrate;
    (iv) a mutation rate mutrate;
    (v) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness is based on niche count and the overall fitness function is based on location in a descending Pareto front divided by the shared number of molecular variants within each front; and
    (vi) specific molecular variants to be included in the nvar of molecular variants;
(d) code for identifying a search space of acceptable molecular variants;
(e) code for generating a random population of genomes from the search space of acceptable molecular variants by applying a selection operator; and
(f) code for applying a set of operators comprising a crossover operator, a mutation operator, a fitness operator, and a repair operator on the random plurality of genomes to select the nvar set of optimized, diverse molecular variants.
Dependent claims: 17.
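One possible reading of the fitness function in element (c)(v) is sketched below; the claim language does not fix the exact formula, so the details are assumptions. Here a variant's rank score is higher for earlier fronts (front 1 is best), its niche count is the number of variants in the same front lying within a radius `sigma` of it, and the shared fitness is the rank score divided by that niche count. The coordinates in `points`, the radius `sigma`, and the function names are all hypothetical.

```python
import math
from typing import List, Sequence


def niche_counts(points: Sequence[Sequence[float]], front_of: Sequence[int],
                 sigma: float) -> List[int]:
    """Count, for each variant, the same-front variants within distance sigma.

    Each variant counts itself, so every niche count is at least 1.
    """
    counts = []
    for i, p in enumerate(points):
        n = sum(
            1
            for j, q in enumerate(points)
            if front_of[j] == front_of[i] and math.dist(p, q) <= sigma
        )
        counts.append(n)
    return counts


def shared_fitness(points: Sequence[Sequence[float]], front_of: Sequence[int],
                   sigma: float) -> List[float]:
    """Rank score (higher for earlier fronts) divided by the niche count."""
    worst = max(front_of)
    counts = niche_counts(points, front_of, sigma)
    return [(worst - f + 1) / c for f, c in zip(front_of, counts)]


# Made-up coordinates: two crowded variants and one isolated variant in front 1,
# plus a single variant in front 2.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (1.0, 1.0)]
front_of = [1, 1, 1, 2]
scores = shared_fitness(points, front_of, sigma=1.0)
```

In this example the two crowded front-1 variants each score 1.0, while the isolated front-1 variant scores 2.0, so crowding acts as a penalty that favors diverse selections within a front.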
18. A computer program product comprising a machine readable medium having program instructions for selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, the program instructions comprising:
(a) code for receiving an objective data set representing two or more properties of each molecular variant in a plurality of molecular variants;
(b) code for generating a first Pareto front membership of all molecular variants based on the one or more objectives for optimization;
(c) code for setting optimization parameters, wherein the optimization parameters comprise:
    (i) number nvar of molecular variants to create;
    (ii) a molecular variant population size popSize;
    (iii) a crossover rate crossrate;
    (iv) number of generations to create nGen;
    (v) a mutation rate mutrate;
    (vi) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness is based on niche count and the overall fitness function is based on the location of the molecular variant in a descending Pareto front divided by the shared number of molecular variants within each front; and
    (vii) specific molecular variants to be included in the nvar of molecular variants;
(d) code for identifying a search space of acceptable molecular variants;
(e) code for generating a random population of genomes from the search space of acceptable molecular variants;
(f) code for selecting a first set of genomes of size popSize from the random population, wherein each genome consisting of nvar molecular variants is created by applying a selection operator, a crossover operator, a mutation operator, a repair operator, and a fitness operator to the random population of genomes; and
(g) code for returning the genome with the highest fitness as the final, optimized, diverse nvar set of molecular variants.
Dependent claims: 19.
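The optimization parameters listed in element (c) could, for example, be gathered into a single validated record. The sketch below is an assumption about how such code might look: the class name `OptimizationParameters`, the field names, and the validation rules (rates restricted to the interval from 0 to 1, required variants no more numerous than nvar) are hypothetical and are not recited in the claim.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OptimizationParameters:
    nvar: int                      # number of molecular variants to create
    pop_size: int                  # popSize
    crossrate: float               # crossover rate
    mutrate: float                 # mutation rate
    n_gen: Optional[int] = None    # nGen; recited only in some of the claims
    required_variants: Tuple[str, ...] = ()  # variants that must be included

    def __post_init__(self) -> None:
        # Assumed sanity checks; the claims do not state any bounds.
        if self.nvar < 1 or self.pop_size < 1:
            raise ValueError("nvar and pop_size must be positive")
        for name in ("crossrate", "mutrate"):
            rate = getattr(self, name)
            if not 0.0 <= rate <= 1.0:
                raise ValueError(f"{name} must lie between 0 and 1")
        if self.n_gen is not None and self.n_gen < 1:
            raise ValueError("n_gen must be positive when given")
        if len(set(self.required_variants)) > self.nvar:
            raise ValueError("more required variants than nvar")


params = OptimizationParameters(
    nvar=10, pop_size=50, crossrate=0.8, mutrate=0.05,
    n_gen=100, required_variants=("WT",),
)
```

Making the record immutable keeps a run reproducible once the parameters are set; the numeric values shown are placeholders, not recommended settings.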
20. A system for selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, comprising:
(a) at least one computer comprising a database capable of storing an objective data set representing two or more properties of each molecular variant in a plurality of molecular variants; and
(b) system software comprising one or more logic instructions for:
    (i) receiving an objective data set representing two or more properties of each molecular variant;
    (ii) generating a first Pareto front membership of all molecular variants based on the one or more objectives for optimization;
    (iii) setting optimization parameters, wherein the optimization parameters comprise:
        (a) number nvar of molecular variants to create;
        (b) a molecular variant population size popSize;
        (c) a crossover rate crossrate;
        (d) a mutation rate mutrate;
        (e) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness is based on niche counting and the overall fitness function is based on the location of the molecular variant in a descending Pareto front divided by the shared number of molecular variants; and
        (f) specific molecular variants to be included in the nvar set of molecular variants;
    (iv) identifying a search space of acceptable molecular variants;
    (v) generating a random population of genomes from the search space of acceptable molecular variants by applying a selection operator; and
    (vi) applying to the random population of molecular variants a set of operators comprising a selection operator, a crossover operator, a mutation operator, a fitness operator, and a repair operator to select a set nvar of optimized, diverse molecular variants.
Dependent claims: 21, 30.
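As a rough illustration of element (a), a database holding two or more properties per molecular variant could be as simple as one table of (variant, property, value) rows. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table name `objective_data`, the column names, and the helper functions are hypothetical choices, and the sample values are invented.

```python
import sqlite3
from typing import Dict, Sequence, Tuple


def create_store(conn: sqlite3.Connection) -> None:
    """Create a table with one row per (variant, property) measurement."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objective_data ("
        " variant_id TEXT NOT NULL,"
        " property TEXT NOT NULL,"
        " value REAL NOT NULL,"
        " PRIMARY KEY (variant_id, property))"
    )


def add_measurement(conn: sqlite3.Connection, variant_id: str,
                    prop: str, value: float) -> None:
    """Insert a measurement, overwriting any earlier value for the same pair."""
    conn.execute(
        "INSERT OR REPLACE INTO objective_data VALUES (?, ?, ?)",
        (variant_id, prop, value),
    )


def objective_vectors(conn: sqlite3.Connection,
                      properties: Sequence[str]) -> Dict[str, Tuple[float, ...]]:
    """Return one objective tuple per variant that has every listed property."""
    rows = conn.execute(
        "SELECT variant_id, property, value FROM objective_data"
    ).fetchall()
    by_variant: Dict[str, Dict[str, float]] = {}
    for vid, prop, val in rows:
        by_variant.setdefault(vid, {})[prop] = val
    return {
        vid: tuple(vals[p] for p in properties)
        for vid, vals in by_variant.items()
        if all(p in vals for p in properties)
    }


conn = sqlite3.connect(":memory:")
create_store(conn)
add_measurement(conn, "V1", "activity", 1.5)
add_measurement(conn, "V1", "stability", 0.7)
add_measurement(conn, "V2", "activity", 2.0)  # V2 lacks a stability value
vectors = objective_vectors(conn, ["activity", "stability"])
```

Variants missing any requested property are left out of the result, so every returned tuple carries the same objectives in the same order and can be passed directly to a Pareto ranking step.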
22. A system for selecting an optimized, diverse set of molecular variants from a plurality of molecular variants, comprising:
(a) at least one computer comprising a database capable of storing an objective data set representing two or more properties of each molecular variant in a plurality of molecular variants; and
(b) system software comprising one or more logic instructions for:
    (i) receiving an objective data set representing two or more properties of each molecular variant;
    (ii) generating a first Pareto front membership of all molecular variants based on the one or more objectives for optimization;
    (iii) setting optimization parameters, wherein the optimization parameters comprise:
        (a) number nvar of molecular variants to create;
        (b) a molecular variant population size popSize;
        (c) a crossover rate crossrate;
        (d) number of generations to create nGen;
        (e) a mutation rate mutrate;
        (f) a fitness function comprising a penalty fitness function and an overall fitness function, wherein the penalty fitness is based on niche counting and the overall fitness function is based on the location of the molecular variant in a descending Pareto front divided by the shared number of molecular variants; and
        (g) specific molecular variants to be included in the nvar set of molecular variants;
    (iv) identifying a search space of acceptable molecular variants;
    (v) generating a random population of genomes from the search space of acceptable molecular variants;
    (vi) selecting a first set of genomes of size popSize from the random population, wherein each genome consisting of nvar molecular variants is created by applying a selection operator, a crossover operator, a mutation operator, a repair operator, and a fitness operator to the random population of genomes; and
    (vii) returning the genome with the highest fitness as the final, optimized, diverse nvar set of molecular variants.
Dependent claims: 23, 31.
24-29. (canceled)
Specification