Gene expression programming algorithm
First Claim
1. An apparatus for generating a program that solves a technical problem, the apparatus comprising:
- a computer configured by software to;
generate an initial population of representations of programs that include a plurality of prime numbers; and
evolve a succession of generations of representations of programs derived from the initial population;
for one or more of said succession of generations, test programs represented by said representations of programs and when a fitness criteria that depends, at least, on an ability of said programs represented by said representations of programs to solve said technical problem is satisfied output one or more programs that satisfy said fitness criteria.
2 Assignments
0 Petitions
Accused Products
Abstract
A gene expression programming genetic algorithm for performing symbolic regression is provided. The algorithm avoids expression bloating and over fitting by employing a fitness function that depends inversely on the mathematical expression complexity. Members of a population that are evolved by the algorithm are represented as a set arrays (e.g., in the form of a matrix) of indexes that reference operands and operators, thus facilitating selection, mutation, and cross over operations conducted in the course of evolving the population. The algorithm comprises a syntax checking part that may be applied to population members without their having to be converted to executable programs first. An object-oriented programming language data structure is providing for encapsulating basic data for each codon (e.g., operand, operator) used by the algorithm.
12 Citations
31 Claims
-
1. An apparatus for generating a program that solves a technical problem, the apparatus comprising:
-
a computer configured by software to; generate an initial population of representations of programs that include a plurality of prime numbers; and evolve a succession of generations of representations of programs derived from the initial population; for one or more of said succession of generations, test programs represented by said representations of programs and when a fitness criteria that depends, at least, on an ability of said programs represented by said representations of programs to solve said technical problem is satisfied output one or more programs that satisfy said fitness criteria. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. An apparatus for generating a program that solves a technical problem, the apparatus comprising:
-
a computer configured by software to; recursively generate a succession of populations of representations of programs, wherein each representation includes; identifications of one or more operators selected from a set of operators; and identifications of one or more operands selected from a set of operands that includes constants and variables; wherein each operand in the set of operands, and each operator in the set of operators is associated with a cost; for each kth population in the succession of populations generate a set of programs represented in the kth population of representations; derive a first measure of fitness of each program in the set of programs by a process including; running the program on one or more sets of test data and comparing output values to predetermined values to obtain a measure of residual error; derive a second measure of fitness of each program in the set of programs by a process including; summing the cost of each operator and operand in the program; selectively duplicating individual representations of programs in each successive generation based on the first and second measures of fitness; and when a combined fitness criteria that depends on said first and second measures of fitness is satisfied output one or more programs that satisfy said combined fitness criteria. - View Dependent Claims (8)
-
-
9. An apparatus for generating a program that solves a technical problem, the apparatus comprising:
-
a computer configured by software to; represent each of a succession of populations of programs as a set of arrays of indexes, wherein each array represents a program and one or more indexes in the set of arrays of indexes point to representations of program elements selected from a group of elements consisting of operators and operands; for each kth population of the succession of populations; generate a program represented by each of one or more of the set of arrays in the population; derive a measure of fitness of each program; select one or more high fitness arrays from the set of arrays based on the measure of fitness for carry over to a (k+1)th generation; perform a cross over operation between one or more of the high fitness arrays carried over to the (k+1)th generation; when said measure of fitness of a particular program satisfies a fitness criteria output said particular program. - View Dependent Claims (10, 11, 12)
-
-
13. An apparatus for generating a program that solves a technical problem, the apparatus comprising:
-
a computer configured by software to; recursively generate a succession of generations of a population of sequences of symbols wherein each sequence of symbols represents a program; for each generation determining if each sequence of symbols can be translated into a viable program by; for each sequence of symbols initializing a required length variable; starting with a first symbol of each sequence, and for a succession of kth symbols within each sequence; increasing the required length variable by a number of symbols necessitated by syntax rules to form a viable operation with the kth symbol; in the case that the required length variable after incrementing is equal to k reporting valid syntax; and in the case that the required length variable when added to k exceeds a maximum sequence length reporting invalid syntax. - View Dependent Claims (14, 15)
-
-
16. A computer readable medium containing programming instructions for determining a mathematical expression that describes a set of real world data generating a program that solves a technical problem, including programming instructions for:
-
generating an initial population of representations of mathematical programs that include a plurality of prime numbers; and evolving a succession of generations derived from the initial population; for one or more of said succession of generations, testing programs represented by said representations of programs and when a fitness criteria that depends, at least, on an ability of said programs represented by said representations of programs to solve said technical problem is satisfied outputting one or more programs that satisfy said fitness criteria. - View Dependent Claims (17, 18, 19, 20, 21)
-
-
22. A computer readable medium containing programming instructions for generating a program that solves a technical problem, including programming instructions for:
-
recursively generating a succession of populations of representations of programs, wherein each representation includes; identifications of one or more operators selected from a set of operators; and identifications of one or more operands selected from a set of operands that includes constants and variables; wherein each operand in the set of operands, and each operator in the set of operators is associated with a cost; for each kth population in the succession of populations generating a set of programs represented in the kth population of representations; deriving a first measure of fitness of each program in the set of programs by a process including; running the program on one or more sets of test data and comparing output values to predetermined values to obtain a measure of residual error; deriving a second measure of fitness of each program in the set of programs by a process including; summing the cost of each operator and operand in the program; selectively duplicating individual representations of programs in each successive generation based on the first and second measures of fitness; and when a combined fitness criteria that depends on said first and second measures of fitness is satisfied outputting one or more programs that satisfy said combined fitness criteria. - View Dependent Claims (23)
-
-
24. A computer readable medium containing programming instructions for generating a program that solves a technical problem, including programming instructions for:
-
representing each of a succession of populations of programs as a set of arrays of indexes, wherein each array represents a program and one or more indexes point to representations of a program element selected from a group of elements consisting of operators and operands; for each kth population of the succession of populations; generating a program represented by each array in the population; deriving a measure of fitness of each generated program; selecting one or more high fitness arrays based on the measure of fitness for carry over to a (k+1)th generation; performing a cross over operation between one or more of the high fitness arrays carried over to the (k+1)th generation; when said measure of fitness of a particular program satisfies a fitness criteria outputting said particular program. - View Dependent Claims (25, 26, 27)
-
-
28. A computer readable medium containing programming instructions for generating a program that solves a technical problem, including programming instructions for:
recursively generating a succession of generations of a population of sequences of symbols wherein each sequence of symbols represents a program; for each generation determining if each sequence of symbols can be translated into a viable program by; for each sequence of symbols initializing a required length variable; starting with a first symbol of each sequence, and for a succession of kth symbols within each sequence;
increasing the required length variable by a number of symbols necessitated by syntax rules to form a viable operation with the kth symbol;
in the case that the required length variable after incrementing is equal to k reporting valid syntax; and
in the case that the required length variable when added to k exceeds a maximum sequence length reporting invalid syntax.- View Dependent Claims (29, 30)
-
31. A computer readable medium having stored there on a data structure comprising:
an objected oriented programming language class including; a codon name instance variable; a codon type instance variable for storing a first number that is indicative of a number of codons that are to be associated with a codon specified in the codon name instance variable; a codon value instance variable for storing a value associated with the codon specified in the codon name instance variable; a codon cost instance variable for storing a second number that is indicative of a degree to which an operator or operand specified by the codon specified in the codon name instance variable increase complexity of a mathematical expression; and a codon index instance variable for storing an index associated with each instance of the object-oriented programming language class.
Specification