EuroGP2006

9th European Conference on Genetic Programming

The annual EuroGP series are the premier conferences in Europe devoted entirely to genetic programming. The standard is high (only 32% of submissions accepted for oral presentation in 2005 and 54% if posters are taken into account) and reviewing is double-blind. The conference is a mixture of oral presentations and poster sessions. ALL accepted papers are published as papers in a volume of the Springer Lecture Notes in Computer Science, oral presentations get 12 pages and posters get 10 pages.

EuroGP conferences are always enjoyable and offer good opportunities for informal contact with fellow researchers in a friendly and relaxed setting. EuroGP2006 will be held Budapest, Hungary, in conjunction with EvoCOP2006 and EvoWorkshops2006.

High quality papers are sought on topics strongly related to genetic programming, ranging from theoretical work to innovative applications.

Web address: http://www.evonet.info/eurogp2006

Topics include

Organising Committee

Program Chairs
Pierre Collet
Pierre.Collet AT univ-littoral DOT fr
Laboratoire d'Informatique du Littoral, France
 
Marco Tomassini
Marco.Tomassini AT iismail DOT unil DOT ch
Universitey of Lausanne, Switzerland
 
Publication Chair
Marc Ebner
ebner AT informatik DOT uni-wuerzburg DOT de
Universität Würzburg, Germany
 
Local Chair
Anikó Ekárt
ekart AT sztaki DOT hu
Hungarian Academy of Sciences
 
Publicity Chair
Steven Gustafson
smg AT cs DOT nott DOT ac DOT uk
University of Nottingham, UK

Note: the e-mail addresses are masked for spam protection.


Accepted papers: titles and abstracts

802.11 De-authentication Attack Detection using Genetic Programming
LaRoche, Patrick and Zincir-Heywood, A. Nur

This paper presents a genetic programming approach to detect deauthentication attacks on wireless networks based on the 802.11 protocol. To do so we focus on developing an appropriate fitness function and feature set. Results show that the intrusion system developed not only performs incredibly well - 100 percent detection rate and 0.5 percent false positive rate - but also developed a solution that is general enough to detect similar attacks, such as disassociation attacks, that were not present in the training data.


A Divide & Conquer strategy for improving efficiency and probability of success in Genetic Programming
Fillon, Cyril and Bartoli, Alberto

A common method for improving a genetic programming search on difficult problems is either multiplying the number of runs or increasing the population size. In this paper we propose a new search strategy which attempts to obtain a higher probability of success with smaller amounts of computational resources. We call this model Divide & Conquer since our algorithm initially partitions the search space in smaller regions that are explored independently of each other. Then, our algorithm collects the most competitive individuals found in each partition and exploits them in order to get a solution. We benchmarked our proposal on three problem domains widely used in the literature. Our results show a significant improvement of the likelihood of success while requiring less computational resources than the standard algorithm.


A Genetic Programming Approach to Solomonoff's Probabilistic Induction
De Falco, Ivanoe, Della Cioppa, Antonio, Maisto, Domenico, and Tarantino, Ernesto

In the context of Solomonoff's Inductive Inference theory, Induction operator plays a key role in modeling and correctly predicting the behavior of a given phenomenon. Unfortunately, this operator is not algorithmically computable. The present paper deals with a Genetic Programming approach to Inductive Inference, with reference to Solomonoff's algorithmic probability theory, that consists in evolving a population of mathematical expressions looking for the `optimal' one that generates a collection of data and has a maximal a priori probability. Validation is performed on Coulomb's Law, on the Henon series and on the Arosa Ozone time series. The results show that the method is effective in obtaining the analytical expression of the first two problems, and in achieving a very good approximation and forecasting of the third.


A Less Destructive, Context-aware Crossover Operator for GP
Majeed, Hammad and Ryan, Conor
(Nominated for Best Paper Award)

Standard GP crossover is widely accepted as being a largely {\em destructive} operator, creating many poor offspring in the search for better ones. One of the major reasons for its destructiveness is its disrespect for the context of swapped subtrees in their respective parent trees when creating offspring. At times, this hampers GP's performance considerably, and results in populations with {\em low} average fitness values. Many attempts have been made to make it a more constructive crossover, mostly by preserving the context of the selected subtree in the offspring. Although successful at preserving context, none of these methods provide the opportunity to discover new and better contexts for exchanged subtrees. We introduce a {\em context-aware} crossover operator which operates by identifying all possible contexts for a subtree, and evaluating each of them. The context that produces the highest fitness is used to create a child which is then passed into the next generation. We have tested its performance on many benchmark problems. It has shown better results than the standard GP crossover operator, using either the same number or fewer individual evaluations. Furthermore, the average fitness of populations using this scheme improves considerably, and programs produced in this way are much smaller than those produced using standard crossover.


AQUAGP: Approximate QUery Answers Using Genetic Programming
Peltzer, Jason B., Teredesai, Ankur M., and Reinard, Garrett L.

Speed, cost, and accuracy are crucial performance parameters while evaluating the quality of information and query retrieval within any Database Management System. For some queries it may be possible to derive a similar result set using an approximate query answering algorithm or tool when the \textit{perfect/exact} results are not required. Query approximation becomes useful when the following conditions are true: (a) a high percentage of the relevant data is retrieved correctly, (b) irrelevant or extra data is minimized, and (c) an approximate answer (if available) results in significant (notable) savings in terms of the overall query cost and retrieval time. In this paper we discuss a novel approach for approximate query answering using Genetic Programming (GP) paradigms. We have developed an evolutionary computing based query space exploration framework which, given an input query and the database schema, uses tree-based GP to generate and evaluate approximate query candidates, automatically. We highlight and discuss various avenues of exploration and evaluate the success of our experiments based on the speed, cost, and accuracy of the results retrieved by the re-formulated (GP generated) queries and present the results on a variety of query types for TPC-benchmark and PKDD-benchmark datasets.


Blindbuilder : A new encoding to evolve Lego-like structures
Devert, Alexandre, Bredeche, Nicolas, and Schoenauer, Marc

This paper introduces a new representation for assemblies of small Lego-like elements: structures are indirectly encoded as construction plans. This representation shows some interesting properties such as hierarchy, modularity and easy constructibility checking by definition. Together with this representation, efficient GP operators are introduced that allow efficient and fast evolution, as witnessed by the results on two construction problems that demonstrate that the proposed approach is able to achieve both compactness and reusability of evolved components.


Dynamic Scheduling with Genetic Programming
Jakobovi\'c, Domagoj and Budin, Leo

This paper investigates the use of genetic programming in automatized synthesis of scheduling heuristics. The applied scheduling technique is priority scheduling, where the next state of the system is determined based on priority values of certain system elements. The evolved solutions are compared with existing scheduling heuristics for single machine dynamic problem and job shop scheduling with bottleneck estimation.


Emergent Generality of Adapted Locomotion Gaits of Simulated Snake-like Robot
Tanev, Ivan

In this work we consider the generality of locomotion gaits of simu-lated snake-like robot (Snakebot), adapted (via genetic programming, GP) to both (i) a challenging terrain and (ii) a partial mechanical damage. Discussing the emergence of common traits in these gaits, we elaborate on the strong cor-relation between their respective genotypes. We experimentally verify the gen-erality of the adapted gaits in different ?unexpected? environmental conditions and for various mechanical failures of the Snakebots. From an engineering standpoint, we suppose that in response to an eventual degradation of velocity, the Snakebot might activate a general locomotion gait, without the need to di-agnose and treat the concrete underlying reason for such degradation. We view this work as a step towards building real Snakebots, which are able to perform robustly in difficult environment.


Evolving crossover operators for function optimization
Diosan, Laura and Oltean, Miha
(Nominated for Best Paper Award)

A new model for evolving crossover operators for evolutionary function optimization is proposed in this paper. The model is a hybrid technique that combines a Genetic Programming (GP) algorithm and a Genetic Algorithm (GA). Each GP chromosome is a tree encoding a crossover operator used for function optimization. The evolved crossover is embedded into a standard Genetic Algorithm which is used for solving a particular problem. Several crossover operators for function optimization are evolved using the considered model. The evolved crossover operators are compared to the human-designed convex crossover. Numerical experiments show that the evolved crossover operators perform similarly or sometimes even better than standard approaches for several well-known benchmarking problems.


Genetic Programming, Validation Sets, and Parsimony Pressure
Gagn\'e, Christian, Schoenauer, Marc, Parizeau, Marc, and Tomassini, Marco

Fitness functions based on test cases are very common in Genetic Programming (GP). This process can be assimilated to a learning task, with the inference of models from a limited number of samples. This paper is an investigation on two methods to improve generalization in GP-based learning: 1) the selection of the best-of-run individuals using a three data sets methodology, and 2) the application of parsimony pressure in order to reduce the complexity of the solutions. Results using GP in a binary classification setup show that while the accuracy on the test sets is preserved, with less variances compared to baseline results, the mean tree size obtained with the tested methods is significantly reduced.


Geometric Crossover for Biological Sequences
Moraglio, Alberto, Poli, Riccardo, and Seehuus, Rolv

This paper extends a geometric framework for interpreting crossover and mutation to the case of sequences. This representation is important because it is the link between artificial evolution and biological evolution. We define and theoretically study geometric crossover for sequences under edit distance and show its intimate connection with the biological notion of sequence homology.


Incentive Method to Handle Constraints in Evolutionary Algorithms with a Case Study
Tsang, Edward and Jin, Nanlin

This paper introduces Incentive Method to handle both hard and soft constraints in an evolutionary algorithm for solving some multi-constraint optimization problems. The Incentive Method uses hard and soft constraints to help allocating heuristic search effort more effectively. The main idea is to modify the objective fitness function by awarding differential incentives according to the defined qualitative preferences, to solution sets which are divided by their satisfaction to constraints. It does not exclude the right to access search spaces that violate some or even all constraints. We test this technique through its application on generating solutions for a classic infinite-horizon extensive-form game. It is solved by an Evolutionary Algorithm incorporated by Incentive method. Experimental results are compared with results from a penalty method and from a non-constraint setting. Statistic analysis suggests that Incentive Method is more effective than the other two techniques for this specific problem.


Iterative Filter Generation using Genetic Programming
Segond, Marc, Robilliard, Denis, and Fonlupt, Cyril

Oceanographers from the IFREMER institute have an hypothesis that the presence of so-called retentive meso-scale vortices in ocean and coastal waters could have an influence on watery fauna's demography. Up to now, identification of retentive hydro-dynamical structures on stream maps has been performed by experts using background knowledge about the area. We tackle this task with filters induced by Genetic Programming, a technique that has already been successfully used in pattern matching problems. To overcome specific difficulties associated with this problem, we introduce a refined scheme that iterates the filters classification phase while giving them access to a memory of their previous decisions. These iterative filters achieve superior results and are compared to a set of other methods.


Iterative Prototype Optimisation with Evolved Improvement Steps
Kubalik, Jiri and Faigl, Jan

Evolutionary algorithms have already been more or less successfully applied to a wide range of optimisation problems. Typically, they are used to evolve a population of complete candidate solutions to a given problem, which can be further refined by some problem-specific heuristic algorithm. In this paper, we introduce a new framework called {\it Iterative Prototype Optimisation with Evolved Improvement Steps}. This is a general optimisation framework, where an initial prototype solution is being improved iteration by iteration. In each iteration, a sequence of actions/operations, which improves the current prototype the most, is found by an evolutionary algorithm. The proposed algorithm has been tested on problems from two different optimisation problem domains - binary string optimisation and the traveling salesman problem. Results show that the concept can be used to solve hard problems of big size reliably achieving comparably good or better results than classical evolutionary algorithms and other selected methods.


Learning Recursive Functions with Object Oriented Genetic Programming
Agapitos, Alexandros and Lucas, Simon M.
(Nominated for Best Paper Award)

This paper describes the evolution of recursive functions within an Object-Oriented Genetic Programming (OOGP) system. We evolved general solutions to factorial, Fibonacci, exponentiation, even-n-Parity, and nth-3. We report the computational effort required to evolve these methods and provide a comparison between crossover and mutation variation operators, and also undirected random search. We found that the evolutionary algorithms performed much better than undirected random search, and that mutation outperformed crossover on most problems.


Negative Slope Coefficient. A Measure to Characterize Genetic Programming Fitness Landscapes
Vanneschi, Leonardo, Tomassini, Marco, Collard, Philippe, and Verel, Sebastien

Negative slope coefficient has been recently introduced and empirically proven a suitable hardness indicator for some well known genetic programming benchmarks, such as the even parity problem, the binomial-3 and the artificial ant on the Santa Fe trail. Nevertheless, the original definition of this measure contains several limitations. This paper points out some of those limitations, presents a new and more relevant definition of the negative slope coefficient and empirically shows the suitability of this new definition as a hardness measure for some genetic programming benchmarks, including the multiplexer, the intertwined spirals problem and the royal trees.


Population Clustering in Genetic Programming
Xie, Huayang, Zhang, Mengjie, and Andreae, Peter

This paper proposes an approach to reducing the cost of fitness evaluation whilst improving the effectiveness in Genetic Programming (GP). In our approach, the whole population is first clustered by a heuristic called fitness-case-equivalence. Then a cluster representative is selected for each cluster. The fitness value of the representative is calculated on all training cases. The fitness is then directly assigned to other members in the same cluster. Subsequently, a clustering tournament selection method replaces the standard tournament selection method. A series of experiments were conducted to solve a symbolic regression problem, a binary classification problem, and a multi-class classification problem. The experiment results show that the new GP system significantly outperforms the standard GP system on these problems.


Projecting Financial Data using Genetic Programming in Classification and Regression Tasks
Est\'ebanez, C\'esar, Valls, Jos\'e M., and Aler, Ricardo

The use of Constructive Induction (CI) methods for the generation of high-quality attributes is a very important issue in Machine Learning. In this paper, we present a CI method based in Genetic Programming (GP). This method is able to evolve projections that transform the dataset, constructing a new coordinates space in which the data can be more easily predicted. This coordinates space can be smaller than the original one, achieving two main goals at the same time: on one hand, improving classification tasks; on the other hand, reducing dimensionality of the problem. Also, our method can handle classification and regression problems. We have tested our approach in two financial prediction problems because their high dimensionality is very appropriate for our method. In the first one, GP is used to tackle prediction of bankruptcy of companies (classification problem). In the second one, an IPO Underpricing prediction domain (a classical regression problem) is confronted. Our method obtained in both cases competitive results and, in addition, it drastically reduced dimensionality of the problem.


Solving Sudoku with the GAuGE System
Nicolau, Miguel and Ryan, Conor

This paper presents an evolutionary approach to solving Sudoku puzzles. Sudoku is an interesting problem because it is a challenging logical puzzle that has previously only been solved by computers using various brute force methods, but it is also an abstract form of a timetabling problem, and is scalably difficult. A different take on the problem, motivated by the desire to be able to generalise it, is presented. The GAuGE system was applied to the problem, and the results obtained show that its mapping process is well suited for this class of problems.


The Halting Probability in von Neumann Architectures
Langdon, W. B. and Poli, Riccardo

Theoretical models of Turing complete linear genetic programming (GP) programs suggest the fraction of halting programs is vanishingly small. Convergence results proved for an idealised machine, are tested on a small T7 computer with (finite) memory, conditional branches and jumps. Simulations confirm Turing complete fitness landscapes of this type hold at most a vanishingly small fraction of usable solutions.


Using Subtree Crossover Distance to Investigate Genetic Programming Dynamics
Vanneschi, Leonardo, Gustafson, Steven, and Mauri, Giancarlo
(Nominated for Best Paper Award)

To analyse various properties of the search process of genetic programming it is useful to quantify the distance between two individuals. Using operator-based distance measures can make this analysis more accurate and reliable than using distance measures which have no relationship with the genetic operators. This paper extends a recent definition of a distance measure based on subtree crossover for genetic programming. Empirical studies are presented that show the suitability of this measure to dynamically calculate the fitness distance correlation coefficient during the evolution, to construct a fitness sharing system for genetic programming and to measure genotypic diversity in the population. These experiments confirm the accuracy of the new measure and its consistency with the subtree crossover genetic operator.


Characterizing Diversity in Genetic Programming
Wyns, Bart, De Bruyne, Peter, and Boullart, Luc

In many evolutionary algorithms candidate solutions run the risk of getting stuck in local optima after a few generations of optimization. In this paper two improved approaches to measure population diversity are proposed and validated using two traditional test problems in genetic programming literature. Code growth gave rise to improve pseudo-isomorph measures by eliminating non-functional code using an expression simplifier. Also, Rosca's entropy to measure behavioral diversity is updated to cope with problems producing a more continuous fitness value. Results show a relevant improvement with regard to the original diversity measures.


Complexity and Cartesian Genetic Programming
Woodward, John Robert

Genetic Programming (GP) \cite{banzhaf:1997:book} often uses a tree form of a graph to represent solutions. An extension to this representation, Automatically Defined Functions (ADFs) is to allow the ability to express modules. In Woodward we proved that the complexity of a function is independent of the primitive set (function set and terminal set) if the representation has the ability to express modules. This is essentially due to the fact that if a representation can express modules, then it can effectively define its own primitives at a constant cost. Cartesian Genetic Programming (CGP) is a relative new type of representation used in Evolutionary Computation (EC), and differs from the tree based representation in that outputs from previous computations can be reused. This is achieved by representing programs as directed acyclic graphs (DAGs), rather than as trees. Thus computations from subtrees can be reused to reduce the complexity of a function. We prove an analogous result to that in Woodward the complexity of a function using a (Cartesian Program) CP representation is independent of the terminal set (up to an additive constant), provided the different terminal sets can both be simulated. This is essentially due to the fact that if a representation can express Automatic Reused Outputs \cite{miller:2000:CGP}, then it can effectively define its own terminals at a constant cost.


Design of Robust Communication Systems Using Genetic Algorithms
Ou, Chien-Min, Hwang, Wen-Jyi, and Yung, Hung-Chuan

This paper presents a novel genetic algorithm for jointly optimizing source and channel codes. The algorithm uses a channel-optimized vector quantizer for the source code, and a rate-punctured convolutional code for the channel code. The genetic algorithm enhances the robustness of the rate-distortion performance of the channel-optimized vector quantizer, and reduces the computational time for finding the best rate-punctured convolutional code. Numerical results show that the algorithm attains near optimal performance while having low computational complexity.


Developmental Evaluation in Genetic Programming: the Preliminary Results
McKay, Robert Ian, Hoang, Tuan Hao, Essam, Daryl Leslie, and Nguyen, Xuan Hoai

Abstract. This paper investigates developmental evaluation in Genetic Pro-gramming (GP). Extant GP systems, including developmental GP systems, typically exhibit modular and hierarchical structure only to the degree it is built-in by the designer; by contrast, biological systems exhibit a high degree of or-ganization in their genotypes. We hypothesise that even when GP systems are subject to changing environments, for which the adaptability arising from modular structure would be advantageous, the benefit is at the species rather than individual level, so that selection is very weak. By contrast, biological sys-tems are selected repeatedly throughout their development process. We suggest that this difference is crucial; that if an individual is evaluated multiple times throughout its development, then modular structure can provide an adaptive ad-vantage to that individual, and hence can be selected for by evolution. We in-vestigate this hypothesis using Tree Adjoining Grammar Guided Genetic Pro-gramming (TAG3P), which has good properties for supporting evaluation during incremental development. Our preliminary results show that develop-mental TAG3P outperforms both original TAG3P and standard tree-based GP on an appropriate problem, in ways which suggest that modular solutions may have been developed.


Evolving noisy oscillatory dynamics in genetic regulatory networks
Leier, Andr\'{e}, Kuo, P. Dwight, Banzhaf, Wolfgang, and Burrage, Kevin

We introduce a genetic programming (GP) approach for evolving genetic networks that demonstrate desired dynamics when simulated as a discrete stochastic process. Our representation of genetic networks is based on a biochemical reaction model including key elements such as transcription, translation and post-translational modifications. The stochastic, reaction-based GP system is similar but not identical with algorithmic chemistries. We evolved genetic networks with noisy oscillatory dynamics. The results show the practicality of evolving particular dynamics in gene regulatory networks when modelled with intrinsic noise.


Information-Dependent Switching of Identification Criteria in a Genetic Programming System for System Identification
Buchsbaum, Thomas and V\"ossner, Siegfried

Genetic Programming (GP) can be used to identify the nonlinear differential equations of dynamical systems. If, however, the fitness function is chosen in a classical way, the optimization will not work very well. In this article, we explain the reasons for the failure of the GP approach and present a solution strategy for improving performance. Using more than one identification criterion (fitness function) and switching based on the information content of the data enable standard GP algorithms to find better solutions in shorter times. A computational example illustrates that identification criteria switching has a bigger influence on the results than the choice of the GP parameters has.


Invariance of Function Complexity under Primitive Recursive Functions.
Woodward, John Robert

Genetic Programming (GP) \cite{banzhaf:1997:book} often uses a tree form of a graph to represent solutions. An extension to this representation, Automatically Defined Functions (ADFs) is to allow the ability to express modules. In Woodward we proved that the complexity of a function is independent of the primitive set (function set and terminal set) if the representation has the ability to express modules. This is essentially due to the fact that if a representation can express modules, then it can effectively define its own primitives at a constant cost. This is reminiscent of the result that the complexity of a bit string is independent of the choice of Universal Turing Machine (UTM) (within an additive constant), the constant depending on the UTM but not on the function. The representations typically used in GP are not capable of expressing recursion, however a few researchers have introduced recursion into their representations. These representations are then capable of expressing a wider classes of functions, for example the primitive recursive functions (PRFs). We prove that given two representations which express the PRFs (and only the PRFs), the complexity of a function with respect to either of these representations is invariant within an additive constant. This is in the same vein as the proof of the invariants of Kolmogorov complexity and the proof in Woodward.


On the Locality of Grammatical Evolution
Rothlauf, Franz and Oetzel, Marie

This paper investigates the locality of the genotype-phenotype mapping (representation) used in grammatical evolution (GE). The results show that the representation used in GE has problems with locality as many neighboring genotypes do not correspond to neighboring phenotypes. Experiments with a simple local search strategy reveal that the GE representation leads to lower performance for mutation-based search approaches in comparison to standard GP representations. The results suggest that locality issues should be considered for further development of the representation used in GE.


Optimizing the Initialization of Dynamic Decision Heuristics in DPLL SAT Solvers Using Genetic Programming
Kibria, Raihan H. and Li, You

The Boolean satisfiability problem (SAT) has many applications in electronic design automation (EDA) as well as theoretical computer science. Most SAT solvers for EDA problems use the DPLL algorithm and conflict analysis dependent decision heuristics. When the search starts, the heuristics have little or no information about the structure of the CNF. In this work, an algorithm for initializing dynamic decision heuristics is evolved using genetic programming. The open-source SAT solver MiniSAT v1.12 is used. Using the best algorithm evolved, an advantage was found for solving unsatisfiable EDA SAT problems.


P-CAGE: An Environment for Evolutionary Computation in Peer-to-Peer Systems
Folino, Gianluigi and Spezzano, Giandomenico

Solving complex real-world problems using evolutionary computation is a CPU time-consuming task that requires a large amount of computational resources. Peer-to-Peer (P2P) computing has recently revealed as a powerful way to harness these resources and efficiently deal with such problems. In this paper, we present a P2P implementation of Genetic Programming based on the JXTA technology. To run genetic programs we use a distributed environment based on a hybrid multi-island model that combines the island model with the cellular model. Each island adopts a cellular genetic programming model and the migration occurs among neighboring peers. The implementation is based on a virtual ring topology. Three different termination criteria (effort, time and max-gen) have been implemented. Experiments on some popular benchmarks show that the approach presents a accuracy at least comparable with classical distributed models, retaining the obvious advantages in terms of decentralization, fault tolerance and scalability of P2P systems.


Positional Independence and Recombination in Cartesian Genetic Programming
Cai, Xinye, Smith, Stephen L., and Tyrrell, Andy M.

Previously, recombination (or crossover) has proved to be unbeneficial in Cartesian Genetic Programming (CGP). This paper describes the implementation of an implicit context representation for CGP in which the specific location of genes within the chromosome has no direct or indirect influence on the phenotype. Consequently, recombination has a beneficial effect and is shown to outperform conventional CGP in the even-3 parity problem.