Chapter 11
Introducing the Genetic Algorithm
The magical algorithm
It was not until the 1980's that science realised the full subtlety of nature's evolutionary selection process. The break through came when a researcher at MIT, John Holland, began to question the direction that main stream AI (artificial intelligence) was taking and began to look at biological systems for a new solution. His insight was to realise that the creation of human intelligence hadn't been based upon any planned system of organisation. It had resulted through the process of evolution: a blind process of trial and error.
This non planning process had created millions of different organisms, whose complexity far exceeded anything that had ever been devised by human intelligence. It had also resulted in the creation of the human brain: the source of the intelligence that everyone was trying to emulate with computer systems.
The enigma John Holland faced was that the probability of objects as complex as biological organisms in general and the human brain in particular being produced by chance were astronomically large. The mathematics would seem to suggest that it would take longer than the age of the universe to produce even the simplest of organisms, let alone human beings. Yet, it had happened, so there had to be something going on, some mysterious mechanism, that had yet to be explained. John Holland spent five years finding out what that mechanism was.
After studying what was then known about evolutionary biology, he reduced the process down to an abstract form that he could model on a computer. He created simple organisms where the genes were represented by sequences of binaries (noughts and ones). He then created a hypothetical perfect organism (consisting of a random combination of these noughts and ones). With his model - a computer program which he called a Genetic Algorithm - he experimented to see how he could get organisms, consisting of different randomly selected sequences of noughts and ones, to evolve into this perfect form through a process of blind chance.
This can be likened to throwing 25 dice to create a hypothetical perfect organism. Then, working out a way for that same sequence to come up again by throwing the dice in different ways. Mathematically, it seems impossible, but nature was doing it and John Holland was trying to see how nature was performing this impossible trick.
John Holland designed his computer program to emulate the process of biological evolution as closely as possible. This required working with an expanding population that could be periodically culled to select the best for survival and allow the worst to die. The selection process was simply a method of choosing which of the pseudo organisms came nearest to his designated perfect specimen. Those selected in this way were then arranged to breed.
The breeding process consisted of putting the selected survivors into pairs; then making a randomly chosen division in each pair and swapping over the genes at that division. Figure 11.1 illustrates one pair of organisms creating two offspring by each splitting at point 5.
Figure 11.1
Breeding process used in a genetic algorithm. Each pair can be broken at several different places to produce different pairs of offspring. This shows two offspring being created by swapping genes at point 5
This division and gene swapping is repeated for each pair several times, with a different division of the genes chosen randomly at each recombination. This creates the new generation of organisms which will then consist of different mixes of the gene combinations that proved most successful in the previous generation (Note: mysteriously, the odd random change to one or other of the genes during the exchanges - mutations - greatly improves the efficiency of the process).
This pairing and swapping of strings of ones and noughts may seem arcane, but, it exactly simulates the mating of individual organisms in a population where the most successful of a generation breed offspring for the next generation. It emulates what appears to happen in most kinds of mating, where groups of male and female genes randomly combine to create the offspring. This same process is then repeated with the offspring. These are tested for their nearness to the "perfect organism" and the selected individual are again paired and arranged to swap genes to produce a new generation. This can then be repeated over any number of generations
John Holland soon found that this incredibly simple process caused the "organisms" to rapidly evolve towards the exact combination of noughts and ones that had been chosen as the sequence for the "perfect organism". Magically, the impossibility - of a long random sequence rapidly turning into a specific sequence - was now not only a possibility, it was demonstrable.
This was a revelation. This computer simulation of the mating procedures used by nature was effectively providing a speedy way to find optimum combinations of large numbers of variables. More discoveries were to come. It was found that not only could a combination of independent variable be quickly and efficiently found, the process could be used where there were dependent variables (where the value of one variable would have an influence on the value of another). These genetic algorithms could even discover the rules that linked the values of dependent variables. Previously, this kind of problem solving had proved intractable, even when using the most sophisticated of mathematics.
Within a very short time of John Holland publishing his research results, his technique of optimisation (the Genetic Algorithm) was finding applications in all kinds of technical areas, where previously the right combination of variables had proven too complex to find by rational analysis. Today, Genetic Algorithms are essential tools for many scientific and business applications and are even built into conventional spread sheet programs.
To summarise: the technique discovered by John Holland can solve problems where there are large numbers of unknowns. Isn't this what creating e-business solutions is all about?