Monday 8 April 2019

GAs a Solution Solution

No, not a typo. GAs have typically been set to work to generate a solution to a fixed problem.

My idea is to create a solution that generates solutions. For example:

Using the MNIST dataset of handwritten digits, you could breed a network that recognises digits: breed networks until the connections and weights define a network that solves the problem, and optimise it for the smallest number of connections and nodes. This, however, is not a general solution to building recognition networks; it is specific to the data set shown. More useful and interesting would be a set of rules that can be applied to build a network from the data set in real time: a GA solution that creates a network that adapts to the data supplied. Such a network might then be applied to shape recognition or general handwriting-to-text conversion.
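A fitness function for this kind of search might, as a sketch, reward recognition accuracy while penalising network size so that breeding is pushed towards the smallest network that still solves the problem. The penalty weightings here are invented for illustration:

```python
def network_fitness(accuracy, n_connections, n_nodes,
                    conn_penalty=0.001, node_penalty=0.01):
    """Hypothetical combined score: reward recognition accuracy,
    penalise the number of connections and nodes so the GA favours
    the smallest network that still solves the problem."""
    return accuracy - conn_penalty * n_connections - node_penalty * n_nodes

# A smaller network with the same accuracy scores higher:
small = network_fitness(0.95, n_connections=200, n_nodes=20)
large = network_fitness(0.95, n_connections=800, n_nodes=50)
```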

One of my early areas of interest was using a GA to create a compression algorithm. I succeeded in creating compressed versions of image files, but each compression was specific to the file and took many generations to breed. A better use of GAs, similar to that described above, would be to have the GA find a generic compression solution. This would require a data set consisting of multiple files of different types to compress, and a process with two parts: a compression cycle, where the individual can see the source file and develop a compressed data set, and a decompression cycle, where the source file is not available and must be reconstructed from the compressed data.
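One way the two cycles could feed a single score, as a sketch: reward a smaller compressed data set, and heavily penalise any byte the decompression cycle fails to reconstruct. The error weighting is an assumption:

```python
def compression_score(source, compressed, reconstructed, error_weight=10):
    """Score one individual over both cycles: size gained by compression,
    minus a heavy penalty for each reconstruction error."""
    errors = sum(a != b for a, b in zip(source, reconstructed))
    errors += abs(len(source) - len(reconstructed))   # length mismatch
    size_gain = len(source) - len(compressed)
    return size_gain - error_weight * errors

src = b"aaaaabbbbb"
lossless = compression_score(src, b"5a5b", src)       # perfect rebuild
lossy = compression_score(src, b"5a", b"aaaaa")       # half the file lost
```

Under this scoring a lossless individual always beats one that compresses harder but cannot reconstruct the source.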

Digital Genes

In living things, genes generate the structures of life from a set of basic building blocks that can be combined and tweaked in multiple ways to create the structures that allow creatures to survive in their environment.

In GAs, the digital gene sequence needs to be translated into a function that performs successfully in the problem space. The question is: what are the basic building blocks that should be used, and how should they be combined and tweaked?

In living things, the protein is the smallest unit that is combined and tweaked. What is the GA version of this?

My current thought is that these basic units could themselves be discovered using GAs. By setting the problem environment to a random state, various combinations of micro units could be used to create functions and scored on the amount of change they cause to the environment state, either positive or negative, thereby creating a set of building blocks that are not inert (are reactive) when used to try and solve a problem.

The micro units would be functions like boolean conditions, boolean functions, arithmetic operations, and the storage and retrieval of values.
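As a sketch, the micro units could be a table of tiny functions, with each candidate gene naming a unit plus the state slots it reads and writes. All the names and the gene layout here are invented for illustration:

```python
# Micro units: boolean conditions, boolean functions, arithmetic, and a
# copy unit standing in for storage/retrieval. Each unit reads two slots
# of the environment state; the result is written to a third slot.
MICRO_UNITS = {
    "gt":  lambda x, y: float(x > y),                 # boolean condition
    "and": lambda x, y: float(bool(x) and bool(y)),   # boolean function
    "add": lambda x, y: x + y,                        # arithmetic
    "mul": lambda x, y: x * y,
    "get": lambda x, y: x,                            # retrieval / copy
}

def run(program, env):
    """Apply a sequence of (unit, in_a, in_b, out) genes to the state."""
    env = list(env)
    for name, a, b, out in program:
        env[out] = MICRO_UNITS[name](env[a], env[b])
    return env

state = [1.0, 2.0, 0.0, 0.0]
result = run([("add", 0, 1, 2), ("gt", 2, 0, 3)], state)
```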

If this proves unsuccessful, I may try using functions as the GA alternative to proteins.

Wednesday 27 February 2019

Genetic Algorithms as a route to Artificial Intelligence

"Genetic Algorithms as a route to Artificial Intelligence" That was the title of my undergraduate dissertation at the end of my Psychology degree over 25 years ago. It was not my greatest piece of work and did its part in getting me the degree I deserved :-)

Since then, I've occasionally dipped back into the area of GAs (Genetic Algorithms) to try to improve them, and occasionally to solve complex problems. Once you have created all the elements, they almost magically find solutions to complex problems.

Terms:
Environment - The problem space
Individual - A possible solution to be tried in the problem space
vDNA (virtual DNA) - A string of data that defines an individual
Gene - A unit of the data that makes up the vDNA
Transcription - The process where vDNA is used to make an individual
Mutation - A random change made to the vDNA at transcription
Breed - Combine elements from two individuals to create a new child individual
Solution Score - A numerical measure of the success of a solution (its health)
Health Velocity - The rate at which the population's score is increasing
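The terms above fit together in a loop something like this. A minimal sketch against an invented problem (match a fixed target vector), with trivial transcription:

```python
import random

TARGET = [0.2, 0.8, 0.5, 0.1]     # the environment: match this vector

def transcribe(vdna):
    """Transcription: vDNA -> individual (trivially the same list here)."""
    return list(vdna)

def score(individual):
    """Solution score (health): higher is better, 0 is perfect."""
    return -sum((g - t) ** 2 for g, t in zip(individual, TARGET))

def mutate(vdna, rate=0.1):
    """Mutation: a random change made to the vDNA at transcription."""
    return [g + random.gauss(0, 0.1) if random.random() < rate else g
            for g in vdna]

def breed(a, b):
    """Breed: combine elements from two parents into a child."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

random.seed(1)
population = [[random.random() for _ in TARGET] for _ in range(20)]
for generation in range(100):
    population.sort(key=lambda v: score(transcribe(v)), reverse=True)
    parents = population[:10]                     # breeding selection
    children = [mutate(breed(random.choice(parents),
                             random.choice(parents))) for _ in range(10)]
    population = parents + children

best = max(population, key=lambda v: score(transcribe(v)))
```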

GAs have several drawbacks that I have tried to address:

- First, they do not work well on linear problems where one step must follow another to reach a better solution. My current attempt to solve this is to start the individual at random points in the problem space, to test individuals with the environment in random starting states, or to test them at random points in a solution where this is possible.

REAL EXAMPLES HERE
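In the meantime, the shape of that evaluation can be sketched: score each individual from several random starting states and average, so the later steps of a linear problem are exercised even before the earlier steps have been solved. The state size and the scoring hook are invented:

```python
import random

def evaluate(individual, run_from_state, n_starts=5, state_size=4, seed=42):
    """Average an individual's score over several random starting states,
    so a linear problem is sampled at many points, not just the start."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_starts):
        state = [rng.random() for _ in range(state_size)]
        total += run_from_state(individual, state)
    return total / n_starts

# Toy scorer: reward an individual (a single value) for matching slot 0.
toy = lambda ind, state: 1.0 - abs(ind - state[0])
avg = evaluate(0.5, toy)
```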

- Secondly, mutations can happen anywhere in the sequence of vDNA that describe a possible solution. I am attempting to tackle this by adding an extra variable to each gene in the vDNA that defines its mutability. The mutability value itself can mutate, and I hope that this will enable individuals to evolve with essential elements of their solution protected from mutation.

REAL EXAMPLES HERE
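As a sketch, each gene can carry its own mutability alongside its value, and the mutability can itself drift. The rates below are invented:

```python
import random

def mutate(vdna, rng=random, meta_rate=0.05):
    """Each gene is a (value, mutability) pair. A gene's value mutates
    with probability equal to its own mutability, and the mutability can
    itself mutate, so essential regions of vDNA can evolve protection."""
    child = []
    for value, mutability in vdna:
        if rng.random() < mutability:
            value += rng.gauss(0, 0.1)
        if rng.random() < meta_rate:          # mutability can mutate too
            mutability = min(1.0, max(0.0, mutability + rng.gauss(0, 0.05)))
        child.append((value, mutability))
    return child

# A gene with mutability 0.0 keeps its value this round; a gene with
# mutability 0.9 is very likely to change.
vdna = [(1.0, 0.0), (2.0, 0.9)]
child = mutate(vdna, random.Random(0))
```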

I've also investigated whether you can use a GA to breed a better GA. GAs have a number of variables and processes that it should be possible to optimise, e.g. mutation rate, breeding selection, crossover rules, and variables that could be used to modify these rules, such as the rate of increase of score (population health velocity).

RESULTS HERE
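One way to frame the meta-GA, as a sketch: the outer GA's fitness for a set of inner-GA settings is the health velocity of an inner run on a toy problem. Everything here (the problem, population size, settings) is invented for illustration:

```python
import random

def inner_ga_velocity(mutation_rate, elite_frac, generations=30, seed=0):
    """Run a toy inner GA (match a fixed target vector) with the given
    settings and return its health velocity: score gained per generation.
    An outer GA could breed (mutation_rate, elite_frac) pairs on this."""
    rng = random.Random(seed)
    target = [0.3, 0.7, 0.5]
    score = lambda v: -sum((g - t) ** 2 for g, t in zip(v, target))
    pop = [[rng.random() for _ in target] for _ in range(16)]
    start = max(map(score, pop))
    n_elite = max(1, int(len(pop) * elite_frac))
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[:n_elite]
        pop = elite + [
            [g + rng.gauss(0, 0.1) if rng.random() < mutation_rate else g
             for g in rng.choice(elite)]
            for _ in range(len(pop) - n_elite)
        ]
    return (max(map(score, pop)) - start) / generations

velocity = inner_ga_velocity(mutation_rate=0.2, elite_frac=0.25)
```

Because the elite always survive, the velocity can never be negative; the outer GA would be searching for the settings that make it largest.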

- Thirdly, the building blocks of the individual constrain the solutions possible. Living organisms use the versatile protein to make all manner of complex solutions to problems; a GA does not share this constraint, so what should the elements of a virtual protein in a GA be?
Some suggestions:

  • Boolean rules: if/then
  • Simple mathematical functions: add, subtract, multiply and divide
  • Iterations: while loops
  • Lookup value tables
  • Neural network nodes


It is probable that the optimum set of building blocks to create a solution will depend on the type of problem space. So I am looking at whether you can use a GA to develop a set of optimum building blocks. To do this you would generate and breed small solution units and test them against a problem space. Units would be selected on the basis that they have managed to change the state of the environment in any way (either positively or negatively). Thus you end up with a set of candidate building blocks that interact with the environment and can be used to try to breed a solution.

REAL EXAMPLE HERE
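The selection filter itself is simple, as a sketch. Candidate units here are plain functions that mutate a state list in place, and a unit survives if it changes the state, in any direction, from at least one random start:

```python
def select_reactive(units, make_env, n_trials=10):
    """Keep only candidate building blocks that change the environment
    state in any way (positively or negatively); inert units are dropped."""
    reactive = []
    for unit in units:
        for _ in range(n_trials):
            env = make_env()
            before = list(env)
            unit(env)
            if env != before:
                reactive.append(unit)
                break
    return reactive

inert = lambda env: None                              # never touches the state
adder = lambda env: env.__setitem__(0, env[0] + 1)    # always changes slot 0
kept = select_reactive([inert, adder], make_env=lambda: [0.0, 0.0])
```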

Another issue is that solutions tend to converge on a single path through the problem space, and although the individuals are scoring highly, they may have reached a dead end (low or zero health velocity). GAs are incapable of reversing out of a high-scoring solution to find a completely different route.

To address this, I now have a GA architecture with a concept of breeding pools. In each pool a different element of the solution is being bred. The pools feed into each other when certain criteria are met; for instance, when the rate of increase of the solution score reaches a certain level, the best individual is taken to become a member of a new pool. The original pool is reset and searches for an optimum solution again. The new pool waits for a population of high scoring individuals to be bred and then uses these as the starting population for a new process of breeding an optimal solution.
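A sketch of the pool mechanics; the velocity window and the promotion trigger are invented:

```python
class Pool:
    """A breeding pool: tracks its best score per generation so health
    velocity can be watched, and promotes its best individual onward
    when the velocity criterion is met."""

    def __init__(self, population, score):
        self.population = population
        self.score = score
        self.history = []          # best score per generation

    def step(self, breed_generation):
        self.population = breed_generation(self.population)
        self.history.append(max(self.score(i) for i in self.population))

    def velocity(self, window=5):
        if len(self.history) < window:
            return float("inf")    # too early to judge
        return (self.history[-1] - self.history[-window]) / (window - 1)

    def promote_best(self):
        """Best individual leaves to seed a new pool; this pool resets."""
        best = max(self.population, key=self.score)
        self.history = []
        return best

pool = Pool(population=[[1.0], [3.0], [2.0]], score=sum)
for _ in range(6):
    pool.step(lambda pop: pop)     # stand-in "generation" that changes nothing
stalled = pool.velocity() == 0.0   # flat history -> zero health velocity
```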

Further to this, I have been looking at how you can accelerate GAs by breeding populations of small partial-solution solvers that can collaborate to find an overall solution. Questions to answer are:

  • What is the smallest solution solver that can be used?
  • When should one of these individuals act?
  • How do you score an individual, allocating a score that relates to its part in an overall solution?
  • When can you remove an individual from the population?
  • What are the rules for when and where a new individual should be placed in the population?

An experiment I have not yet attempted is building a Lamarck Algorithm. Although Lamarckism has been demonstrated not to happen in the real world, there is no reason it could not be implemented in a virtual environment. It would be interesting to see whether it accelerates finding a solution, or whether there is some underlying mathematical reason it does not occur naturally. An extension to this would be an individual that never dies but continues to evolve in some way without having to create children to improve its solution. There does seem to be some similarity between such a system and neural networks, both simulated and real.