Monday 8 April 2019

GAs as a Solution Solution

No, not a typo. GAs have typically been set to work to generate a solution to a fixed problem.

My idea is to create a solution that generates solutions. For example:

Using the MNIST dataset of handwritten digits, you could breed a network that recognises digits by breeding networks until the connections and weights define a network that can solve the problem, and you could optimise it for the least number of connections and nodes. This, however, is not a general solution to building recognition networks; it is specific to the set of data shown. What would be more useful and interesting would be to define a set of rules that can be applied to build a network based on the data set in real time: a GA solution that can create a network that adapts to the data supplied. Maybe the network could then be applied to shape recognition or general handwriting-to-text conversion.
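As a rough sketch of the fixed-problem starting point (not the rule-based builder), the loop below evolves a flat weight vector for a tiny single-layer classifier and scores accuracy minus a small penalty per connection used. The data here is random stand-in data rather than MNIST, and every name and parameter is an illustrative assumption.

```python
# Sketch: evolve weights for a fixed single-layer classifier, scoring
# accuracy minus a penalty for non-zero connections. Random arrays stand
# in for the image data here; swap in the real data set to use it.
import random

N_PIXELS, N_CLASSES, POP, GENS = 64, 10, 30, 50   # small sizes keep the toy fast

def predict(weights, image):
    # weights is a flat list, one block of N_PIXELS values per class
    scores = []
    for c in range(N_CLASSES):
        block = weights[c * N_PIXELS:(c + 1) * N_PIXELS]
        scores.append(sum(w * p for w, p in zip(block, image)))
    return scores.index(max(scores))

def fitness(weights, images, labels, sparsity_cost=0.001):
    correct = sum(predict(weights, img) == lab for img, lab in zip(images, labels))
    used = sum(1 for w in weights if abs(w) > 1e-6)
    return correct / len(images) - sparsity_cost * used

def mutate(weights, rate=0.05):
    return [w + random.gauss(0, 0.3) if random.random() < rate else w for w in weights]

def breed(a, b):
    cut = random.randrange(len(a))
    return mutate(a[:cut] + b[cut:])

# Stand-in data: replace with real image pixels and labels.
images = [[random.random() for _ in range(N_PIXELS)] for _ in range(200)]
labels = [random.randrange(N_CLASSES) for _ in range(200)]

population = [[random.gauss(0, 1) for _ in range(N_PIXELS * N_CLASSES)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=lambda w: fitness(w, images, labels), reverse=True)
    parents = population[:POP // 4]
    population = parents + [breed(random.choice(parents), random.choice(parents))
                            for _ in range(POP - len(parents))]
best = max(population, key=lambda w: fitness(w, images, labels))
print("best fitness:", fitness(best, images, labels))
```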

One of my early areas of interest was using a GA to create a compression algorithm. I succeeded in creating compressed versions of image files, but each compression was specific to the file and took many generations to breed. A better use of GAs, and similar to that described above, would be to have the GA find a generic compression solution. This would require a data set consisting of multiple files of different types to compress, and two parts to the process: a compression cycle, where the individual can see the source file and develop a compressed data set, and a decompression cycle, where the source file is not available and must be reconstructed.
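The two-cycle scoring could look something like the sketch below, assuming a candidate individual exposes hypothetical compress and decompress callables; the score rewards both size reduction and faithful reconstruction across a mixed corpus.

```python
# Sketch: score a candidate compressor over a corpus of mixed files.
# Phase 1 (source visible): produce a compressed blob per file.
# Phase 2 (source hidden): reconstruct from the blob alone, then compare.
def score_compressor(individual, corpus, size_weight=0.5, accuracy_weight=0.5):
    total = 0.0
    for original in corpus:                       # each item is a bytes object
        blob = individual.compress(original)      # compression cycle
        rebuilt = individual.decompress(blob)     # decompression cycle, no source
        ratio = 1.0 - min(len(blob) / max(len(original), 1), 1.0)
        matches = sum(a == b for a, b in zip(rebuilt, original))
        accuracy = matches / max(len(original), 1)
        if len(rebuilt) != len(original):         # penalise wrong output length
            accuracy *= 0.5
        total += size_weight * ratio + accuracy_weight * accuracy
    return total / len(corpus)

class IdentityCodec:                              # trivial baseline: no compression
    def compress(self, data): return data
    def decompress(self, blob): return blob

print(score_compressor(IdentityCodec(), [b"hello world", b"abc" * 100]))
```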

Digital Genes

In living things, genes generate the structures of life using a set of basic building blocks that can be combined and tweaked in multiple ways to create the structures required to allow the creatures to survive in their environment.

In GA algorithms, the digital gene sequence needs to be translated into a function that performs successfully in the problem space. The question is, what are the basic building blocks that should be used and how should they be combined and tweaked ?

In living things, the protein is the smallest unit that is combined and tweaked. What is the GA version of this ?

My current thought is that these basic units could themselves be discovered using GAs. By setting the problem environment to a random state, various combinations of micro units could be used to create functions and scored based on the amount of change they cause to the environment state, either positive or negative, thereby creating a set of building blocks that are not inert (are reactive) when used to try and solve a problem.

The micro units would be functions like boolean conditions, boolean functions, arithmetic operations, and the storage and retrieval of values.
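One possible representation of these micro units, purely as an assumed sketch: each unit is a tiny operation over a shared value store and an environment vector, and a candidate building block is just a short random sequence of them.

```python
# Sketch: micro units as tiny operations over a register store and an
# environment list. A candidate building block is a short random sequence.
import random

def unit_compare(store, env):  store["flag"] = store.get("a", 0) > store.get("b", 0)
def unit_add(store, env):      store["a"] = store.get("a", 0) + store.get("b", 0)
def unit_load(store, env):     store["b"] = env[0]
def unit_store(store, env):
    if store.get("flag"):      env[0] = store.get("a", 0)   # conditional write-back

MICRO_UNITS = [unit_compare, unit_add, unit_load, unit_store]

def random_block(length=6):
    return [random.choice(MICRO_UNITS) for _ in range(length)]

def run_block(block, env):
    store = {}
    for unit in block:
        unit(store, env)
    return env

print(run_block(random_block(), [0.5, 0.2]))
```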

If this proves unsuccessful, I may try using functions as the GA alternative to proteins.

Wednesday 27 February 2019

Genetic Algorithms as a route to Artificial Intelligence

"Genetic Algorithms as a route to Artificial Intelligence" That was the title of my undergraduate dissertation at the end of my Psychology degree over 25 years ago. It was not my greatest piece of work and did its part in getting me the degree I deserved :-)

Since then, I've occasionally dipped back into the area of GAs (Genetic Algorithms) to try to improve them, and occasionally to solve complex problems. Once you have created all the elements, they almost magically find solutions to complex problems.

Terms:
Environment - The problem space
Individual - A possible solution to be tried in the problem space
vDNA (virtual DNA) - A string of data that defines an individual
Gene - A unit of the data that makes up the vDNA
Transcription - The process where vDNA is used to make an individual
Mutation - A random change made to the vDNA at transcription
Breed - Combine elements from two individuals to create a new child individual
Solution Score - A numerical measure of the success of a solution (its health)
Health Velocity - The rate at which the population's score is increasing
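To make the terms concrete, here is a minimal GA skeleton phrased in them; the problem-specific parts (transcription and the environment's scoring) are left as assumptions to be filled in.

```python
# Sketch: a minimal GA loop phrased in the terms above. The vDNA is a list
# of genes; transcription turns vDNA into an individual; the environment
# scores individuals; breeding crosses two parents with mutation.
import random

def mutate(vdna, rate=0.02):
    # Mutation: a random change made to the vDNA at transcription
    return [random.random() if random.random() < rate else g for g in vdna]

def breed(mother, father):
    # Breed: combine elements from two individuals' vDNA
    cut = random.randrange(len(mother))
    return mutate(mother[:cut] + father[cut:])

def evolve(transcribe, score, vdna_length=32, pop_size=50, generations=100):
    population = [[random.random() for _ in range(vdna_length)] for _ in range(pop_size)]
    history = []                                    # best solution score over time
    for _ in range(generations):
        ranked = sorted(population, key=lambda v: score(transcribe(v)), reverse=True)
        history.append(score(transcribe(ranked[0])))
        parents = ranked[:pop_size // 5]
        population = parents + [breed(random.choice(parents), random.choice(parents))
                                for _ in range(pop_size - len(parents))]
    velocity = history[-1] - history[-2] if len(history) > 1 else 0.0   # health velocity
    return ranked[0], velocity

# Toy usage: the "environment" rewards genes close to 0.5.
best, velocity = evolve(transcribe=lambda v: v,
                        score=lambda ind: -sum(abs(g - 0.5) for g in ind))
```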

GAs have several drawbacks that I have tried to address:

- First, they do not work well on solving linear problems where one step must follow another to reach a better solution. My current attempt to solve this is to start the individual at random points in the problem space, to test individuals with the environment in a random starting state, or to test them at random points in a solution where this is possible.
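A sketch of the random-starting-state idea, assuming the environment can be seeded: each individual is scored as the average over several random starting states rather than from a single fixed start. The make_env and run callables here are hypothetical placeholders.

```python
# Sketch: score an individual from several random starting points in the
# problem space, so linear "one step after another" problems are sampled
# at many stages rather than always from the beginning.
import random

def score_from_random_starts(individual, make_env, run, n_starts=5):
    scores = []
    for _ in range(n_starts):
        env = make_env(seed=random.random())   # environment in a random starting state
        scores.append(run(individual, env))    # run returns the solution score
    return sum(scores) / n_starts

# Toy usage: environment is a position counter; the individual is a step size.
result = score_from_random_starts(
    individual=3,
    make_env=lambda seed: {"position": int(seed * 100)},
    run=lambda step, env: -abs(100 - (env["position"] + step)))
print(result)
```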

REAL EXAMPLES HERE

- Secondly, mutations can happen anywhere in the sequence of vDNA that describes a possible solution. I am attempting to tackle this by adding an extra variable to each gene in the vDNA that defines its mutability. The mutability value itself can mutate, and I hope that this will enable individuals to evolve with essential elements of their solution protected from mutation.
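A sketch of per-gene mutability under the assumption that each gene carries a (value, mutability) pair: mutation probability is scaled by the gene's own mutability, and the mutability itself can drift, so protected regions can emerge.

```python
# Sketch: each gene is (value, mutability). Mutation probability is scaled
# by the gene's own mutability, and the mutability value can itself mutate,
# so essential genes can evolve towards being protected from change.
import random

def mutate_vdna(vdna, base_rate=0.1, mutability_drift=0.02):
    child = []
    for value, mutability in vdna:
        if random.random() < base_rate * mutability:
            value = random.random()                        # payload mutation
        if random.random() < mutability_drift:
            mutability = min(1.0, max(0.0, mutability + random.gauss(0, 0.1)))
        child.append((value, mutability))
    return child

vdna = [(random.random(), 1.0) for _ in range(16)]          # start fully mutable
print(mutate_vdna(vdna)[:3])
```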

REAL EXAMPLES HERE

I've also investigated whether you can use a GA to breed a better GA. GAs have a number of variables and processes that it should be possible to optimise, eg. mutation rate, breeding selection and crossover rules, along with variables that could be used to modify these rules, such as the rate of increase of score (population health velocity).
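A GA-breeding-a-GA run might look like the sketch below, where the outer genome is just a (mutation rate, elite fraction) pair and each candidate is scored by the health velocity of an inner, stand-in GA rather than its final score. Everything here is an illustrative assumption.

```python
# Sketch: an outer GA breeds (mutation_rate, elite_fraction) pairs; each is
# scored by running an inner GA and measuring how fast the population's
# best score rises (health velocity), not just the final score.
import random

def run_inner_ga(mutation_rate, elite_fraction, generations=40):
    # Stand-in inner GA: maximise the sum of a bit string.
    pop = [[random.randint(0, 1) for _ in range(40)] for _ in range(30)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=sum, reverse=True)
        best_history.append(sum(pop[0]))
        elite = pop[:max(2, int(len(pop) * elite_fraction))]
        def child():
            a, b = random.sample(elite, 2)
            cut = random.randrange(len(a))
            c = a[:cut] + b[cut:]
            return [1 - g if random.random() < mutation_rate else g for g in c]
        pop = elite + [child() for _ in range(30 - len(elite))]
    return (best_history[-1] - best_history[0]) / generations   # health velocity

def score_settings(settings):
    rate, elite = settings
    return run_inner_ga(rate, elite)

outer = [(random.uniform(0.001, 0.3), random.uniform(0.05, 0.5)) for _ in range(10)]
for _ in range(5):
    outer.sort(key=score_settings, reverse=True)
    keep = outer[:4]
    outer = keep + [(random.choice(keep)[0] * random.uniform(0.5, 1.5),
                     min(0.5, random.choice(keep)[1] * random.uniform(0.5, 1.5)))
                    for _ in range(6)]
print("best settings:", outer[0])
```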

RESULTS HERE

- Thirdly, the building blocks of the individual constrain the solutions possible. Living organisms use the versatile protein to make all manner of complex solutions to problems; a GA, however, does not have this constraint, so what should the elements of a virtual protein in a GA be ?
Some suggestions:

  • Boolean rules: if/then
  • Simple mathematical functions: add, subtract, multiply and divide
  • Iterations: while loops
  • Lookup value tables
  • Neural network nodes


It is probable that the optimum set of building blocks to create a solution will depend on the type of problem space. So I am looking at whether you can use a GA to develop a set of optimum building blocks. To do this you would generate and breed small solution units and test them against a problem space. Units would be selected on the basis that they have managed to change the state of the environment in any way (either positively or negatively). Thus you end up with a set of candidate building blocks that interact with the environment and can be used to try to breed a solution.
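A sketch of that selection step, assuming an environment whose state is a simple list that candidate units mutate in place: candidates are kept only if running them changes the state at all, in either direction.

```python
# Sketch: keep only candidate building blocks that are reactive, i.e. that
# change the environment state in some way when run against a random state.
import random

def make_random_state(size=8):
    return [random.random() for _ in range(size)]

def is_reactive(unit, trials=10):
    for _ in range(trials):
        state = make_random_state()
        before = list(state)
        unit(state)                      # the unit mutates the state in place
        if state != before:
            return True                  # any change, positive or negative, counts
    return False

# Toy candidates: some inert, some reactive.
candidates = [lambda s: None,                         # inert
              lambda s: s.__setitem__(0, s[0] + 1),   # reactive
              lambda s: s.sort()]                     # reactive (usually)
building_blocks = [u for u in candidates if is_reactive(u)]
print(len(building_blocks), "reactive blocks kept")
```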

REAL EXAMPLE HERE

Another issue is that solutions tend to converge on a single path through the problem space and although the individuals are scoring highly they may have reached a dead end (low or zero health velocity). GAs are incapable of reversing from a high scoring solution to find a completely different route.

To address this, I now have a GA architecture with a concept of breeding pools. In each pool a different element of the solution is being bred. The pools feed into each other when certain criteria are met; for instance, when the rate of increase of solution score reaches a certain level, the best individual is taken to become a member of a new pool. The original pool is reset and searches for an optimum solution again. The new pool waits for a population of high scoring individuals to be bred and then uses these as the starting population for a new process of breeding an optimal solution.
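A definition-only sketch of the pool mechanism, reading the promotion criterion as a plateau in health velocity (one possible interpretation): when a pool stalls, its best individual is promoted to the next pool and the pool is reseeded.

```python
# Sketch: breeding pools. When a pool plateaus (health velocity below a
# threshold), its best individual is promoted to the next pool and the
# original pool is reseeded with a fresh population.
import random

class Pool:
    def __init__(self, seed_population):
        self.population = list(seed_population)
        self.best_history = []

    def step(self, score, breed):
        self.population.sort(key=score, reverse=True)
        self.best_history.append(score(self.population[0]))
        elite = self.population[:max(2, len(self.population) // 5)]
        self.population = elite + [breed(random.choice(elite), random.choice(elite))
                                   for _ in range(len(self.population) - len(elite))]

    def health_velocity(self, window=5):
        if len(self.best_history) < window:
            return float("inf")
        return self.best_history[-1] - self.best_history[-window]

def run_pools(pool_a, pool_b, score, breed, fresh, threshold=1e-3, steps=200):
    # fresh() returns a brand new random population for the reset pool.
    for _ in range(steps):
        pool_a.step(score, breed)
        if pool_a.health_velocity() < threshold:
            pool_b.population.append(pool_a.population[0])   # promote the best
            pool_a.population = fresh()                      # reset the original pool
            pool_a.best_history.clear()
    return pool_b
```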

Further to this, I have been looking at how you can accelerate GAs by breeding populations of small partial solution solvers that can collaborate to find an overall solution. Questions to answer are:

  • What is the smallest solution solver that can be used ?
  • When should one of these individuals act ?
  • How do you score an individual, allocating a score that relates to its part in an overall solution ?
  • When can you remove an individual from the population ?
  • What are the rules for when and where a new individual should be placed in the population ?

An experiment I have not yet attempted is building a Lamarck Algorithm. Although Lamarckism has been demonstrated not to happen in the real world, there is no reason it could not be implemented in a virtual environment. It would be interesting to see if it accelerates finding a solution, or if there is some underlying mathematical reason it does not occur naturally. An extension to this would be an individual that never dies but continues to evolve in some way, without having to create children to improve its solution. There does seem to be some similarity between such a system and neural networks, both simulated and in the real world.
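One way a Lamarck-style step might be tried, as a sketch: each individual runs a small local search during its "lifetime", and the acquired improvements are written back into the vDNA it passes on. The local_improve step is a hypothetical stand-in for whatever lifetime learning the problem allows.

```python
# Sketch: a Lamarckian step. The individual improves itself during its
# lifetime (a simple local search here), and the improvement is written
# back into the vDNA it passes on, unlike in a standard GA.
import random

def local_improve(vdna, score, attempts=20):
    best = list(vdna)
    for _ in range(attempts):
        trial = list(best)
        i = random.randrange(len(trial))
        trial[i] += random.gauss(0, 0.1)          # small acquired adjustment
        if score(trial) > score(best):
            best = trial
    return best                                    # acquired traits are kept

def lamarck_generation(population, score, breed):
    improved = [local_improve(v, score) for v in population]   # write-back step
    improved.sort(key=score, reverse=True)
    elite = improved[:max(2, len(improved) // 5)]
    return elite + [breed(random.choice(elite), random.choice(elite))
                    for _ in range(len(improved) - len(elite))]
```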




Tuesday 24 July 2018

CADA - Civil Autonomous Driving Authority

With the imminent introduction of autonomous driving vehicles to our roads, accidents are bound to happen, some due to AI error, some caused by human error and some unavoidable due to other environmental events like deer or children running into the road etc.

It cannot be left to manufacturers of the vehicles, the judiciary or individual governments to manage the investigations into such events.

When aircraft crash, the CAA investigates. We should have a Civil Autonomous Driving Authority (CADA).

AV = Autonomous Vehicle
Green Box = A recording of the environment in which the vehicle is driving. This would include visual recordings of the surroundings, car data such as speed and position, and any further environment data available, such as time, temperature etc.
White Box = AV system enclosed to protect proprietary code but with common input and output so that it can be plugged into a vehicle simulator.

CADA would make available a set of test scenarios that AVs must pass to be licensed.
All AVs to have green boxes to record the last journey made.

When an accident occurs, a set of steps would have to be followed:

  • A green box is supplied to CADA
  • A simulation of the accident is created using green box data
  • A white box is used to drive a vehicle in the simulator
  • Simulations are used to update scenarios for future AVs and for updates to current AVs
  • Scenarios are made open source for use by AV developers
  • AV licensing fees are used to pay for CADA


Problems
Who pays - The vehicle manufacturers would have to pay to licence a new AV
How to handle new sensor technology - The green box specification would have to be able to take data from new sensor types as they are introduced
How to enable competition by giving low market entry cost - Access to all scenarios, the simulator, white box and green box specifications and data would be made available for free.

Wednesday 26 July 2017

Pondering P6

In the list of points in my previous post was point P6

p6. Current models start with random sets of connections to neurons. This seems odd. Wouldn't a more efficient starting point be to add neurons with connections matching patterns as they occur during network development?

I've played around with a couple of small models that attempt to place nodes (neuron models) to recognise patterns that occur at an occurrence rate above that expected in a random input signal. It has shown some promise and I shall probably take it further given time.

The implementation so far uses an array of nodes that keep track of the occurrence rates of sets of inputs. These are not weighted-connection nodes like the normal neuron-modelled nodes in a neural net; they are generated by patterns occurring within a domain of inputs, and they produce an output when that pattern occurs a set level above chance. The output from these nodes can then be used by other nodes to find combinations of patterns that occur above chance, and to place neuron nodes and connections in places where pattern recognition should occur.
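A sketch of such a node, assuming binary inputs over a small domain: it counts how often its particular input pattern occurs and only produces an output once the observed rate is well above the chance rate for a random signal.

```python
# Sketch: a pattern-tracking node over a small domain of binary inputs.
# It counts occurrences of one specific pattern and starts producing an
# output once the observed rate exceeds the chance rate by a set factor.
import random

class PatternNode:
    def __init__(self, pattern, threshold_factor=3.0):
        self.pattern = tuple(pattern)              # e.g. (1, 0, 1)
        self.chance_rate = 0.5 ** len(pattern)     # expected rate for random bits
        self.threshold_factor = threshold_factor
        self.seen = 0
        self.hits = 0

    def observe(self, inputs):
        self.seen += 1
        if tuple(inputs) == self.pattern:
            self.hits += 1
        rate = self.hits / self.seen
        # Fire only when the pattern is present and occurs well above chance.
        return rate > self.chance_rate * self.threshold_factor and tuple(inputs) == self.pattern

# Toy usage: a signal where (1, 0, 1) appears far more often than chance.
node = PatternNode((1, 0, 1))
signal = [(1, 0, 1) if random.random() < 0.5 else
          tuple(random.randint(0, 1) for _ in range(3)) for _ in range(200)]
fires = sum(node.observe(s) for s in signal)
print("node fired", fires, "times")
```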

Whereas nodes in a neural net have some basis in neuroscience, I could not see a basis for this new type of node until I did some further reading. I was taught that neurons were the data processors in the brain, held in place by glial cells that form the scaffolding for the cortex. However, these cells also play a role in directing the placement and connection of new neurons and connections. Maybe these cells could be performing the process that my pattern-tracking nodes are doing.

Of particular interest are Astrocytes and Radial glia.

Monday 10 July 2017

Please accept my confession. Neural Nets and me.

Please accept my confession and absolve me of my sins, it has been four years since my last post.

My current area of interest is Neural Nets and I've been playing...

My first adventures in AI started with my first computer, the ZX Spectrum. I developed simple programs a bit like Amazon's Alexa that were simple question and answer applications with words extracted from sentences and substituted into set replies. I dreamt of a day when I could hold a conversation with my computer in natural language. It appears it is quite a hard nut to crack :-)

I've been interested and played with Neural Nets since I was 17, which is quite a long time ago. My first attempt was building an electronic analogue circuit as my A level electronics project. It was too ambitious for both my skills and my budget, but it did look cool and complicated. The examining board were not so easily impressed. I should have stuck to a motor controller.

I then went to University to study Physics and Psychology but soon realised that Physics was not for me and that Psychology might be the place to find some answers to the problem of how to build good AI. It was, but I was also expected to learn about a lot of things that I had no interest in, so it did not go as well as it could have. My final thesis was a poor attempt at showing how genetic algorithms could be used to evolve AI systems (I did learn from this that University cleaners will unplug your computer if the socket is needed for the vacuum cleaner, even in a computing lab, and that a Z88 computer does not have the power to run a decent genetic algorithm in a useful amount of time). But enough of blaming my tools; I hope that, now my skills and knowledge have grown, I can show that GAs can be used to create useful Neural Nets.

After a few different fill-in jobs I ended up working as a software developer, and that is where I have stayed for 25 years. I have had plenty of time to hone my skills and increase my understanding of how the brain might work. I've followed developments that have changed the playing field considerably, from MRI scanners that remove the reliance on head injuries to investigate brain processing, to the cloud computing revolution putting supercomputing power within the grasp of the amateur.

Neural Networks are fashionable and the increased speed and size of computers have made large networks feasible even for tinkerers like myself, so I have decided to return to the area that first interested me all those years ago.

I've always been most interested in AGI (Artificial General Intelligence) and particularly learning without an external teacher. The current large scale networks relate little to what we know about how the brain learns. They require thousands of items of training data tagged with the information within them that we wish the network to recognise.

Below are the current areas I am working on, broken into three sections: the main problems that I feel need to be overcome, the elements of human cognition that have not yet been addressed, and the experiments I am running to fix the problems and address the issues.

Problems that need to be overcome

p1. We, as humans, do not need thousands of instances of a pattern to group them. Often just a couple are enough. Evidence:

p2. We do not need to be told that patterns belong to a particular group to place them in groups. Evidence:

p3. We can recognise visual patterns as being the same no matter where in the visual field they appear or which orientation they are presented in. Evidence:

p4. There is no overseer that can monitor, place and alter neurons individually. Neurons and their connections are the only micro control mechanism. Chemical transmitters can send macro signals to and from other areas of the body (including the brain). Neurons and those they are connected to are on their own as far as micro decisions go.

p5. Back propagation, used to train networks by altering connection strengths, requires a teaching signal to provide a score based on neuron responses. It is similar to behaviourism in psychology: it only takes you so far, and at some point you have to supply a trainer system.

p6. Current models start with random sets of connections to neurons. This seems odd. Wouldn't a more efficient starting point be to add neurons with connections matching patterns as they occur during network development?

Elements that have not been addressed

e1. We, and therefore our brains, have evolved...

e2. Humans go through stages of learning that have evolved. Evidence:

  • Visual system learns to recognise edges after birth and before a set age.
  • The kindergarten effect
e3. We apply our current internal model onto input from the external real world. Evidence:
  • The Swiss Cheese illusion
  • The Spotty Dog illusion
  • The speech without consonants illusion

e4. We have layers of neurons in the cortex. More than one, and fewer than those in a deep neural net.

e5. Areas of the brain seem to have specialised functions. Evidence:
  • Face recognition

e6. Some elements of behaviour have evolved, while others are learnt, and yet others we have probably evolved to learn. How do these elements interact to create a network?

 My Current Experiments

Ex1. Learning pattern recognition without access to the training labels.
See p1,2,3,5,6

In this experiment, I am building a NN system that learns to differentiate handwritten digits from 0 to 9 from a data set called MNist but without the use of the labels provided.
The network builds itself to differentiate the different elements of the patterns in the set.
My theory is that given the correct set of simple rules, a network can be built that will have distinct nodes (neurons) that fire when, and only when, a specific digit is displayed but can generalise to all examples of that digit.
Whereas current networks are given the training labels (eg. given an image of the digit 8 and told it is a digit 8), this network will only be given the digit 8 and not given the label. Once a level of training (or, in this case, experiencing) has elapsed, the nodes are searched for ones that only fire for a specific digit, and these are connected to output nodes. This differs from current networks in that we do not pre-define where the recogniser for a specific pattern (digit) should occur, but let it develop and then, at a later time, associate the recogniser with an output using a function called MagiMax. Another difference is that we do not start with an initial layer of nodes with large numbers of randomly weighted connections, but rather build the nodes and layers based on the patterns submitted to the network.
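The post-hoc association step might look like the sketch below. The internals of MagiMax are not described here, so this is only an assumed stand-in that picks, for each digit, the node whose firing is most specific to that digit, using the labels only at this final wiring stage.

```python
# Sketch: after unsupervised building, search the nodes for ones that fire
# for one digit only and wire those to output nodes. This is a stand-in for
# the association step; the real MagiMax function is not reproduced here.
from collections import defaultdict

def associate_outputs(firing_log, n_classes=10):
    # firing_log: list of (node_id, digit_actually_shown) pairs collected
    # while replaying the training images through the finished network.
    counts = defaultdict(lambda: defaultdict(int))
    for node_id, digit in firing_log:
        counts[node_id][digit] += 1
    best_node_for = {}
    for digit in range(n_classes):
        def specificity(node_id):
            total = sum(counts[node_id].values())
            return counts[node_id][digit] / total if total else 0.0
        candidates = [n for n in counts if counts[n][digit] > 0]
        if candidates:
            best_node_for[digit] = max(candidates, key=specificity)
    return best_node_for       # digit -> node wired to that output

# Toy usage: node 0 fires mostly for 3s, node 1 only for 7s.
print(associate_outputs([(0, 3), (0, 3), (1, 7), (0, 5)]))
```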

The network built is of multiple layers, from an input layer (layer 0, which is pre-built to match the 28x28 matrix used in the MNist data set) to an output layer (layer 5). Nodes in a layer only connect to a limited domain of nodes in the layer above. The training data set is applied once as each layer is built, so layer 1 is built based on a full training set, then layer 2 is built by processing a full data set, etc.

Initial simple applications of this process gave a number of good recognisers for some digits, but others failed to appear. The networks ran out of space for new nodes.

Ex2. Evolving solutions to problems using neural net populations
See e1,6

This is a repeat of previous experiments to evolve a network of nodes, connections and weights to recognise MNist data set digits. This is really just to check that the implementations of the GA and NN work.

Evolution stalled on an above chance but poor recognition score.

Ex3. Evolving builders of neural nets to solve problems
See e1,6

Currently, the work on this problem is concentrated on building a set of elements that can be randomly arranged into network builders, which can then be scored on their ability to build networks that solve problems.

The elements need to be resilient to crossover and mutation, so that breeding new genomes gives a viable network builder in most instances.

Ex4. Building matrix transformations based on experience to handle spatial translation, 2d rotation and 3d rotation of patterns for recognition.
See p3

Work on this is still in the planning phase. The idea is that by presenting moving patterns to the input of a visual recognition network, the network can arrange itself into a configuration where new patterns can be recognised independently of where they appear in the field of vision and in any orientation.
It is possible that the result of Ex3 may be used to try and evolve a solution to this problem.

Ex5. Path finder nodes

An investigation into whether a network of nodes can be used to keep track of patterns that have happened frequently with a view to then placing nodes and connections where patterns occur significantly often.








Thursday 4 April 2013

Crazy Idea of the day

Having been reading about Google's Omega project on Wired, I was thinking about how I could build a rival supercomputer using my meagre resources. How could I get my hands on thousands of processors to perform large parallel computations?
How about using browsers as nodes on a web-wide computer.

  1. A visitor to my server opens a page and confirms that they are willing to take part in the process.
  2. The web page polls the server for tasks.
  3. When a set of tasks are available, the tasks are shared by the server among polling browsers.
  4. The tasks are sent as small JavaScript programs to run.
  5. The results are sent back to the server for collation.
The web page would display details of the tasks being run.
The service could be offered as a peer to peer service, with a web interface offering the ability to upload jobs to be run.
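A sketch of the server side of this scheme, assuming a Flask app with hypothetical /task and /result endpoints; the browser page would poll /task, run the returned JavaScript snippet, and POST the answer back to /result.

```python
# Sketch: minimal task server for browser workers. Pages poll /task for a
# small JavaScript job, run it, and POST the result back to /result.
# Endpoint names and task format are assumptions, not a finished protocol.
from queue import Queue, Empty
from flask import Flask, jsonify, request

app = Flask(__name__)
tasks = Queue()
results = []

# Seed some example tasks: each is a JS snippet plus an id.
for i in range(3):
    tasks.put({"id": i, "js": f"return {i} * {i};"})

@app.route("/task")
def get_task():
    try:
        return jsonify(tasks.get_nowait())
    except Empty:
        return jsonify({"id": None})        # nothing to do right now

@app.route("/result", methods=["POST"])
def post_result():
    results.append(request.get_json())      # collate answers from browsers
    return jsonify({"ok": True})

if __name__ == "__main__":
    app.run(port=8000)
```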