Monday, 10 July 2017

Please accept my confession. Neural Nets and me.

Please accept my confession and absolve me of my sins: it has been four years since my last post.

My current area of interest is Neural Nets and I've been playing...

My first adventures in AI started with my first computer, the ZX Spectrum. I wrote simple question-and-answer programs, a bit like Amazon's Alexa, that extracted words from sentences and substituted them into set replies. I dreamt of a day when I could hold a conversation with my computer in natural language. It appears it is quite a hard nut to crack :-)

I've been interested in, and played with, Neural Nets since I was 17, which is quite a long time ago. My first attempt was building an electronic analogue circuit as my A level electronics project. It was too ambitious for both my skills and my budget, but it did look cool and complicated. The examining board were not so easily impressed. I should have stuck to a motor controller.

I then went to University to study Physics and Psychology but soon realised that Physics was not for me and that Psychology might be the place to find some answers to the problem of how to build good AI. It was, but I was also expected to learn about a lot of things that I had no interest in, so it did not go as well as it could have. My final thesis was a poor attempt at showing how genetic algorithms could be used to evolve AI systems (I did learn from this that University cleaners will unplug your computer if the socket is needed for the vacuum cleaner, even in a computing lab, and that a Z88 computer does not have the power to run a decent genetic algorithm in a useful amount of time). But enough of blaming my tools. I hope that my skills and knowledge have now grown, and that I can show that GAs can be used to create useful Neural Nets.

After a few different fill-in jobs I ended up working as a software developer, and that is where I have stayed for 25 years. I have had plenty of time to hone my skills and increase my understanding of how the brain might work. I've followed developments that have changed the playing field considerably, from MRI scanners that remove the reliance on head injuries to investigate brain processing, to the cloud computing revolution putting supercomputing power within the grasp of the amateur.

Neural Networks are fashionable and the increased speed and size of computers have made large networks feasible even for tinkerers like myself, so I have decided to return to the area that first interested me all those years ago.

I've always been most interested in AGI (Artificial General Intelligence) and particularly learning without an external teacher. The current large scale networks relate little to what we know about how the brain learns. They require thousands of items of training data tagged with the information within them that we wish the network to recognise.

Below are the current areas I am working on, broken into three sections: the main problems that I feel need to be overcome, the elements of human cognition that have not yet been addressed, and the experiments I am running to fix the problems and address the issues.

Problems that need to be overcome

p1. We, as humans, do not need thousands of instances of a pattern to group them. Often just a couple are enough. Evidence:

p2. We do not need to be told that patterns belong to a particular group to place them in groups. Evidence:

p3. We can recognise visual patterns as being the same no matter where in the visual field they appear or which orientation they are presented in. Evidence:

p4. There is no overseer that can monitor, place and alter neurons individually. Neurons and their connections are the only micro control mechanism. Chemical transmitters can send macro signals to and from other areas of the body (including the brain). Neurons and those they are connected to are on their own as far as micro decisions go.

p5. Back propagation, used to train networks by altering connection strengths, requires a teaching signal to provide a score based on neuron responses. It is similar to behaviourism in psychology. It only takes you so far; at some point you have to supply a trainer system.

p6. Current models start with random sets of connections to neurons. This seems odd. Wouldn't a more efficient starting point be to add neurons with connections matching patterns as they occur during network development?
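
To make p6 concrete, here is a minimal sketch of growing a network from experience rather than from random connections. It is only an illustration under stated assumptions: the binary patterns, the `similarity` measure, the 0.75 threshold and the rule "copy the pattern into a new node" are all hypothetical, not a description of any existing model.

```python
def similarity(a, b):
    """Fraction of positions where two equal-length binary patterns agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def grow_network(patterns, threshold=0.75):
    """Add a node only when no existing node matches the incoming pattern.

    Each node is stored as a copy of the pattern that created it, so its
    'connections' match an input that actually occurred (assumed rule).
    """
    nodes = []
    for pattern in patterns:
        if not any(similarity(node, pattern) >= threshold for node in nodes):
            nodes.append(list(pattern))
    return nodes


stream = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],  # 4 of 5 positions agree with the first pattern
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],  # exact repeat of the previous pattern
]
nodes = grow_network(stream)
print(len(nodes))
```

The network ends up with one node per distinct kind of pattern seen, with no random starting weights and no labels involved.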

Elements that have not been addressed

e1. We, and therefore our brains, have evolved...

e2. Humans go through stages of learning that have evolved. Evidence:

  • Visual system learns to recognise edges after birth and before a set age.
  • The kindergarten effect

e3. We apply our current internal model onto input from the external real world. Evidence:
  • The Swiss Cheese illusion
  • The Spotty Dog illusion
  • The speech without consonants illusion

e4. We have layers of neurons in the cortex: more than one, but fewer than in a deep neural net.

e5. Areas of the brain seem to have specialised functions. Evidence:
  • Face recognition

e6. Some elements of behaviour have evolved, while others are learnt, and yet others we have probably evolved to learn. How do these elements interact to create a network?

My Current Experiments

Ex1. Learning pattern recognition without access to the training labels.
See p1,2,3,5,6

In this experiment, I am building a neural network system that learns to differentiate the handwritten digits 0 to 9 from the MNIST data set, but without using the labels provided.
The network builds itself to differentiate the different elements of the patterns in the set.
My theory is that given the correct set of simple rules, a network can be built that will have distinct nodes (neurons) that fire when, and only when, a specific digit is displayed but can generalise to all examples of that digit.
Whereas current networks are given the training labels (e.g. given an image of the digit 8 and told it is a digit 8), this network will only be given the image of the digit 8 and not the label. Once a period of training (or in this case experiencing) has elapsed, the nodes are searched for ones that only fire for a specific digit, and these are connected to output nodes. This differs from current networks in that we do not pre-define where the recogniser for a specific pattern (digit) should occur, but let it develop and then at a later time associate the recogniser with an output using a function called MagiMax. Another difference is that we do not start with an initial layer of nodes with large numbers of connections with random weights, but rather build the nodes and layers based on patterns submitted to the network.
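
The "search the nodes afterwards" step can be sketched roughly as follows. This is an assumption-laden illustration, not the MagiMax function itself (whose workings are not described here): the function name `find_recognisers`, the node ids and the firing records are all hypothetical. It only checks the "only when" half of the rule, i.e. that a node never fired for any other digit.

```python
from collections import defaultdict


def find_recognisers(firings, labels):
    """Map each digit to the nodes that fired for that digit and no other.

    firings: one set of fired node ids per presented image.
    labels:  the digit for each image, consulted only after training.
    """
    digits_seen = defaultdict(set)  # node id -> digits it fired for
    for fired, digit in zip(firings, labels):
        for node in fired:
            digits_seen[node].add(digit)

    recognisers = defaultdict(list)
    for node in sorted(digits_seen):
        digits = digits_seen[node]
        if len(digits) == 1:  # selective: fired for exactly one digit
            (digit,) = digits
            recognisers[digit].append(node)
    return dict(recognisers)


# Hypothetical firing records for four images: two 8s, then two 3s.
firings = [{"n1", "n2"}, {"n1", "n3"}, {"n2", "n4"}, {"n4"}]
labels = [8, 8, 3, 3]
print(find_recognisers(firings, labels))
```

Node n2 fires for both digits and is discarded. A stricter version would also require a recogniser to fire for most examples of its digit, which n3 here does not.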

The network has multiple layers, from an input layer (layer 0, pre-built to match the 28x28 matrix used in the MNIST data set) to an output layer (layer 5). Nodes in a layer only connect to a limited domain of nodes in the layer above. The training data set is applied once as each layer is built, so layer 1 is built based on a full training set, then layer 2 is built by processing a full data set, etc.
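
One way to picture a "limited domain" on the 28x28 input grid is as a small square window around each node's position. The sketch below assumes exactly that. The square shape, the `radius` parameter and the clipping at the grid edges are illustrative guesses, not the rule the network actually uses.

```python
def local_domain(row, col, radius=2, size=28):
    """Return the (row, col) grid positions a node centred here connects to.

    The square window is clipped at the edges of the size x size grid.
    """
    rows = range(max(0, row - radius), min(size, row + radius + 1))
    cols = range(max(0, col - radius), min(size, col + radius + 1))
    return [(r, c) for r in rows for c in cols]


interior = local_domain(14, 14)  # full 5 x 5 window
corner = local_domain(0, 0)      # window clipped to 3 x 3
print(len(interior), len(corner))
```

Keeping the domain small means each node has tens of connections rather than the 784 a fully connected node would need.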

Initial simple applications of this process gave a number of good recognisers for some digits, but others failed to appear. The networks ran out of space for new nodes.

Ex2. Evolving solutions to problems using neural net populations
See e1,6

This is a repeat of previous experiments to evolve a network of nodes, connections and weights to recognise MNIST data set digits. This is really just to check that the implementations of the GA and NN work.

Evolution stalled on an above chance but poor recognition score.
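
For readers unfamiliar with the approach, the general shape of such a GA can be sketched on a toy problem. Everything below is an assumption made for illustration: it evolves the three weights of a single threshold node on logical OR rather than an MNIST network, and the truncation selection, one-point crossover and Gaussian mutation are generic textbook choices, not the ones used in this experiment.

```python
import random


def fitness(weights, samples):
    """Count the samples a single threshold node classifies correctly."""
    correct = 0
    for inputs, target in samples:
        total = sum(w * x for w, x in zip(weights, inputs))
        correct += int((total > 0) == target)
    return correct


def evolve(samples, n_weights, pop_size=20, generations=30, seed=0):
    """Evolve weight vectors; return the best one and the best score per generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, samples), reverse=True)
        history.append(fitness(population[0], samples))
        parents = population[:pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)  # one-point crossover
            child = mum[:cut] + dad[cut:]
            child[rng.randrange(n_weights)] += rng.gauss(0, 0.3)  # mutation
            children.append(child)
        population = parents + children  # parents survive unchanged
    population.sort(key=lambda w: fitness(w, samples), reverse=True)
    history.append(fitness(population[0], samples))
    return population[0], history


# Toy task: logical OR, with a constant third input acting as a bias.
samples = [((0, 0, 1), False), ((0, 1, 1), True),
           ((1, 0, 1), True), ((1, 1, 1), True)]
best, history = evolve(samples, n_weights=3)
print(history[0], history[-1])
```

Because the parents are carried over unchanged, the best score can never go down from one generation to the next, but nothing guarantees it keeps going up, which is the kind of stall described above.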

Ex3. Evolving builders of neural nets to solve problems
See e1,6

Currently the work on this problem is concentrated on building a set of elements that can be randomly arranged into network builders, which can then be scored on their ability to build networks.

The elements need to be resilient to crossover and mutation, so that breeding new genomes gives a viable network builder in most instances.
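
One common way to get that resilience is to make every possible genome decode to something legal. The sketch below is a hypothetical illustration of the idea only: the three operations in `OPS`, the integer gene encoding and the starting layer size are invented for the example and are not the elements being developed here.

```python
import random

OPS = ("add_layer", "widen", "narrow")  # hypothetical building elements


def build(genome):
    """Decode any list of integers into a list of layer sizes.

    Every gene maps to a legal operation via modulo, and every operation is
    safe in every state, so no genome can produce an invalid network.
    """
    layers = [4]
    for gene in genome:
        op = OPS[gene % len(OPS)]
        if op == "add_layer":
            layers.append(layers[-1])
        elif op == "widen":
            layers[-1] += 1
        else:  # "narrow", never below one node
            layers[-1] = max(1, layers[-1] - 1)
    return layers


def breed(mum, dad, rng):
    """One-point crossover followed by a single random gene mutation."""
    cut = rng.randrange(1, min(len(mum), len(dad)))
    child = mum[:cut] + dad[cut:]
    child[rng.randrange(len(child))] = rng.randrange(256)
    return child


rng = random.Random(1)
mum = [rng.randrange(256) for _ in range(8)]
dad = [rng.randrange(256) for _ in range(8)]
child = breed(mum, dad, rng)
print(build(child))
```

However the genes are cut and mutated, `build` still returns a usable list of layer sizes, so breeding never wastes an evaluation on a broken builder.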

Ex4. Building matrix transformations based on experience to handle spatial position, 2D rotation and 3D rotation of patterns for recognition.
See p3

Work on this is still in the planning phase. The idea is that by presenting moving patterns to the input of a visual recognition network, the network can arrange itself into a configuration where new patterns can be recognised independently of where they appear in the field of vision and in any orientation.
It is possible that the result of Ex3 may be used to try to evolve a solution to this problem.

Ex5. Path finder nodes

An investigation into whether a network of nodes can be used to keep track of patterns that have happened frequently with a view to then placing nodes and connections where patterns occur significantly often.
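
As a baseline for comparison, the bookkeeping that the path finder nodes would have to reproduce can be written with an ordinary counter. This is deliberately not a network of nodes: it swaps in Python's `collections.Counter`, and the `min_count` threshold standing in for "significantly often" is an assumption.

```python
from collections import Counter


def frequent_patterns(stream, min_count=3):
    """Return patterns seen at least min_count times, most frequent first."""
    counts = Counter(tuple(p) for p in stream)
    return [p for p, n in counts.most_common() if n >= min_count]


# Hypothetical stream: one pattern seen 4 times, one 3 times, one once.
stream = [(1, 0, 1)] * 4 + [(0, 1, 1)] * 3 + [(1, 1, 1)]
print(frequent_patterns(stream))
```

The patterns returned are the candidates for getting a node and connections of their own. The open question in this experiment is whether nodes alone can keep this tally without a central counter.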







