Self-pruning neural networks are one step towards creating a network topology by actually learning to include or exclude any given weight, thereby giving a connection matrix optimized for the task at hand.
But this is just one part of the goal: to arrive at a
convolutional neural network (CNN) topology from a multi-layer perceptron (MLP)
using optimization strategies. The first is SGD (stochastic gradient descent),
the same optimizer used to optimize the weights themselves.
Creating a CNN-like topology from a fully connected MLP layer is ambitious. After all, nature
had evolution to differentiate cells into this near-optimal solution for vision.
Might it be possible to encourage this by learning the same affordances found in nature?
Edge detection, along with orientation, curves and depth, could in theory be done better with a CNN-like structure.
If the network could learn its own center-surround, CNN-like topology, then we would have done something worthwhile!
SGD can be applied to a threshold for every weight. If the weight exceeds its threshold it is switched on; if not, it is switched off. Simple.
threshold[i][j] -- Thresholds
prethresh[i][j] -- Pre Thresholds
dltthresh[i][j] -- Deltas
(NB: when learning the new thresholds, use a different learning rate, lower than the one for the weights.)
(NBB: you can learn the thresholds using the same error for that layer, in the same cycle as the weights.)
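The scheme above can be sketched as follows. This is a minimal, hypothetical illustration, not the original implementation: the layer sizes, learning rates, and especially the threshold-update surrogate are assumptions (a hard on/off gate has zero true gradient, so some proxy signal is needed). Only the array names `threshold` and `dltthresh` come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# One fully connected layer: a weight matrix and a per-weight threshold.
n_in, n_out = 4, 3
weights   = rng.normal(0.0, 1.0, (n_in, n_out))
threshold = rng.normal(0.0, 0.1, (n_in, n_out))

lr_w = 0.1    # learning rate for the weights
lr_t = 0.01   # lower learning rate for the thresholds (see NB above)

def forward(x):
    # A weight participates only if its magnitude exceeds its threshold.
    mask = (np.abs(weights) > threshold).astype(float)
    return x @ (weights * mask), mask

x = rng.normal(size=(1, n_in))
y_true = np.array([[1.0, 0.0, 0.0]])

y, mask = forward(x)
err = y - y_true                    # the same error drives both updates

# SGD step for the weights (only switched-on weights receive gradient)...
dlt_w = x.T @ err * mask
weights -= lr_w * dlt_w

# ...and, in the same cycle, an SGD step for the thresholds.
# Hypothetical surrogate: raise a threshold (pruning the connection)
# where the error signal through that weight is weaker than average,
# and lower it where the signal is stronger.
signal = np.abs(x.T @ err)
dltthresh = signal.mean() - signal  # positive where the signal is weak
threshold += lr_t * dltthresh
```

The key design point is simply the bookkeeping: one threshold per weight, a separate (lower) learning rate, and both updates computed from the same layer error in the same pass.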
But is this enough to produce a CNN connection matrix? CNNs differ from MLPs in that they use a fixed number of weights applied to a specific region of the input space, convolving the image with the same weights. That means weight sharing: neurons 1...n use the same k weights.
To do weight sharing I have used the thresholds again. Put simply, if a weight is above the threshold of another, the two are equalised.
This has the effect of bringing each weight down to the level of the lowest weight for that threshold. It is one solution for weight sharing; there will be others.
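A toy sketch of that equalisation rule, under one reading of the text: every weight that exceeds a shared threshold is pulled down to the smallest weight in that group. The values and the idea of a single shared threshold per group are assumptions for illustration.

```python
import numpy as np

# Hypothetical group of weights governed by one shared threshold.
weights = np.array([0.9, 0.7, 0.4, 0.2])
shared_threshold = 0.3

above = weights > shared_threshold     # which weights exceed the threshold
weights[above] = weights[above].min()  # equalise them to the lowest

print(weights)  # -> [0.4 0.4 0.4 0.2]
```

Repeated over training, the weights above each threshold converge to a single shared value, which is the CNN-style tying the text is after.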
Of course, SGD is but one strategy for creating optimized connections between our layers; another is genetic algorithms. We have already explored their use here to evolve topologies for an MLP solving the XOR problem.