Simply put, a Neural Network is a nonlinear math function: it takes a set of values and predicts the next one, in an interpolating manner. To make this possible, it is trained on a set of values and historical data already gathered. The network’s outputs are compared to the training set and adjusted accordingly until they match the values of the training set. This way the Neural Net can “guess” what the next value will be.
The inner workings of a Neural Net are the Input, the Hidden Layers and the Output. Inside the Hidden Layers are the Neurons and Dendrites. In short:
- The input is normalized into x;
- x is multiplied by a Weight (Dendrite);
- The Neuron adds all the x * weight products into a Sum;
- The Sum is put through a Sigmoid function, which returns the Neuron Value;
- The Neuron Value is sent on to another layer or to the Output;
Just as the name says, the Input takes data from the world and sends it through the neural network. It’s very important to normalize these values, however; otherwise you couldn’t train the neural network. For that we can use a simple math function. This way we preserve the fluctuation and give the Neural Net a more palatable number to digest.
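The exact formula can vary; a minimal sketch in Python, assuming simple min-max scaling (the function name here is illustrative, not necessarily the article’s exact formula), would be:

```python
def normalize(values):
    # Min-max scaling: map raw inputs into [0, 1] while preserving
    # their relative fluctuation.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # flat data: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

print(normalize([120.0, 135.0, 150.0]))  # -> [0.0, 0.5, 1.0]
```

A raw series like stock prices in the hundreds becomes a series of small numbers between 0 and 1, which the net can digest.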
The line that connects one Neuron to another is the Weight. It is simply a number that the input is multiplied by. It is an important part, however, since the intelligence all lives here. This weight number changes as the net learns and is carried through the iterations of training.
x = Input * Weight
When the weights are first created, it’s important to make them as random as possible inside a range, to avoid creating Nets that are too similar. I found that a range between -4 and 4 is a good starting point.
Sometimes the Neural Net gets stuck and can’t solve a problem past a certain point. I found that increasing the weight range from one generation to another improves the speed. So, after 100 generations (or iterations, or epochs), you can widen the range from [-4, 4] to [-5, 5].
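A quick Python sketch of both ideas, assuming uniform random weights (the helper names below are illustrative):

```python
import random

def random_weights(count, weight_range=4.0):
    # Dendrite weights drawn uniformly from [-range, +range];
    # -4..4 is the starting range suggested above.
    return [random.uniform(-weight_range, weight_range) for _ in range(count)]

def range_for_generation(generation, start=4.0, step=1.0, every=100):
    # Widen the range every `every` generations to help the net
    # escape plateaus: 4 -> 5 after 100 generations, and so on.
    return start + step * (generation // every)
```

For example, `random_weights(3, range_for_generation(250))` draws three weights from [-6, 6].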
Hidden Layers (Neurons)
The Neurons are made of two important parts: the Sum and the Sigmoid. Each layer is made of some number of neurons. One or two layers are usually enough; problems that can’t be solved this way are very rare.
Number of neurons
There is no definitive rule for how many Neurons to put in each layer. There are, however, some rules of thumb that can be used as needed:
- Should be between the size of the input layer and the size of the output layer;
- Neurons = (In + Out) / 2;
- Should be 2/3 the size of the input layer plus the size of the output layer;
- Neurons = (In * 2/3) + Out;
- Should be less than twice the size of the input layer;
- Neurons < In * 2;
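As a quick sketch, the three rules of thumb can be computed like this in Python (the function and key names are illustrative):

```python
def hidden_neuron_heuristics(n_in, n_out):
    # The three rules of thumb above, for sizing a hidden layer.
    return {
        "average":     (n_in + n_out) / 2,      # halfway between in and out
        "two_thirds":  (n_in * 2 / 3) + n_out,  # 2/3 of input size plus output size
        "upper_bound": n_in * 2,                # stay below twice the input size
    }

print(hidden_neuron_heuristics(10, 2))
```

For 10 inputs and 2 outputs, the heuristics suggest somewhere between 6 and about 9 neurons, and never 20 or more.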
Once the Neuron receives the numbers from all the Dendrites, that is, after each input has been multiplied by its weight, everything is added into a single variable. This number is then sent to the sigmoid function.
x = Sum = (Input1 * W1) + (Input2 * W2) + … + (InputN * WN)
The Sigmoid is the activation function. It’s a simple function that squashes any input into an S-shaped curve between 0 and 1. It goes like this:
- Math: 1 / (1 + e^(-x));
- Explanation: 1 divided by the sum of 1 plus Euler’s number (e) raised to the negative input (-x);
- C#: return 1 / (1 + Math.Exp(-x));
x = 1 / (1 + e^(-Input))
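Putting the Sum and the Sigmoid together, a single neuron can be sketched in Python like this (function names are illustrative):

```python
import math

def sigmoid(x):
    # The activation function: 1 / (1 + e^(-x)),
    # mirroring the C# one-liner above.
    return 1 / (1 + math.exp(-x))

def neuron_value(inputs, weights):
    # One neuron: multiply each input by its dendrite weight,
    # add everything into the Sum, then squash it with the sigmoid.
    total = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(total)

print(sigmoid(0))  # -> 0.5
```

A Sum of 0 lands exactly at the middle of the curve (0.5); large positive Sums approach 1 and large negative Sums approach 0.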
There are many training algorithms: Back Propagation, Resilient Propagation, Newton’s method, Quasi-Newton, Levenberg-Marquardt, just to name a few. Most training methods use pure math to calibrate the weights. The method I present below, the Genetic Algorithm, is a little different: it uses the sheer power of many fast iterations to reach a satisfactory result.
This method simply converts the weights into “genes” and iterates the fittest populations until it reaches the desired result.
1. Create a random population;
2. Get the best nets according to their fitness;
3. Create the next generation based on the fittest;
4. Introduce a small percentage of mutation in the new generation;
5. Evaluate the best net of this generation;
6. If the best fitness score is not good enough, go back to Step 2;
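The steps above can be sketched as a generic loop in Python, with the fitness and breeding functions supplied by the caller (all names here are illustrative):

```python
import random

def evolve(population, fitness, breed, target, max_generations=1000):
    # Generic loop over the six steps above. `fitness(net)` scores a net
    # (lower = better here) and `breed(parents, n)` creates n children
    # from the fittest; both are supplied by the caller.
    best = min(population, key=fitness)
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness)      # step 2: rank by fitness
        parents = ranked[:max(2, len(ranked) // 5)]   # keep up to 20% as parents
        best = ranked[0]                              # step 5: best of generation
        if fitness(best) <= target:                   # step 6: good enough? stop
            break
        # steps 3-4: parents carry over; breed() fills the rest
        # (crossover and mutation happen inside breed)
        population = parents + breed(parents, len(population) - len(parents))
    return best

# Toy usage: "nets" are single numbers, evolved toward 42.
pop = [random.uniform(-100.0, 100.0) for _ in range(50)]
jitter = lambda ps, n: [random.choice(ps) + random.uniform(-1.0, 1.0) for _ in range(n)]
best = evolve(pop, lambda x: abs(x - 42.0), jitter, target=0.1)
```

The toy problem shows the shape of the loop; for a real net, the population members are weight lists and `breed` is the crossover described below.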
This calculates how far a given Neural Net is from a desired result (supervised training). That is, given some parameters (input), it has to reach an expected result (output).
So, for each collection of inputs in the data list, you calculate how far each output is from the expected result. I use the following calculation:
Take the squared difference between the absolute Output value and the absolute Expected value; then sum these over all outputs and divide by the output count to get the average:
Sum( (Abs(Output Value) - Abs(Expected Value))² ) / Output Count
*Abs: absolute value, non-negative
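In Python, that calculation can be sketched as (the function name is illustrative):

```python
def fitness(outputs, expected):
    # The error measure described above: squared difference of the
    # absolute values, summed over all outputs, then averaged.
    # Lower scores mean a fitter net.
    total = sum((abs(o) - abs(e)) ** 2 for o, e in zip(outputs, expected))
    return total / len(outputs)

print(fitness([0.9, 0.1], [1.0, 0.0]))  # ~0.01
```

A perfect net scores 0; the training loop keeps breeding until the best score drops below whatever threshold you consider good enough.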
The new neural nets are generated from the fittest of the previous generation. It’s important to include these parents in the new generation too, to avoid going backwards in fitness.
The fittest can amount to anywhere from 20% of your population down to as few as a single pair.
With the list of the fittest, you then create a new neural net, getting the first weight from a random parent, then the second weight from a random parent, and so on. Do this until you have an entire new population of neural nets.
Whenever you are getting the weights for the next generation, you should introduce a small chance of not taking the weight from any parent and generating a random weight instead: a mutation!
This value should not be so small that it brings no change to the population, nor so high that it stops the population’s fitness from advancing because of too much divergence from the parents. A value between 0.5% and 2.0% is usually ideal.
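Crossover and mutation together can be sketched in Python like this (the names and the 1% default are illustrative; the suggested range is 0.5% to 2%):

```python
import random

def breed_child(parents, mutation_rate=0.01, weight_range=4.0):
    # Build one child: each weight is copied from a random parent,
    # except for a small mutation chance where a brand-new random
    # weight is generated instead. `parents` is a list of weight lists.
    weight_count = len(parents[0])
    child = []
    for i in range(weight_count):
        if random.random() < mutation_rate:                    # mutation!
            child.append(random.uniform(-weight_range, weight_range))
        else:
            child.append(random.choice(parents)[i])            # crossover
    return child
```

Called once per child until the new population is full, with the parents themselves carried over unchanged.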
Neural Networks have many diverse applications and are becoming ubiquitous in IT. There are many frameworks and tools that already implement them, but it’s important to know how they operate.
Despite the ominous name, a “Neural” Network is not sentient. It’s a math tool designed to solve specific problems. It powers automation of repetitive and (somewhat) dumb tasks, like comparing images to sort what is a bee and what is a tree. But it can also be used to predict trends in a chart, or to save lives by making medical diagnostics more accurate.
Neural Networks are here to stay: one more increment among the thousands already made to make our lives easier and better.
A workable C# version of the Neural Network described in this article is at this link. I hope you enjoy it!
Cesar Ottani, programmer since 2001, Data Processing degree on FATEC São Paulo. Interested in game development and automation. Also a Rock drummer when possible.
My wife Vick, for her dedication and patience.
My brother Celio, who always pairs with me in my endeavours.
My friend Waz, who reviewed all the text.