Feed-Forward Networks

by Nikolai V. Shokhirev

Up Algorithms

Brief Introduction

There are a lot of good books on Artificial Neural Networks as well as the information on the Internet (see e.g. the references below). I am not going to compete with them. Here I present a simple derivation of back propagation algorithm. 

Definitions

An artificial neural network (ANN) is an interconnected group of artificial neurons:

 

Fig. 1. Neuron

A neuron performs a mathematical transformation of several inputs into one output.

Neurons are usually arranged into layers 

 

Fig. 2. Layer

Several layers form a neural network:

 

Fig. 3. Network

Mathematically, the n-th neuron in l-th layer is described by the following equations

     (1a)

   (2a)

Here w (weights) and b (biases or offsets) are the parameters and F is an activation or transfer function.  

 

Alternative formulation 

Both weights and biases can be combined into one matrix if the 0-th component is added to input vectors

     (3)

 and the weight matrix is extended as follows

     (4)

 With these definitions (1a-2a) become

     (1b)
   (2b)

Both variants can be found in literature. The variant (b) makes formula manipulations rather easier. 

 

 ANN as a non-linear filter

The whole network (Fig. 3.) is just a non-linear transformation (filter), which maps the input space ( X0 = I ) to the output space ( XL = O ). 

     (5)

 In more practical terms neural networks are non-linear modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.

 

Supervised learning

To be able to find patterns, the network parameters ( w and b ) should be adjusted. Ideally, the outputs Ok for each reference input Ik should match the corresponding target values Tk (k = 1, . . . , K - the number of training patterns). Practically this is the best fit in the least squares sense: 

     (6)

Here E is the total (weighted) error,  ωk are the weights, and P = { w, b }.

 

Backpropagation Algorithm

Eq. (7) is a typical optimization problem. Gradient Descent is a general framework for solving optimization problems (not the best one, by the way). The parameters are adjusted in the direction of the steepest descent (opposite to a gradient):

     (7)

In application to ANNs the Gradient Descent is called the Error Backpropagation Algorithm. This somewhat misleading term is just a recursive method for calculation of the derivatives plus the adjustment (7).

Taking into account the specific form for the error function (6) we get

     (8)

All we need, are the derivatives of x(L) with respect to { w, b }.

 

Derivation

Let us calculate a variance (differential) of (1b):

     (9)

Here we used the only basic fact from the calculus:

     (10)

It can be further expanded using (2b)

     (11)

Important: According to the definitions (1, 2) x(l-1) does not depend on w(l), w(l+1), . . ., w(L) (this is obvious, but confusing for less experienced in mathematics). This means that  

     (12)

In particular, for l = L we can omit the second term in (11) and  

     (13)

This is the initial step of a recursion.

Now we can set  dwnm(L) to zero and take in (11) one step down from l = L to l = L - 1 and again apply Eq. (11). Similar to (13) it gives   

     (14)

and for l = L - 2

     (15)

Now we can substitute the derivatives into Eq. (8) which gives the Error Backpropagation Algorithm

Working expressions 

To be continued

     (11)

In

Appendix

A popular choice for the activation function F is

     (3)

 

References

  1. An Introduction to Neural Networks by Leslie Smith.
  2. NEURAL NETWORKS by Christos Stergiou and Dimitrios Siganos 
  3. Neural Networks at your Fingertips by Karsten Kutza 
  4. Artificial neural network From Wikipedia, http://en.wikipedia.org/wiki/Artificial_neural_network 
  5. Neural Nets by Kevin Gurney
  6. Backpropagation From Wikipedia, http://en.wikipedia.org/wiki/Backpropagation 
  7. Joone - Java Object Oriented Neural Engine, Paolo Marrone and the Joone team
  8. Fast Artificial Neural Network Library (FANN), http://leenissen.dk/fann/

Downloads

Up Algorithms

Home | Resumé |  Shokhirev.com |  Computing |  Links Publications

©Nikolai V. Shokhirev, 2004-2006