Python 2.7: Setting up Neural Network with PyBrain

Today, I’m experimenting with machine learning concepts in Python. For this purpose I’m using PyBrain. If you would like a better idea of Python itself, I suggest having a quick glance at posts 1-10 in the Python category.

PyBrain is a Machine Learning library for Python. PyBrain stands for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library.

For a complete installation guide, see:

http://pybrain.org/docs/

In the examples below, based on the pybrain.org tutorial, I’m creating a network, building a dataset and training the network on that dataset.

Installing PyBrain:

$ git clone git://github.com/pybrain/pybrain.git
$ cd pybrain
$ python setup.py install

Building a network

Let’s build a simple network. Assume our network accepts 2 inputs and is expected to generate 1 output. Let’s experiment with the following network structure:

  • one input layer (2 neurons)
  • one hidden layer (3 neurons)
  • one output layer (1 neuron)

from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2, 3, 1)

Activating Network

The network is initialized with random weights. We can test the output of the network by activating it.

Let’s pass the inputs 2 and 3 to our network:

print net.activate([2, 3])

Customizing your network structure

You can inspect the network structure using the following print statements:

from pybrain.tools.shortcuts import buildNetwork

# building network
net = buildNetwork(2, 3, 1)

# activating network on input 2, 3
print net.activate([2, 3])

# will display the network structure
print net
"""
output:
FeedForwardNetwork-8
   Modules:
    \[, , , \]
   Connections:
    \[ 'out'>,  'hidden0'>,  'out'>,  'hidden0'>\]
"""

# output: <LinearLayer 'in'>
print net['in']

# output: <LinearLayer 'out'>
print net['out']

# output: <SigmoidLayer 'hidden0'>
print net['hidden0']

When using buildNetwork the hidden layer is constructed with a sigmoid squashing function. Let’s assume you would like to change the hidden layer to a different type, e.g. the hyperbolic tangent function. You can do so by supplying the hidden layer class as an argument to buildNetwork:

from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer
from pybrain.structure import SoftmaxLayer

# the hidden layer of network 1 is constructed
# with the hyperbolic tangent activation function
net1 = buildNetwork(2, 3, 1, hiddenclass=TanhLayer)

# network 2 uses a tanh hidden layer and
# a softmax output layer
net2 = buildNetwork(2, 3, 1, hiddenclass=TanhLayer, outclass=SoftmaxLayer)

# network 3 additionally uses a bias unit
net3 = buildNetwork(2, 3, 1, bias=True)
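
To confirm which layer types buildNetwork actually used, you can print the individual layers the same way as before. A quick check, reusing the indexing shown above:

# confirm the layer types picked by buildNetwork
print net1['hidden0']   # e.g. <TanhLayer 'hidden0'>
print net2['out']       # e.g. <SoftmaxLayer 'out'>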

Building a DataSet

SupervisedDataSet is the class used for standard supervised learning. It supports input and target fields whose dimensions are defined when the dataset is created.

from pybrain.datasets import SupervisedDataSet
# dataset supports 2-d input and 1-d target
ds = SupervisedDataSet(2, 1)


# data set for the XOR function
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

# will print dataset length
# output: 4
print len(ds)

# iterating over the (input, target) pairs in the dataset
for input, target in ds:
    print input, target

# will print inputs of data set
print ds['input']

# will print targets of dataset
print ds['target']

If you want to clear the dataset, you can use:

# clear the dataset
ds.clear()

Training the Neural Network on the Dataset

Now that we’ve got the network ready, it’s time to train it. We will do so using the backpropagation algorithm via the BackpropTrainer class.

All we have to do is provide our network instance and dataset instance to a trainer instantiated from BackpropTrainer, and then run its train method.

from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.structure import TanhLayer

# building network to be trained on XOR output
net = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)

# dataset supports 2-d input and 1-d target
ds = SupervisedDataSet(2, 1)

# data set for the XOR function
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

trainer = BackpropTrainer(net, ds)

# this will train the network for one full epoch and
# return a double proportional to the error
print trainer.train()

# will continue training until the results converge;
# returns the training and validation errors per epoch
print trainer.trainUntilConvergence()

The example above just illustrates these functions and their usage and won’t lead to good results. The problem is that trainUntilConvergence relies on validationProportion, whose default value is 0.25, meaning 25% of the dataset is set aside as a validation set. The training and validation sets are disjoint, and with only four samples, holding back 25% of them leads to a badly trained network.
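
If you still want to rely on trainUntilConvergence, you can make the split and the epoch limit explicit. A rough sketch, assuming the trainer from the block above and the standard validationProportion / maxEpochs keyword arguments:

# hold back 25% of the samples for validation and stop after at most 1000 epochs;
# trainUntilConvergence returns the per-epoch training and validation errors
trainingErrors, validationErrors = trainer.trainUntilConvergence(
    validationProportion=0.25, maxEpochs=1000)
print trainingErrors, validationErrors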

You can solve the XOR problem using a different approach:

# learn XOR with a neural network

import pybrain
from pybrain.datasets import SupervisedDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer

ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

net = buildNetwork(2, 4, 1, bias=True)

# train explicitly for 1000 epochs on the XOR dataset
trainer = BackpropTrainer(net, learningrate=0.01, momentum=0.99)
trainer.trainOnDataset(ds, 1000)

# print the average error of the trained network on the dataset
print trainer.testOnData()

print net.activate((1, 1))

As you can notice, we are training our network explicitly for 1000 epochs rather than until the results converge with the expected output. You can validate the results of the network by rounding the output:

# output: 0.0
# output: 1.0
# output: 1.0
# output: 0.0

print round(net.activate((0, 0)))
print round(net.activate((0, 1)))
print round(net.activate((1, 0)))
print round(net.activate((1, 1)))
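
As an additional sanity check you can loop over the whole dataset and compare each rounded activation with its target. A minimal sketch, reusing the ds and net from above:

# compare the rounded network output with the target for every sample
for inp, target in ds:
    print inp, '->', round(net.activate(inp)[0]), '(expected %s)' % target[0]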

Saving the neural network to file and loading it

After searching the internet for an XOR PyBrain example, I found some reliable code which, unlike the example discussed above, does not hold back part of the tiny dataset for validation. As there’s no validation set, convergence cannot be tested automatically, so we train for a fixed number of epochs. In practice the results did converge.

https://github.com/thedanschmidt/PyBrain-Examples/blob/master/xor.py

# learn XOR with a neural network, saving the learned parameters

import pybrain
from pybrain.datasets import SupervisedDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
import pickle

if __name__ == "__main__":
    ds = SupervisedDataSet(2, 1)
    ds.addSample((0, 0), (0,))
    ds.addSample((0, 1), (1,))
    ds.addSample((1, 0), (1,))
    ds.addSample((1, 1), (0,))

    net = buildNetwork(2, 4, 1, bias=True)

    try:
        # reuse the previously trained network if it was saved
        f = open('_learned', 'rb')
        net = pickle.load(f)
        f.close()
    except IOError:
        # no saved network yet: train it and save it for next time
        trainer = BackpropTrainer(net, learningrate=0.01, momentum=0.99)
        trainer.trainOnDataset(ds, 1000)
        trainer.testOnData()
        f = open('_learned', 'wb')
        pickle.dump(net, f)
        f.close()

    print net.activate((1, 1))

The beauty of the code above is that the first time you run it, the _learned file does not exist, so the code jumps to the except branch, which trains the network and populates the _learned file. Afterwards we activate the network with a certain input in the main scope.

In any subsequent execution of the code, the _learned file will be opened successfully and the training part will be skipped. We can reuse our network directly from the file!

Note the use of the pickle library, which supports dumping objects to a file and loading them back.
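
Alternatively, PyBrain ships its own XML-based serializer in pybrain.tools.customxml, which you could use instead of pickle. A small sketch, assuming NetworkWriter and NetworkReader are available in your PyBrain version:

from pybrain.tools.customxml import NetworkWriter, NetworkReader

# save the trained network to an XML file
NetworkWriter.writeToFile(net, 'net.xml')

# later, restore the network from that file
net = NetworkReader.readFrom('net.xml')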