To see the outputs of all the patterns, we need to copy the training.dat file to the test.dat file and rerun the simulator in Test mode. Remember to delete the expected output field once you copy the file.
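If you would rather automate this step, the short program below is one way to do it. It is only a sketch, not part of the simulator: it assumes the layout used in this example, with each pattern consisting of 35 input values followed by 3 expected output values, and it writes one trimmed pattern per line to test.dat.

// strip_outputs.cpp -- a sketch, not part of the simulator: copy training.dat
// to test.dat, keeping the 35 input values of each pattern and dropping the
// 3 expected output values that follow them.
#include <fstream>
#include <iostream>

int main() {
    const int inputs_per_pattern = 35;   // 5x7 pixel grid
    const int outputs_per_pattern = 3;   // expected output values to drop

    std::ifstream in("training.dat");
    std::ofstream out("test.dat");
    if (!in || !out) {
        std::cerr << "could not open training.dat or test.dat\n";
        return 1;
    }

    double value;
    int column = 0;
    while (in >> value) {
        if (column < inputs_per_pattern)
            out << value << " ";
        if (++column == inputs_per_pattern + outputs_per_pattern) {
            out << "\n";                 // one trimmed pattern per line
            column = 0;
        }
    }
    return 0;
}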
Running the simulator in Test mode (0) shows the following result in the output.dat file:
for input vector:
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 1.000000 0.000000 1.000000 0.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
output vector is:
0.005010 0.002405 0.000141
-----------
for input vector:
1.000000 0.000000 0.000000 0.000000 1.000000
0.000000 1.000000 0.000000 1.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 1.000000 0.000000 1.000000 0.000000
1.000000 0.000000 0.000000 0.000000 1.000000
output vector is:
0.001230 0.997844 0.000663
-----------
for input vector:
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
output vector is:
0.995348 0.000253 0.002677
-----------
for input vector:
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
output vector is:
0.999966 0.000982 0.997594
-----------
for input vector:
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
output vector is:
0.999637 0.998721 0.999330
-----------
The training patterns are learned very well. A larger tolerance would have allowed the learning to complete in fewer cycles, though with less precise outputs. What happens if we present a foreign character to the network? Let us create a new test.dat file with two entries, for the letters M and J, as follows:
1 0 0 0 1 1 1 0 1 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1
0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1
The results should show each foreign character being placed in the category closest to it. The middle layer of the network acts as a feature detector. Since we specified five middle-layer neurons, we have given the network the freedom to define five features of the input training set with which to categorize inputs. The results in the output.dat file are as follows.
for input vector:
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 1.000000 0.000000 1.000000 1.000000
1.000000 0.000000 1.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
1.000000 0.000000 0.000000 0.000000 1.000000
output vector is:
0.963513 0.000800 0.001231
-----------
for input vector:
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 0.000000 1.000000 0.000000 0.000000
0.000000 1.000000 1.000000 1.000000 1.000000
output vector is:
0.999469 0.996339 0.999157
-----------
In the first pattern, the M is categorized as an H, whereas in the second pattern, the J is categorized as an I, as expected. The first result seems reasonable, since H and M share many pixels.
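This nearest-category reading can be done mechanically by comparing the output vector with the binary codes the network was trained to produce and picking the closest one. The fragment below is a sketch of that idea and is not part of the simulator; only the codes for H (1 0 0) and I (1 1 1) are taken from the results above, and the distance used is simply the sum of absolute differences.

// categorize.cpp -- a sketch, not part of the simulator: map an output
// vector to the nearest known category code.
#include <cmath>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Category {
    std::string letter;
    double code[3];   // binary code the network was trained to produce
};

std::string nearest_category(const std::vector<Category>& categories,
                             const double output[3]) {
    std::size_t best = 0;
    double best_dist = 1e30;
    for (std::size_t i = 0; i < categories.size(); ++i) {
        double dist = 0.0;
        for (int j = 0; j < 3; ++j)
            dist += std::fabs(output[j] - categories[i].code[j]);
        if (dist < best_dist) { best_dist = dist; best = i; }
    }
    return categories[best].letter;
}

int main() {
    // Codes for H and I are taken from the training results shown earlier.
    std::vector<Category> categories = {
        {"H", {1, 0, 0}},
        {"I", {1, 1, 1}},
    };
    double m_output[3] = {0.963513, 0.000800, 0.001231};  // M from output.dat
    double j_output[3] = {0.999469, 0.996339, 0.999157};  // J from output.dat
    std::cout << "M falls in category " << nearest_category(categories, m_output) << "\n";
    std::cout << "J falls in category " << nearest_category(categories, j_output) << "\n";
    return 0;
}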
Other Experiments to Try
There are many other experiments you could try in order to get a better feel for how to train and use a backpropagation neural network.
We will return to the same example after enhancing the simulator with momentum and noise addition capability.
A simple change to the training law that sometimes results in much faster training is the addition of a momentum term. The training law for backpropagation as implemented in the simulator is:
Weight change = Beta * output_error * input
Now we add a term to the weight change equation as follows:
Weight change = Beta * output_error * input + Alpha * previous_weight_change
The second term in this equation is the momentum term. In the absence of error, the weight change would be a constant multiple of the previous weight change; in other words, the weight change continues in the direction it was already heading. The momentum term is an attempt to keep the weight change process moving, and thereby avoid getting stuck in local minima.
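As a rough sketch of what such an update rule looks like in code (this is not the simulator's actual implementation, and the function and variable names here are only illustrative), each weight keeps its previous change in a past_delta array:

// A sketch of a weight update with momentum -- not the simulator's actual
// code. beta is the learning rate, alpha the momentum parameter, and
// past_delta[] remembers each weight's change from the previous cycle.
void update_weights_with_momentum(float *weights, float *past_delta,
                                  const float *inputs,
                                  const float *output_errors,
                                  int num_inputs, int num_outputs,
                                  float beta, float alpha)
{
    for (int j = 0; j < num_outputs; ++j) {
        for (int i = 0; i < num_inputs; ++i) {
            int k = j * num_inputs + i;   // index into the flattened weight matrix
            // weight change = Beta * output_error * input
            //                 + Alpha * previous_weight_change
            float delta = beta * output_errors[j] * inputs[i]
                          + alpha * past_delta[k];
            weights[k] += delta;
            past_delta[k] = delta;        // saved for the next cycle
        }
    }
}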
The affected files for this change are the layer.cpp file, to modify the update_weights() member function of the output_layer class, and the main backprop.cpp file, to read in the value for alpha and pass it to the member function. Some additional storage is needed for the previous weight changes, and this affects the layer.h file. The momentum term could be implemented in two ways: