Using pattern neural network's weights in my own forward propagation code

Hello,
I have a problem using the weights of a pattern net. What I'm doing is training the net with my own inputs and targets, and after that testing the net with a different input. Up to that point everything is OK; in other words, I'm getting the correct answer from the net. However, what I want to do is use the net's input weights and layer weights as parameters for my own implementation of forward propagation (just dot products). While coding I noticed that the pattern net doesn't use the common sigmoid function but the tansig() function instead, and also that the layers have some MATLAB-specific properties. Finally, I coded the forward propagation like this:
function [ o ] = fwprop(input, IW, b1, LW, b2)
    % Hidden layer
    a = dotprod(IW, input);
    a = netsum({a, b1});
    a = tansig(a);
    % Output layer
    o = dotprod(LW, a);
    o = netsum({o, b2});
    o = tansig(o);
end
where:
input: test input. IW: Input Weights. LW: Layer Weights. b1: bias vector 1. b2: bias vector 2.
I'm using dotprod and netsum because I noticed that the pattern net uses exactly those functions. Even though I'm using them, I keep getting the same wrong results. I wonder if there is some modification in the way MATLAB computes the forward propagation.
Thanks in advance.
  1 Comment
Greg Heath on 8 Jul 2013
Never use the same variable on both sides of an equation. It is VERY CONFUSING: what a and o mean depends on which part of the program you are looking at ... a VERY dangerous practice for several obvious reasons, especially when understanding or modifying old and/or long code.


Accepted Answer

Greg Heath on 8 Jul 2013
ALWAYS use TANSIG for hidden layers and normalize inputs so that the means are approximately zero. Let the levels be controlled by b1. I favor zero-mean/unit-variance inputs via mapstd or zscore BEFORE calling CONFIGURE or TRAIN. This not only prevents the learning from being dominated by low significance inputs with large magnitudes (possibly, via sigmoid saturation), it also allows easy recognition of outliers which may require additional preprocessing (e.g., truncation or deletion).
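The standardization step above can be sketched as follows (the data here is purely illustrative; mapstd operates row-wise, treating each row as one variable):

```matlab
% Standardize inputs to zero mean / unit variance BEFORE configure/train.
% x is an R-by-N matrix: R variables (rows), N samples (columns).
x = [ 1   2   3   4 ;
     10  20  30  40 ];
[xs, xsettings] = mapstd(x);   % each row of xs now has mean 0, variance 1

% Outliers are now easy to spot: standardized magnitudes well above ~3
% are suspicious and may need truncation or deletion.
suspect = abs(xs) > 3;
```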
For pattern recognition the target matrix should contain unit vector columns with 1 unity component and the rest zeros. VEC2IND and IND2VEC allow easy transformation to and from the integer class indices. The corresponding output transfer function should be PURELIN, LOGSIG or SOFTMAX. This allows the outputs to be interpreted as consistent estimates of posterior probabilities, conditional on the input, even though only SOFTMAX constrains the outputs to sum to 1.
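A small illustration of the index/unit-vector conversion (ind2vec returns a sparse matrix, hence the full() for display):

```matlab
classidx = [1 3 2 3];               % integer class labels for 4 samples
target   = full(ind2vec(classidx)); % 3-by-4 target matrix of 0/1 columns
% target =
%      1     0     0     0
%      0     0     1     0
%      0     1     0     1
recovered = vec2ind(target);        % back to the indices [1 3 2 3]
```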
However, MATLAB defaults are different.
1. There is no checking for outliers.
2. Using the default command
net = patternnet % NO SEMICOLON!
will list all of the net default properties.
3. In particular
inputprocessfunctions = net.inputs{1}.processFcns
outputprocessfunctions = net.outputs{2}.processFcns
% ans = 'removeconstantrows' 'mapminmax'
removes constant variables and maps inputs and outputs to the closed range [-1, 1]
also
trainfunction = net.trainFcn
% ans = 'trainscg'
(which I know works well with the unscaled integer targets of zeros and ones) probably works well with the scaled integer targets of minus ones and plus ones.
in addition
layer1transferfunction = net.layers{1}.transferFcn
layer2transferfunction = net.layers{2}.transferFcn
% ans = 'tansig' (for both layers)
which is fine for the [-1, 1] default scaled output.
Your code does not allow for these defaults.
There are several ways to go. You could override all of the MATLAB defaults that are not compatible with the explanation in my first two paragraphs. However, the best thing to do is to accommodate the MATLAB defaults as much as possible.
My suggestion:
1. If there is a possibility of outliers
a. Use ZSCORE to standardize inputs and check for outliers using MINMAX. Truncate or delete outliers depending on your particular problem. Constant rows with zero variance will be converted to rows of zeros.
b. Once the outlier question is resolved you can
i. Either keep the standardized variables
ii. Or transform back to the original variables.
2. Convert outputs to the unit column format (help ind2vec)
3. Initialize the RNG in case you want to duplicate the following runs
4. Use PATTERNNET with defaults.
5. If you are going to design multiple nets in a loop over random initial weights and/or include an outer loop to search multiple candidates for the best choice for number of hidden nodes,
a. Save the initial state of the RNG before each design.
b. Initialize the weights using CONFIGURE before using TRAIN
6. If you are performing classification or pattern recognition, the nets are ranked by error rate, NOT the mse performance function!
RNGstate(i,j) = rng;                      % save RNG state for reproducibility
net = configure(net, input, target);      % initialize the weights
[net, tr, output] = train(net, input, target);
trueclass = vec2ind(target);
N = length(trueclass);
assignedclass = vec2ind(output);
Nerr = sum(assignedclass ~= trueclass);
PctErr(i,j) = 100*Nerr/N;                 % classification error rate (%)
8. If you need a breakdown of train, val, and test errors for each class, use the training record tr. See the properties of tr via the command
tr = tr % NO SEMICOLON: displays the training record fields
9. Once a design is chosen, rerun and save
a. The input and output settings from using mapminmax on the training data
b. The weights
10. To use the weights with analytic formulas instead of the net and/or sim function:
a. Normalize inputs and outputs using the mapminmax settings of the training data.
b. Use the double default tansig formula to get normalized outputs from normalized inputs
c. Use the output settings to get the unnormalized output
d. Compare with the original obtained from the net.
11. First try to understand by using the default value for number of hidden nodes and only one choice of initial weights. Regardless of whether it is a good design or not, compare the two methods. Once they match, you can search for the best design using a double loop.
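Putting steps 10a–10d together, a manual forward pass might look like the sketch below. It assumes a single-hidden-layer patternnet with the default processing functions, that no constant rows were removed (so mapminmax is the only processing step that changes values), and that the mapminmax settings are the last entry of processSettings:

```matlab
% Extract weights and normalization settings from a trained net.
IW = net.IW{1,1};  b1 = net.b{1};         % input-to-hidden weights and bias
LW = net.LW{2,1};  b2 = net.b{2};         % hidden-to-output weights and bias
inSettings  = net.inputs{1}.processSettings{end};   % mapminmax settings (assumed last)
outSettings = net.outputs{2}.processSettings{end};

xn = mapminmax('apply', x, inSettings);    % normalize the raw input x
a  = tansig(IW*xn + b1);                   % hidden layer
yn = tansig(LW*a  + b2);                   % output layer (normalized)
y  = mapminmax('reverse', yn, outSettings); % un-normalize

% y should now match the net's own output, net(x), to within rounding.
```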
Hope this helps.
Thank you for formally accepting my answer
Greg

More Answers (1)

Clayder Gonzalez Cadenillas
Thanks Greg,
I was forgetting to use the mapminmax function in my forward propagation. It has to be called at the beginning with the 'apply' parameter and at the end with the 'reverse' parameter. Now my values are the same as those from MATLAB's implementation.
Thanks again.
