Regularization in Neural Networks, help needed

Hello. I'm having trouble getting regularization to work with the Neural Network Toolbox for a classification problem.
I am using R2011b. Here is the section of code I'm using to train the network:
% training-----------------------------------------------------------------
% Create a Pattern Recognition Network
fprintf('Training Neural Net\n');
hiddenLayerSize = 300;
net = patternnet(hiddenLayerSize);
% Setup Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
net.inputs{1}.processFcns = {};   % remove default input preprocessing
net.outputs{2}.processFcns = {};  % remove default output postprocessing
net.performFcn = 'msereg';        % regularized performance: ratio*mse + (1-ratio)*msw
net.performParam.ratio = 0.5;
%net.trainParam.max_fail = 20;
[net,tr] = train(net,data',labels');
save('net.mat', 'net');
nnoutputs = net(double(data'));
figure; plotconfusion(labels',nnoutputs);
fprintf('Training complete\n');
So far the network works and I'm getting good classification accuracy, both on the test set during training and on another test set that I re-test the network on afterwards. Note that I have removed the preprocessing and postprocessing functions.
However, I want to visualize which features my network is actually using. Among my features, I have added one that is just random noise, and one that is random noise divided by 100, to see how the other features compare against them.
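(For reference, a minimal sketch of how such probe features might be appended, assuming data is an N-by-F matrix with one observation per row:)
% Append two hypothetical probe features: independent random noise,
% and random noise scaled down by a factor of 100.
N = size(data, 1);
data = [data, randn(N, 1), randn(N, 1)/100];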
When I look at the input weights of my NN, the weights connected to the random-noise/100 feature are huge, several times bigger than any other weights. If regularization were working, this shouldn't happen, right?
In my code, I have set performFcn to msereg. However, in online code examples I see that their trainFcn is not trainscg (which is what my net.trainFcn seems to be using) but instead something like trainbr. I can't use trainbr since I run out of RAM. I am basing this on this page: http://www.mathworks.com/support/solutions/en/data/1-17WXC/?solution=1-17WXC
I have tried changing net.performParam.ratio to 0.01 and to 1. It doesn't seem to change the performance of the network other than the MSE numbers changing. I have also plotted a bar graph of the sum of the absolute values of the weights connecting each feature to the hidden layer (net.IW{1,1}), and this bar graph does not change much when I change the ratio (i.e., the random-noise weight is still gigantic). When I set the ratio to 0 so that the classification accuracy becomes horrible, the shape of this bar graph remains the same, which is also puzzling.
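(Roughly like this; a sketch, assuming net.IW{1,1} holds the input-to-hidden weight matrix as hidden-units-by-features:)
% Sum of absolute input-to-hidden weights per input feature.
W = net.IW{1,1};                 % (hidden units) x (features)
figure;
bar(sum(abs(W), 1));
xlabel('Input feature index');
ylabel('Sum of |input weights|');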
I would also like to mention that the random-noise feature that was not divided by 100 has weights similar in magnitude to my real features. Only the one divided by 100 seems to be inflated, which leads me to believe the weights grew in order to scale that feature back up. However, since regularization is "on", this shouldn't happen, because the random-noise feature adds no useful information. Both the random-noise and random-noise/100 weights should be pushed toward 0 (I think).
Some additional information: the network trains in about 55 epochs and stops due to an increase in MSE on the validation set, so the early stopping seems to be working fine.
Can anyone offer some insight into what is happening here? Can I not use msereg with trainscg?

Accepted Answer

Greg Heath on 28 Jul 2013
The number of hidden nodes, H = 300, is more than an order of magnitude too large. Try much smaller values.
For regularization use TRAINBR with defaults.
help trainbr
Note that it does not use a validation set.
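For example, something along these lines (a sketch, not tested against your data; the hidden layer size is illustrative):
% Bayesian regularization; trainbr sets the regularization trade-off
% automatically and does not use a validation split.
net = patternnet(20);           % much smaller than 300 hidden nodes
net.trainFcn = 'trainbr';
net.divideFcn = 'dividetrain';  % all data used for training
[net, tr] = train(net, data', labels');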
To rank the effectiveness of each input, just use randperm to scramble its order and record the resulting increase in MSE.
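A sketch of that test (hypothetical variable names; assumes data is N-by-F with observations in rows and labels is N-by-C):
% Permutation test: scramble one feature at a time and record the
% increase in MSE; larger increases indicate more important inputs.
baseMSE = perform(net, labels', net(data'));
dMSE = zeros(1, size(data, 2));
for k = 1:size(data, 2)
    shuffled = data;
    shuffled(:, k) = data(randperm(size(data, 1)), k);  % scramble feature k
    dMSE(k) = perform(net, labels', net(shuffled')) - baseMSE;
end
[~, ranking] = sort(dMSE, 'descend');   % largest increase = most important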
Trying to interpret weights can be misleading and is not highly recommended.
Hope this helps.
Thank you for formally accepting my answer
Greg
  1 Comment
ZY on 31 Jul 2013
Thanks for the help. I found that the original default setup (trainscg with the default mse performance function) already has a regularization parameter; it was just set to 0. I tried it on the crab dataset and it seems to work. I'm going to keep testing and try out some of the methods you mentioned too.
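(That parameter is net.performParam.regularization on the mse performance function; a sketch of enabling it, where 0.1 is just an example value:)
% Keep the default trainscg/mse setup but turn on the built-in
% regularization term (0 = none; values nearer 1 weight the mean
% squared weights more heavily).
net = patternnet(20);                   % illustrative size
net.performParam.regularization = 0.1;  % default is 0
[net, tr] = train(net, data', labels');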
