Analyzing results and output plots of neural network

Hi,
I'm new to neural networks and need help. The input to my simple network consists of 15 classes with 7 samples each, i.e. 15 x 7 = 105 column vectors of 20 elements each.
[R,Q1] = size(P); % [20 105]
[SN,Q2] = size(T); % [15 105]
if Q1 ~= Q2
error('Training:invalidTrainingAndDesired', ...
'The number of input vectors and desired output do not match');
end
mynet1 = newff(P,T, [20 15], {'tansig' 'tansig' }, 'trainlm');
mynet1.trainParam.epochs = 5000;
mynet1.trainParam.goal = 0.01; %*mean(var(T'))/100 ;
mynet1.performFcn ='mse';
mynet1.trainParam.lr = 0.01;
mynet1.divideFcn = 'dividerand'; %# how to divide data
mynet1.divideParam.trainRatio = 70/100; %# training set
mynet1.divideParam.valRatio = 15/100; %# validation set
mynet1.divideParam.testRatio = 15/100; %# testing set
mynet1.trainParam.show = 100;
mynet1.trainparam.mc = 0.95;
[mynet1,tr,Y,E] = train(mynet1,P,T);
y = sim(mynet1,P);
plotconfusion(T,y);
output =sim(mynet1,P(:,24));
output
1- I want to understand some of the plots generated by the neural network, such as the regression plot. I think something is wrong. Can I tell that from this plot? Why do the data points look like that? Is it normal, or what does it indicate?
2- As for the confusion matrix that is also generated by the network, do the percentages in the last row and last column show the recognition rate? What else can show the recognition rate?
3- I can't get the resulting class from sim. The output vector should be all 0s with a 1 only at the target class, but the values returned by sim for mynet1 are real numbers because of the tansig transfer function applied to the output layer. How can I convert the output back to the format I specified at the beginning?
4- Why can mynet1 never reach the performance goal (0.01)? It always stops at around 0.05.
5- When I doubled my dataset, I couldn't get a good result even though I tried changing the number of hidden layers and the number of neurons in them many times. Why? What is a suitable backpropagation training function? Or did I create mynet1 the wrong way? How do I get the best classification result?
6- When should I use minmax(P) instead of passing P directly?
I hope to hear from someone ASAP. Thank you in advance.

Accepted Answer

Greg Heath on 5 Nov 2012
Edited: Greg Heath on 5 Nov 2012
% Arbitrary data for checking syntax
P = randn(20,105);
T = repmat(eye(15),1,7);
% NAIVE CONSTANT MODEL MSES FOR REFERENCE
MSE00 = mean(var(T',1)) % 0.0622 Biased
MSE00a = mean(var(T')) % 0.0628 Unbiased
> mynet1 = newff(P,T, [20 15], {'tansig' 'tansig' }, 'trainlm');
No.
For pattern recognition or classification with c classes it is better to
1. Use columns of eye(c) as targets.
2. Use SOFTMAX (or LOGSIG) as the output activation function
3. Use TRAINSCG as the training function
You can use NEWPR (Pattern Recognition). NEWPR calls NEWFF with training function TRAINSCG and classification specific plot functions. However, it uses TANSIG as the output activation instead of SOFTMAX OR LOGSIG. Therefore you can use
net = newpr(P,T, 20);
net.layers{net.numLayers}.transferFcn = 'softmax'
However, with the same RNG setting, you should be able to get the exact same net using
net = newff(P,T, 20, {'tansig' 'softmax' }, 'trainscg');
net.plotFcns = {'plotperform','plottrainstate','plotconfusion','plotroc'};
> mynet1.trainParam.goal = 0.01; %*mean(var(T'))/100 ;
No.
T = repmat(eye(15),1,7); % Example
mean(var(T'))/100 % = 6.2821e-004
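To put those numbers side by side, here is a minimal sketch in base MATLAB that ties the training goal to the naive constant-model MSE instead of fixing it at an absolute value. It assumes 0/1 targets laid out like the example T above; the names MSEgoal and fracOfReference are illustrative only.

```matlab
% Reference MSE of a constant model that always outputs the row means of T.
T = repmat(eye(15), 1, 7);        % 15 classes, 7 samples each (example targets)
MSE00  = mean(var(T', 1));        % biased estimate (divides by N)
MSE00a = mean(var(T'));           % unbiased estimate (divides by N-1)

% A goal of 1% of the reference MSE, as in the commented-out expression.
MSEgoal = MSE00a / 100;

% The fixed goal of 0.01 expressed as a fraction of the reference MSE.
fracOfReference = 0.01 / MSE00a;

fprintf('MSE00 = %.4f, MSE00a = %.4f\n', MSE00, MSE00a);
fprintf('1%% goal = %.4e, 0.01 is %.1f%% of the reference\n', ...
        MSEgoal, 100 * fracOfReference);
```

Under these assumptions a goal of 0.01 is only about 16% of the reference MSE, and a final MSE near 0.05 would still be roughly 80% of it, i.e. not much better than the constant model.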
> mynet1.trainParam.epochs = 5000;
> mynet1.performFcn ='mse';
> mynet1.trainParam.lr = 0.01;
> mynet1.divideFcn = 'dividerand'; %# how to divide data
> mynet1.divideParam.trainRatio = 70/100; %# training set
> mynet1.divideParam.valRatio = 15/100; %# validation set
> mynet1.divideParam.testRatio = 15/100; %# testing set
> mynet1.trainParam.show = 100;
> mynet1.trainparam.mc = 0.95;
Why not simplify life and just use the defaults?
> [mynet1,tr,Y,E] = train(mynet1,P,T);
> y = sim(mynet1,P);
The last calculation is unnecessary because y is the same as the Y returned by train, and E = T-Y.
trueclass = vec2ind(T);
assignedclass = vec2ind(y);
Now you can calculate error rates of all classes and the overall error rate.
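As a rough illustration of that calculation, the sketch below uses only base MATLAB. The small T and y matrices are made-up data, not output from the network above, and max over each column is used here as the "pick the largest output" step so that the example runs without the toolbox.

```matlab
% Made-up example: 3 classes, 6 samples (columns). Not real network output.
T = [1 0 0 1 0 0;
     0 1 0 0 1 0;
     0 0 1 0 0 1];
y = [0.8 0.2 0.1 0.3 0.1 0.2;
     0.1 0.7 0.3 0.6 0.8 0.1;
     0.1 0.1 0.6 0.1 0.1 0.7];

[~, trueclass]     = max(T, [], 1);   % row index of the 1 in each column
[~, assignedclass] = max(y, [], 1);   % row index of the largest output

errors     = (assignedclass ~= trueclass);
overallErr = mean(errors);            % fraction of all samples misclassified

c = size(T, 1);
classErr = zeros(1, c);
for k = 1:c
    classErr(k) = mean(errors(trueclass == k));  % error rate within class k
end

disp(overallErr)
disp(classErr)
```

In this toy data only the fourth sample is misclassified, so class 1 has an error rate of 0.5 and the other classes have none.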
> plotconfusion(T,y);
> output =sim(mynet1,P(:,24));
> output
> 1- I want to understand some of the plots generated by the neural network, such as the regression plot. I think something is wrong. Can I tell that from this plot? Why do the data points look like that? Is it normal, or what does it indicate?
Those plots are more appropriate for regression or curve-fitting (e.g., NEWFIT). They do not yield much information for classification. See my code above.
> 2- As for the confusion matrix that is also generated by the network, do the percentages in the last row and last column show the recognition rate? What else can show the recognition rate?
See the following and run the examples:
help confusion
doc confusion
help plotconfusion
doc plotconfusion
> 3- I can't get the resulting class from sim. The output vector should be all 0s with a 1 only at the target class, but the values returned by sim for mynet1 are real numbers because of the tansig transfer function applied to the output layer. How can I convert the output back to the format I specified at the beginning?
See my code above
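If the 0/1 format itself is wanted rather than just the class index, a minimal sketch in base MATLAB follows. The matrix y is made-up stand-in data, and yhard is a hypothetical variable name.

```matlab
% Made-up real-valued outputs: 3 classes, 4 samples (columns).
y = [0.9 -0.2 0.1 0.4;
     0.3  0.8 0.2 0.5;
    -0.5  0.1 0.7 0.1];

[c, N] = size(y);
[~, winner] = max(y, [], 1);              % index of the largest output per column

yhard = zeros(c, N);
yhard(sub2ind([c N], winner, 1:N)) = 1;   % put a single 1 in each column

disp(winner)
disp(yhard)
```

The toolbox function ind2vec performs a similar conversion from indices, but it returns a sparse matrix.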
> 4- Why can mynet1 never reach the performance goal (0.01)? It always stops at around 0.05.
Make changes and rerun. For each number of hidden nodes, H, run 10 or 20 trials to mitigate the effect of random weight initialization.
> 5- When I doubled my dataset, I couldn't get a good result even though I tried changing the number of hidden layers and the number of neurons in them many times. Why? What is a suitable backpropagation training function? Or did I create mynet1 the wrong way? How do I get the best classification result?
See above
> 6- When should I use minmax(P) instead of passing P directly?
Never. That calling syntax is even more obsolete than the obsolete code you are using.
With your version do not include the number of output nodes in the function call. It will be automatically obtained from T.
help newff
doc newff
> I hope to hear from someone ASAP. Thank you in advance.
Hope this helps.
Thank you for formally accepting my answer.
Greg
  1 Comment
aurora on 7 Nov 2012
Edited: aurora on 7 Nov 2012
Thank you very much.
As for trainscg, I tried it and many other training functions, but the best was trainlm.
As for the confusion matrix, I always read the help before I ask, but I'm new to neural network concepts. My question was: do the percentages in the confusion matrix represent the recognition rate, or are there other calculations for evaluating network performance?
I have some more questions. What is the importance of the validation dataset? I know its benefit, but when I disabled data division with mynet1.divideFcn = ''; the training performance improved, so what should I do?
- What do you mean by biased and unbiased in:
MSE00 = mean(var(T',1)) % 0.0622 Biased
MSE00a = mean(var(T')) % 0.0628 Unbiased
- What percentage counts as acceptable training performance? What about 88%? The validation and testing performance are always lower than the training performance, so when evaluating the recognition rate, which values should I use: my high simulation values or my low testing ones?
- The testing dataset must not be used in training, right?
Thank you.
Best Regards


More Answers (1)

Greg Heath on 9 Nov 2012
Asked by aurora on 4 Nov 2012 at 1:30
Latest activity Commented on by aurora on 7 Nov 2012 at 17:35
% As for trainscg, I tried it and many other training functions, but the best was trainlm.
Interesting.
% As for the confusion matrix, I always read the help before I ask, but I'm new to neural network concepts. My question was: do the percentages in the confusion matrix represent the recognition rate, or are there other calculations for evaluating network performance?
The code I gave you outputs the indices of the true class and the assigned class.
That is all the info you need.
Just compare and count the errors for each class and how the errors are distributed among the other classes.
Then compare your calculations with the confusion matrix.
The diagonal components are the correct recognition rates.
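A minimal sketch of that counting step in base MATLAB is below. The index vectors are made-up; in practice they would be the trueclass and assignedclass vectors from the earlier code. The orientation chosen here (true class on rows) is an assumption of this sketch, and the toolbox plot may show the transpose, so check before comparing.

```matlab
% Made-up class indices for 3 classes.
trueclass     = [1 1 1 2 2 2 3 3 3];
assignedclass = [1 1 2 2 2 2 3 1 3];
c = 3;

% C(i,j) = number of samples of true class i assigned to class j.
C = accumarray([trueclass(:) assignedclass(:)], 1, [c c]);

perClassRate = diag(C)' ./ sum(C, 2)';    % fraction of each true class recognized
overallRate  = sum(diag(C)) / sum(C(:));  % fraction of all samples recognized

disp(C)
disp(perClassRate)
disp(overallRate)
```

The off-diagonal entries in each row show how the errors for that class are distributed among the other classes.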
% I have some more questions. What is the importance of the validation dataset?
Use of a validation set during training mitigates the folly of minimizing training set error at the expense of increasing error on nontraining data. Search for Early Stopping and Stopped Training in the comp.ai.neural-nets FAQ and archived posts.
Validation data is also used to rank multiple designs. Ideally, in this case, either the validation data is not used for Early Stopping or the validation data is split into an Early Stopping subset and a ranking subset. Nevertheless, it is not uncommon to see one set used for both.
% I know its benefit, but when I disabled data division with mynet1.divideFcn = ''; the training performance improved, so what should I do?
Plot your training, validation and testing subsets to make sure that they can be considered to come from the same overall probability distribution. Sometimes just comparing means and variances is sufficient. Sometimes a reshuffling of the data makes sense.
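One way to do the mean and variance comparison is sketched below in base MATLAB. The data is a random stand-in with the same shape as the question's input, and the 70/15/15 split is a plain random permutation written for illustration, not the toolbox's own division routine.

```matlab
% Stand-in data with the same shape as the question's input (20 x 105).
rng(0);                          % fixed seed so the split is repeatable
P = randn(20, 105);
N = size(P, 2);

% Random 70/15/15 split of the column indices.
idx    = randperm(N);
nTrain = round(0.70 * N);
nVal   = round(0.15 * N);
trnInd = idx(1:nTrain);
valInd = idx(nTrain+1 : nTrain+nVal);
tstInd = idx(nTrain+nVal+1 : end);

% Per-feature mean and variance in each subset, one column per subset.
mu = [mean(P(:,trnInd), 2), mean(P(:,valInd), 2), mean(P(:,tstInd), 2)];
s2 = [var(P(:,trnInd), 0, 2), var(P(:,valInd), 0, 2), var(P(:,tstInd), 0, 2)];

disp(mu(1:3, :))   % first three features: train | val | test means
disp(s2(1:3, :))   % first three features: train | val | test variances
```

After a real training run, the indices actually used are stored in the training record as tr.trainInd, tr.valInd and tr.testInd, and can replace the three index vectors above.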
Then trust your nontraining data performance estimates.
% - What do you mean by biased and unbiased in:
% MSE00 = mean(var(T',1)) % 0.0622 Biased
% MSE00a = mean(var(T')) % 0.0628 Unbiased
I won't answer that. Instead I'll ask you a question:
When calculating variance, do you divide by N or N-1? Why?
help var
doc var
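The "why" is left as asked, but the two conventions can at least be checked numerically. This is a small sketch in base MATLAB with a made-up sample; varN and varNm1 are illustrative names.

```matlab
x = [2 4 4 4 5 5 7 9];           % made-up sample
N = numel(x);
d2 = (x - mean(x)).^2;           % squared deviations from the sample mean

varN   = sum(d2) / N;            % divide by N
varNm1 = sum(d2) / (N - 1);      % divide by N-1

% var(x,1) uses N; var(x) (same as var(x,0)) uses N-1.
fprintf('%.4f %.4f\n', varN,   var(x, 1));
fprintf('%.4f %.4f\n', varNm1, var(x));
```

The two results differ by the factor N/(N-1), which is why the 0.0622 and 0.0628 figures above are so close for N = 105.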
% - What percentage counts as acceptable training performance? What about 88%?
It depends on the data.
I typically design 100 or more nets using, for example, 10 different choices for number of hidden nodes and 10 different random weight initializations for each value of H.
... Readily accomplished in a double for loop.
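A rough sketch of such a double loop follows. It assumes the Neural Network Toolbox and the NEWFF calling syntax shown earlier in this thread; the data is a random stand-in, the grid of hidden-layer sizes is arbitrary, and valErr and tstErr are hypothetical names. sim is called explicitly so the sketch does not depend on how many outputs train returns in a given release.

```matlab
% Assumes the Neural Network Toolbox and the NEWFF syntax used above.
P = randn(20, 105);                 % stand-in inputs
T = repmat(eye(15), 1, 7);          % stand-in targets
[~, trueclass] = max(T, [], 1);

Hvalues = 2:2:20;                   % 10 candidate hidden-layer sizes (arbitrary)
Ntrials = 10;                       % random initializations per size
valErr  = zeros(numel(Hvalues), Ntrials);
tstErr  = zeros(numel(Hvalues), Ntrials);

rng(0);                             % repeatable sequence of initializations
for i = 1:numel(Hvalues)
    for j = 1:Ntrials
        net = newff(P, T, Hvalues(i), {'tansig' 'softmax'}, 'trainscg');
        net.trainParam.showWindow = false;      % suppress the training GUI
        [net, tr] = train(net, P, T);
        Y = sim(net, P);
        [~, assigned] = max(Y, [], 1);
        wrong = (assigned ~= trueclass);
        valErr(i,j) = mean(wrong(tr.valInd));   % for ranking designs
        tstErr(i,j) = mean(wrong(tr.testInd));  % for the final estimate
    end
end
disp(valErr)
```

Ranking the designs on valErr and reporting tstErr only for the chosen design keeps the test estimate honest.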
% The validation and testing performance are always lower than the training performance, so when evaluating the recognition rate, which values should I use: my high simulation values or my low testing ones?
Training data for weight estimation
Validation data for Early Stopping and/or multiple design ranking
Test data to estimate the performance on unseen data (AKA "Generalization Error").
If you were buying a $100,000 classifier from another company, which error rate would interest you more?
% - The testing dataset must not be used in training, right?
Yes. How else can you get an honest (unbiased) estimate of performance on unseen data?
A common practice is to use 10-fold crossvalidation.
Search the comp.ai.neural-nets (CANN) FAQ and archives for details.
In my posts I typically use the abbreviation XVAL.
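For orientation, the index bookkeeping for 10-fold crossvalidation can be sketched in base MATLAB as below. Only the partitioning is shown; the design and evaluation step inside the loop is left as a placeholder, and the variable names are illustrative.

```matlab
rng(0);                              % repeatable shuffle
N = 105;                             % number of samples, as in the question
K = 10;                              % number of folds

perm = randperm(N);                  % shuffle the sample indices
fold = zeros(1, N);
fold(perm) = mod(0:N-1, K) + 1;      % deal shuffled samples into folds 1..K

for k = 1:K
    tstInd = find(fold == k);        % held out in this round
    trnInd = find(fold ~= k);        % used for design in this round
    % ... design a net on trnInd, evaluate on tstInd, store the error ...
    fprintf('fold %2d: %3d design, %2d held out\n', k, numel(trnInd), numel(tstInd));
end
```

Each sample is held out exactly once, so averaging the K held-out error rates uses all of the data for testing without ever testing on a sample that was used to design that round's net.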
Hope this helps.
Greg
