Analyzing results and output plots of neural network

Question


Hi,

I'm new to neural networks and need help. The input to my simple network consists of 15 classes, each with 7 samples, i.e. 15 x 7 = 105 column vectors, each with 20 elements.

[R,Q1] = size(P);  % [20 105]
[SN,Q2] = size(T); % [15 105]
if Q1 ~= Q2 
    error('Training:invalidTrainingAndDesired', ...
          'The number of input vectors and desired output do not match');
end
mynet1 = newff(P,T, [20 15], {'tansig' 'tansig' }, 'trainlm');
mynet1.trainParam.epochs = 5000;
mynet1.trainParam.goal        = 0.01; %*mean(var(T'))/100 ;
mynet1.performFcn ='mse';
mynet1.trainParam.lr          = 0.01;
mynet1.divideFcn = 'dividerand';        %# how to divide data
mynet1.divideParam.trainRatio = 70/100; %# training set
mynet1.divideParam.valRatio = 15/100;   %# validation set
mynet1.divideParam.testRatio = 15/100;  %# testing set
mynet1.trainParam.show        = 100;
mynet1.trainParam.mc          = 0.95;
[mynet1,tr,Y,E] = train(mynet1,P,T);
y = sim(mynet1,P);
plotconfusion(T,y);
output =sim(mynet1,P(:,24));
output

1- I want to understand some of the plots generated by the neural network, such as the following image of the regression plot. I think there is something wrong. Can I tell that from this plot? Why do the data points in the plot look like that? Is this normal, or what does it indicate?

2- As for the confusion matrix that is also generated by the network, do the percentages in the last row and last column show the recognition rate? What else can show the recognition rate?

3- I can't get the result class from sim. The output vector should contain 0's, with a 1 only in the target class, but the values of the output vector returned by the sim function for mynet1 are real numbers because of the tansig transfer function applied to my output. How can I convert the output back to the format I specified at the beginning to get the test result of the network?

4- Why can mynet1 never reach the goal performance (0.01)? It always reaches around 0.05.

5- When I doubled my dataset, I couldn't get a good result, although I tried changing the number of hidden layers and their neuron counts many times. Why? What is a suitable backpropagation training function? Or did I create mynet1 the wrong way? How can I get the best classification result?

6- When should I use minmax(P) instead of passing P directly?

I hope to hear from someone ASAP. Thank you in advance.


Answer 1

Greg Heath on 5 Nov 2012

Edited: Greg Heath on 5 Nov 2012


% Arbitrary data for checking syntax

P = randn(20,105);

T = repmat(eye(15),1,7);

% NAIVE CONSTANT MODEL MSES FOR REFERENCE

MSE00 = mean(var(T',1)) % 0.0622 Biased

MSE00a = mean(var(T')) % 0.0628 Unbiased

> mynet1 = newff(P,T, [20 15], {'tansig' 'tansig' }, 'trainlm');

No.

For pattern recognition or classification with c classes it is better to

1. Use columns of eye(c) as targets.

2. Use SOFTMAX (or LOGSIG) as the output activation function

3. Use TRAINSCG as the training function

You can use NEWPR (Pattern Recognition). NEWPR calls NEWFF with training function TRAINSCG and classification specific plot functions. However, it uses TANSIG as the output activation instead of SOFTMAX OR LOGSIG. Therefore you can use

net = newpr(P,T, 20);

net.layers{net.numLayers}.transferFcn = 'softmax'

However, with the same RNG setting, you should be able to get the exact same net using

net = newff(P,T, 20, {'tansig' 'softmax' }, 'trainscg');

net.plotFcns = {'plotperform','plottrainstate','plotconfusion','plotroc'};

> mynet1.trainParam.goal = 0.01; %*mean(var(T'))/100 ;

No.

T = repmat(eye(15),1,7); % Example

mean(var(T'))/100 % = 6.2821e-004
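To make the reference value concrete, here is a minimal sketch in base MATLAB that computes the naive constant-model MSE directly from its definition and derives a goal from it. The variable names (y00, goal, NMSE, R2) and the use of R^2 = 1 - MSE/MSE00 as a normalized score are assumptions made for illustration; the 0.05 figure is the performance reported in the question.

```matlab
% Example targets: 15 classes, 7 samples each (same example as above)
T = repmat(eye(15), 1, 7);            % size [15 105]
N = size(T, 2);

% The naive constant model outputs the row mean of T for every input,
% so its MSE is the average biased variance of the target rows.
y00    = repmat(mean(T, 2), 1, N);    % constant output for every sample
MSE00  = mean((T(:) - y00(:)).^2);    % same value as mean(var(T',1))
MSE00a = MSE00 * N / (N - 1);         % same value as mean(var(T'))

% A goal of 1% of the reference corresponds to R^2 = 0.99
goal = MSE00a / 100;

% Normalized MSE and R^2 for the MSE reported in the question
NMSE = 0.05 / MSE00;
R2   = 1 - NMSE;
fprintf('MSE00 = %.4f  goal = %.4e  NMSE = %.2f  R2 = %.2f\n', ...
        MSE00, goal, NMSE, R2);
```

Read this way, an MSE of about 0.05 gives R2 of roughly 0.20, so it is only modestly better than a model that ignores the inputs. That suggests comparing trainParam.goal against the reference MSE rather than picking a fixed number such as 0.01.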

> mynet1.trainParam.epochs = 5000;
> mynet1.performFcn ='mse';
> mynet1.trainParam.lr = 0.01;
> mynet1.divideFcn = 'dividerand'; %# how to divide data
> mynet1.divideParam.trainRatio = 70/100; %# training set
> mynet1.divideParam.valRatio = 15/100; %# validation set
> mynet1.divideParam.testRatio = 15/100; %# testing set
> mynet1.trainParam.show = 100;
> mynet1.trainParam.mc = 0.95;

Why not simplify life and just use the defaults?

> [mynet1,tr,Y,E] = train(mynet1,P,T);

> y = sim(mynet1,P);

The last calculation is unnecessary because y is the same as Y, and E = T-Y.

trueclass = vec2ind(T);

assignedclass = vec2ind(y);

Now you can calculate error rates of all classes and the overall error rate.
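As a rough illustration of that calculation, the sketch below uses a small made-up example (3 classes, 6 samples, not the poster's data). It uses max instead of vec2ind so that it runs in base MATLAB; picking the row of the largest output is assumed here to be the intended class assignment. It also shows one way to convert the real-valued outputs back to 0/1 columns.

```matlab
% Made-up targets (one-hot columns) and real-valued network outputs
T = [1 0 0 1 0 0;
     0 1 0 0 1 0;
     0 0 1 0 0 1];
y = [0.8 0.2 0.1 0.3 0.1 0.2;
     0.1 0.7 0.3 0.6 0.8 0.1;
     0.1 0.1 0.6 0.1 0.1 0.7];
[c, N] = size(T);

% Row index of the largest entry in each column
[~, trueclass]     = max(T, [], 1);
[~, assignedclass] = max(y, [], 1);

% Overall and per-class error rates
errors     = (assignedclass ~= trueclass);
overallErr = mean(errors);
classErr   = zeros(1, c);
for k = 1:c
    classErr(k) = mean(errors(trueclass == k));
end

% Outputs converted back to 0/1 columns (a 1 only in the assigned class)
yhard = zeros(c, N);
yhard(sub2ind([c N], assignedclass, 1:N)) = 1;
```

In this example only the fourth sample is misclassified (a class 1 sample assigned to class 2), so the overall error rate is 1/6 and the class 1 error rate is 1/2.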

> plotconfusion(T,y);

> output =sim(mynet1,P(:,24));

> output

> 1- I want to understand some of the plots generated by the neural network,
> such as the following image of the regression plot. I think there is
> something wrong. Can I tell that from this plot? Why do the data points
> in the plot look like that? Is this normal, or what does it indicate?

Those plots are more appropriate for regression or curve-fitting (e.g., NEWFIT). They do not yield much info for classification. See my code above.

> 2- As for the confusion matrix that is also generated by the network, do
> the percentages in the last row and last column show the recognition
> rate? What else can show the recognition rate?

See the following and run the examples

help confusion

doc confusion

help plotconfusion

doc plotconfusion

> 3- I can't get the result class from sim. The output vector should contain
> 0's, with a 1 only in the target class, but the values of the output vector
> returned by the sim function for mynet1 are real numbers because of the
> tansig transfer function applied to my output. How can I convert the output
> back to the format I specified at the beginning to get the test result of
> the network?

See my code above

> 4- Why can mynet1 never reach the goal performance (0.01)? It always
> reaches around 0.05.

Make changes and rerun. For each value of hidden nodes, H, run 10 or 20 trials to mitigate the random weight initialization.

> 5- When I doubled my dataset, I couldn't get a good result, although I tried
> changing the number of hidden layers and their neuron counts many times. Why?
> What is a suitable backpropagation training function? Or did I create mynet1
> the wrong way? How can I get the best classification result?

See above

> 6- When should I use minmax(P) instead of passing P directly?

Never. That syntax is even more obsolete than the obsolete code you are using.

With your version do not include the number of output nodes in the function call. It will be automatically obtained from T.

help newff

doc newff

> I hope to hear from someone ASAP . Thank you in advance

Hope this helps.

Thank you for formally accepting my answer.

Greg

1 Comment

aurora on 7 Nov 2012

Edited: aurora on 7 Nov 2012

Thank you very much.

As for trainscg, I tried it and many other training functions, but the best was trainlm.

As for the confusion matrix, I always read the help before I ask, but I'm new to neural network concepts. My question was: do the percentages in the confusion matrix represent the recognition rate, or are there other calculations for evaluating the neural network's performance?

I have some more questions. What is the importance of the validation dataset? I know its benefit, but when I disabled it with mynet1.divideFcn = ''; the training performance increased, so what should I do?

- What do you mean by biased & unbiased in:

MSE00 = mean(var(T',1)) % 0.0622 Biased

MSE00a = mean(var(T')) % 0.0628 Unbiased

- What percentage is an acceptable training performance? What about 88%? The validation and testing performance are always lower than the training performance, so when evaluating the recognition rate, which values should I use: my high simulation values or my low testing ones?

- The testing dataset must be untrained, right?

Thank you

Best Regards


Answer 2

Greg Heath on 9 Nov 2012



Asked by aurora on 4 Nov 2012 at 1:30

Latest activity Commented on by aurora on 7 Nov 2012 at 17:35

% As for trainscg, I tried it and many other training functions, but the best
% was trainlm.

Interesting.

% As for the confusion matrix, I always read the help before I ask, but I'm
% new to neural network concepts. My question was: do the percentages in
% the confusion matrix represent the recognition rate, or are there other
% calculations for evaluating the neural network's performance?

The code I gave you outputs the indices of the true class and the assigned class.

That is all the info you need.

Just compare and count the errors for each class and how the errors are distributed among the other classes.

Then compare your calculations with the confusion matrix.

The diagonal components are correct recognition rates.
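The counting described above can be sketched in base MATLAB as follows. The class indices are made up for illustration, and the orientation chosen here (rows = true class, columns = assigned class) is an assumption of this sketch; it may be transposed relative to the plotted matrix, so check the axis labels when comparing.

```matlab
% Made-up class indices for 9 samples and 3 classes
trueclass     = [1 2 3 1 2 3 1 2 3];
assignedclass = [1 2 3 2 2 3 1 3 3];
c = 3;

% C(i,j) = number of samples of true class i assigned to class j
C = accumarray([trueclass(:) assignedclass(:)], 1, [c c]);

% Per-class correct recognition rate: diagonal count / class size
perClassRate = diag(C)' ./ sum(C, 2)';

% Overall recognition rate: all diagonal counts / all samples
overallRate = sum(diag(C)) / sum(C(:));

disp(C); disp(perClassRate); disp(overallRate);
```

Here the off-diagonal entries show where the errors go: one class 1 sample is assigned to class 2 and one class 2 sample to class 3, giving per-class rates of 2/3, 2/3 and 1, and an overall rate of 7/9.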

% I've some more question, what's the importance of validation dataset ,

Use of a validation set during training mitigates the folly of minimizing training set error at the expense of increasing error on nontraining data. Search for Early Stopping and Stopped Training in the comp.ai.neural-nets FAQ and archived posts.

Validation data is also used to rank multiple designs. Ideally, in this case, either the validation data is not used for Early Stopping or the validation data is split into an Early Stopping subset and a ranking subset. Nevertheless, it is not uncommon to see one set used for both.

% I know its benefit but when I disabled the mynet1.divideFcn = ''; the % training performance increases , so what should I do ?

Plot your training, validation and testing subsets to make sure that they can be considered to come from the same overall probability distribution. Sometimes just comparing means and variances is sufficient. Sometimes a reshuffling of the data makes sense.

Then trust your nontraining data performance estimates.
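A minimal sketch of the mean and variance comparison, in base MATLAB, is given below. The input matrix is made up, and the 70/15/15 split is done with randperm as a stand-in for what dividerand does; if your toolbox version's training record tr exposes the subset indices, those could be used in place of trnInd, valInd and tstInd (an assumption to verify against your version's documentation).

```matlab
% Made-up inputs: 2 features, 105 samples (replace with your own P)
P = [linspace(0, 1, 105); linspace(-1, 1, 105).^2];
N = size(P, 2);

% Random 70/15/15 split of the sample indices
idx  = randperm(N);
nTrn = round(0.70 * N);
nVal = round(0.15 * N);
trnInd = idx(1:nTrn);
valInd = idx(nTrn+1 : nTrn+nVal);
tstInd = idx(nTrn+nVal+1 : end);

% Per-feature mean and variance of each subset, side by side
stats = @(X) [mean(X, 2), var(X, 0, 2)];
disp('train [mean var]:');      disp(stats(P(:, trnInd)));
disp('validation [mean var]:'); disp(stats(P(:, valInd)));
disp('test [mean var]:');       disp(stats(P(:, tstInd)));
```

If one subset's statistics differ markedly from the others, rerunning the split with a new random permutation is one way to reshuffle. With only 7 samples per class, it may also be worth checking that every class appears in each subset.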

% - What do you mean by biased & unbiased in :

% MSE00 = mean(var(T',1)) % 0.0622 Biased
% MSE00a = mean(var(T')) % 0.0628 Unbiased

I won't answer that. Instead I'll ask you a question:

When calculating variance, do you divide by N or N-1? Why?

help var

doc var

% - What percentage is an acceptable training performance? What about 88%?

It depends on the data.

I typically design 100 or more nets using, for example, 10 different choices for number of hidden nodes and 10 different random weight initializations for each value of H.

... Readily accomplished in a double for loop.
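The structure of such a double loop might look like the sketch below. The scoring function scoreDesign is a hypothetical placeholder, not a toolbox function: in practice its body would create a net with H hidden nodes, train it from a fresh random initialization, and return the validation-set error. The candidate values in Hvec are also an arbitrary choice for illustration.

```matlab
Hvec    = 2:2:20;     % 10 candidate numbers of hidden nodes (arbitrary)
Ntrials = 10;         % random weight initializations per value of H

% HYPOTHETICAL placeholder so the loop runs on its own. Replace with code
% that designs and trains one net and returns its validation error.
scoreDesign = @(H, trial) 1/H + 0.01*mod(trial, 3);

valErr = zeros(numel(Hvec), Ntrials);
for i = 1:numel(Hvec)
    for j = 1:Ntrials
        valErr(i, j) = scoreDesign(Hvec(i), j);
    end
end

% Best trial for each H, then the best H overall (ranked on validation error)
[bestPerH, bestTrial] = min(valErr, [], 2);
[bestErr, iBest]      = min(bestPerH);
bestH = Hvec(iBest);
fprintf('best H = %d (trial %d), validation error = %.4f\n', ...
        bestH, bestTrial(iBest), bestErr);
```

Keeping the full valErr table, rather than only the winner, makes it easy to see how much of the spread comes from H and how much from the random initialization.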

% The validation and testing performance are always lower than the training
% performance, so when evaluating the recognition rate, which values should
% I use: my high simulation values or my low testing ones?

Training data for weight estimation

Validation data for Early Stopping and/or multiple design ranking

Test data to estimate the performance on unseen data (AKA "Generalization Error").

If you were buying a $100,000 classifier from another company, which error rate would interest you more?

% - Testing dataset must be untrained , right ?

Yes. How else can you get an honest (unbiased) estimate of performance on unseen data?

A common practice is to use 10-fold crossvalidation.

Search in the CANN FAQ and archives for details.

In my posts I typically use the abbreviation XVAL.
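For orientation, the index bookkeeping for 10-fold crossvalidation could be sketched in base MATLAB as below. Only the partitioning is shown; the comment inside the loop marks where a net would be designed and evaluated. The variable names are illustrative assumptions, and this simple version is not stratified by class.

```matlab
N = 105;              % number of samples, as in the question
K = 10;               % number of folds

idx  = randperm(N);            % shuffle the sample indices once
fold = mod(0:N-1, K) + 1;      % fold label 1..K for each shuffled position

foldSize = zeros(1, K);
for k = 1:K
    tstInd = idx(fold == k);   % held-out samples for this fold
    trnInd = idx(fold ~= k);   % remaining samples for design
    % ... design a net on trnInd, evaluate it on tstInd, store the error ...
    foldSize(k) = numel(tstInd);
end
disp(foldSize);
```

Each sample is held out exactly once, so averaging the K held-out error rates uses all the data for testing without ever testing on a sample that was used to design that fold's net. With 7 samples per class and 10 folds, some folds will necessarily contain no samples of a given class, which is worth keeping in mind when interpreting per-class rates.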

Hope this helps.

Greg
