Patternnet for multi-dimensional classification problem

Hi,
I am trying to develop a neural network to model workplace choices of individuals. I have framed it as a classification problem where I give certain inputs (156 neurons) and I want it to classify my inputs into different zones and different industries at the same time. I have 151 zones and 11 industries to work with.
Given this set of inputs, I want each target column to have a 1 in the position of the sample's zone (out of 151) and 0 elsewhere, and simultaneously a 1 in the position of its industry (out of 11) and 0 elsewhere. However, patternnet does not seem to allow targets in this format.
Is there a way in which such a classification problem can be handled by a single neural net using patternnet? Can the targets be 3D or higher dimensional matrices (in the above case a 151 x 11 x no. of samples)?
I could always look at this as a classification problem with 151*11 classes, but I hate to increase the size of the neural net to such an extent.
In case I cannot handle this using patternnet, what are my other options? Can I use newff and apply hardlim to the network outputs (which would be 151 + 11 in this case)?
If you have any ideas, please answer at your earliest convenience. Any help appreciated!
Thanks in advance.
  1 Comment
Greg Heath on 31 May 2013
The hidden and output nodes are called neurons because their values are the outputs of transfer functions.
The input nodes are not neurons. They are fan-in units.
That is why a net with a single hidden layer is called a two-layer net.
I avoid confusion by referring to "number of hidden layers" instead of "number of layers" and "nodes" instead of "neurons".
Hope this helps.
Greg


Accepted Answer

Greg Heath on 31 May 2013
Use a 162 x N target matrix with two ones in every column.
If patternnet does not allow that, use feedforwardnet with 'trainscg'.
Use logsig instead of softmax in the output.
You can use vec2ind and ind2vec on target(1:11,:) and target(12:162,:).
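For example, a minimal sketch of building that target matrix (zoneIdx and indIdx are hypothetical 1 x N vectors holding the true zone and industry index of each sample):
N = numel(zoneIdx);                                   % number of samples
target = zeros(162, N);
target(sub2ind(size(target), indIdx, 1:N)) = 1;       % rows 1:11   = industry
target(sub2ind(size(target), 11 + zoneIdx, 1:N)) = 1; % rows 12:162 = zone
indHat  = vec2ind(target(1:11,  :));                  % recover industry labels
zoneHat = vec2ind(target(12:162, :));                 % recover zone labels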
Hope this helps.
Thank you for formally accepting my answer
Greg
  1 Comment
Trushna on 31 May 2013
Dear Greg,
Thank you for the help. It seems to work, but I'm now stuck on another issue.
Here is a simple network that I trained using feedforwardnet. My inputs and outputs are of dimension 151 x 40247. It is just a classic pattern classification problem, so we can directly compare the results with patternnet.
The code:
I = load(input);    % input and output are assumed to hold the data file names
O = load(output);
P = transpose(I);   % inputs:  151 x 40247
T = transpose(O);   % targets: 151 x 40247
M = transpose(O);   % untouched copy of the 0/1 targets
% map the hard 0/1 targets to soft 0.1/0.9 targets
for i = 1:151
    for j = 1:40247
        if (T(i,j) == 0)
            T(i,j) = 0.1;
        else
            T(i,j) = 0.9;
        end
    end
end
net = feedforwardnet(30, 'trainscg');
net.trainParam.epochs = 1000;
net.trainParam.max_fail = 6;
net.layers{2}.transferFcn = 'logsig';
net = train(net, P, T);
y = net(P);
perf = perform(net, T, y);
classes = vec2ind(y);           % winning class per column
classes = ind2vec(classes);     % back to a 0/1 matrix
err = abs(M - classes);
toterr = sum(sum(err));         % renamed from "sum" to avoid shadowing the built-in
misc = (toterr/2)/40247;        % each misclassification flips two entries
Somehow the performance when the transfer function for the output layer is 'purelin' is much better than when it is 'logsig'. Is there some error in the code? Why should I get such different results?
Also, is there a specific reason for using 'logsig' in the output layer, as was previously suggested?


More Answers (1)

Greg Heath on 3 Jun 2013
% Thank you for the help. It seems to work, but I'm now stuck on another issue. Here is a simple network that I trained using feedforwardnet. My inputs and outputs are of dimension 151 x 40247. It is just a classic pattern classification problem, so we can directly compare the results with patternnet.
Patternnet calls feedforwardnet. The only differences are the default training functions (trainscg vs trainlm) and the plot functions. Therefore, this will prove nothing.
For the record, fitnet also calls feedforwardnet. The only difference is that fitnet has an extra plot. Therefore there is no reason to ever use feedforwardnet.
(Similar story for the obsolete functions newfit, newpr, and newff.)
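A quick way to see these differences for yourself (a sketch; the queried fields are standard network properties):
net1 = patternnet(10);           % default trainFcn: 'trainscg'
net2 = feedforwardnet(10);       % default trainFcn: 'trainlm'
disp(net1.trainFcn)
disp(net2.trainFcn)
disp(net1.plotFcns)              % includes pattern-recognition plots such as 'plotconfusion'
disp(net2.plotFcns)              % generic plots only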
% for i = 1:151
%     for j = 1:40247
%         if (T(i,j) == 0)
%             T(i,j) = 0.1;
%         else
%             T(i,j) = 0.9;
%         end
%     end
% end
Delete the (0,1) ==> (0.1,0.9) loop. It is totally unnecessary and leads to inferior results. In addition, the outputs will not be valid estimates of class posterior probabilities.
% net = feedforwardnet(30, 'trainscg');
This makes it equivalent to patternnet except for the plot functions.
Where did H = 30 come from? For robustness and better generalization, it is better to try to minimize H. Use a double loop as in the many codes that I have posted; search using greg Ntrials.
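A minimal sketch of such a double loop (the search bounds and trial count here are illustrative, not Greg's exact values):
Hmin = 1; dH = 5; Hmax = 31;           % candidate hidden-layer sizes
Ntrials = 10;                          % random weight initializations per size
bestPctErr = Inf;
for h = Hmin:dH:Hmax
    for trial = 1:Ntrials
        net = feedforwardnet(h, 'trainscg');   % fresh net => new random weights
        [net, tr] = train(net, P, T);
        Y = net(P);
        pctErr = 100*sum(vec2ind(Y) ~= vec2ind(T))/size(T,2);
        if pctErr < bestPctErr
            bestPctErr = pctErr; bestNet = net; bestH = h;
        end
    end
end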
% net.trainParam.epochs = 1000;
% net.trainParam.max_fail = 6;
Delete the above two commands; these are the defaults.
% net.layers{2}.transferFcn = 'logsig';
Either delete this line or remove the default {-1,1} mapminmax output transformation.
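For example, to remove all output processing (a sketch; note this also drops removeconstantrows):
net.outputs{2}.processFcns = {};   % logsig outputs in (0,1) now compared directly to the 0/1 targets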
% net = train(net,P,T);
% y = net(P);
% perf = perform(net,T,y);
% classes = vec2ind(y);
% classes = ind2vec(classes);
% err = abs(M - classes);
% toterr = sum(sum(err));
% misc = (toterr/2)/40247;
Always better to have the details of the training record tr:
[net, tr, Y] = train(net, P, T);
To see what goodies are in tr, type
tr = tr
MSE = mse(T - Y)
N = size(T,2)                     % number of samples (40247 here)
trueclass = vec2ind(T);
assignedclass = vec2ind(Y);
err = assignedclass ~= trueclass;
Nerr = sum(err)
PctErr = 100*Nerr/N
% Somehow the performance when the transfer function for the output layer is 'purelin' is much better than when it is 'logsig'. Is there some error in the code? Why should I get such different results?
You used the default {-1,1} mapminmax output transformation. The reverse mapping sends the logsig range (0,1) to only the upper half of your 0/1 target range, so the 0 targets are unreachable; purelin has no such restriction, which is why it appears to do better.
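You can verify this with mapminmax directly (a small sketch):
[~, ps] = mapminmax([0 1]);               % settings that map targets 0/1 onto [-1,1]
mapminmax('reverse', [0.01 0.99], ps)     % ans ~ [0.505 0.995]: only the upper half of [0,1] is reachable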
% Also, is there a specific reason for using 'logsig' in the output layer, as was previously suggested?
In older versions of newff there were no default normalizations. I obtained consistently good posterior probability estimates with softmax when the classes were mutually exclusive (O = 151) and with logsig otherwise (O = 151 + 11). The probability estimates provide the basis for confidence bounds. Search the newsgroup and comp.ai.neural-nets using greg softmax or heath softmax.
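In code, the two choices look like this (a sketch, using the combined 162-row target from the accepted answer):
% non-mutually-exclusive targets (162 = 151 + 11 rows, two ones per column)
net = feedforwardnet(30, 'trainscg');
net.layers{2}.transferFcn = 'logsig';
net.outputs{2}.processFcns = {};       % outputs stay in (0,1) as probability estimates

% mutually exclusive targets (151 rows, one 1 per column)
net = feedforwardnet(30, 'trainscg');
net.layers{2}.transferFcn = 'softmax';
net.outputs{2}.processFcns = {};       % outputs form a probability distribution over classes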
