Patternnet for multi-dimensional classification problem

Hi,
I am trying to develop a neural network to model workplace choices of individuals. I have framed it as a classification problem where I give certain inputs (156 neurons) and I want it to classify my inputs into different zones and different industries at the same time. I have 151 zones and 11 industries to work with.
Given this set of inputs, I want each target column to have a 1 in the position of the sample's zone (out of 151) and 0 elsewhere, and simultaneously a 1 in the position of its industry (out of 11) and 0 elsewhere. However, patternnet does not seem to allow targets in this format.
Is there a way in which such a classification problem can be handled by a single neural net using patternnet? Can the targets be 3D or higher dimensional matrices (in the above case a 151 x 11 x no. of samples)?
I could always look at this as a classification problem with 151*11 classes, but I hate to increase the size of the neural net to such an extent.
In case I cannot handle this using patternnet, what are my other options? Can I use newff and apply hardlim to the network outputs (which would be 151 + 11 in this case)?
If you have any ideas, please answer at your earliest convenience. Any help appreciated!
Thanks in advance.
  1 Comment
Greg Heath on 31 May 2013
The hidden and output nodes are called neurons because their values are the outputs of transfer functions.
The input nodes are not neurons. They are fan-in units.
That is why a net with a single hidden layer is called a two-layer net.
I avoid confusion by referring to "number of hidden layers" instead of "number of layers" and "nodes" instead of "neurons".
Hope this helps.
Greg


Accepted Answer

Greg Heath on 31 May 2013
Use a 162 x N target matrix with two ones in every column.
If patternnet does not allow that, use feedforwardnet with 'trainscg'.
Use logsig instead of softmax in the output.
You can use vec2ind and ind2vec on target(1:11,:) and target(12:162,:).
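For example, a minimal sketch of building that target matrix (zoneIdx and indIdx are hypothetical 1 x N vectors holding the true zone and industry index of each sample):
N = numel(zoneIdx);                                   % number of samples
target = zeros(162, N);
target(sub2ind(size(target), indIdx, 1:N)) = 1;       % rows 1:11   = industry
target(sub2ind(size(target), 11 + zoneIdx, 1:N)) = 1; % rows 12:162 = zone
indHat  = vec2ind(target(1:11,  :));                  % recover industry labels
zoneHat = vec2ind(target(12:162, :));                 % recover zone labels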
Hope this helps.
Thank you for formally accepting my answer
Greg
  1 Comment
Trushna on 31 May 2013
Dear Greg,
Thank you for the help. It seems to work, but I'm now stuck on another issue.
Here is a simple network that I trained using feedforwardnet. My inputs and outputs are of dimension 151 x 40247. It is just a classic pattern classification problem, so we can directly compare the results with patternnet.
The code:
I = load(input);    % input and output are assumed to hold the data file names
O = load(output);
P = transpose(I);   % inputs:  151 x 40247
T = transpose(O);   % targets: 151 x 40247
M = transpose(O);   % untouched copy of the 0/1 targets
% map the hard 0/1 targets to soft 0.1/0.9 targets
for i = 1:151
    for j = 1:40247
        if (T(i,j) == 0)
            T(i,j) = 0.1;
        else
            T(i,j) = 0.9;
        end
    end
end
net = feedforwardnet(30, 'trainscg');
net.trainParam.epochs = 1000;
net.trainParam.max_fail = 6;
net.layers{2}.transferFcn = 'logsig';
net = train(net, P, T);
y = net(P);
perf = perform(net, T, y);
classes = vec2ind(y);           % winning class per column
classes = ind2vec(classes);     % back to a 0/1 matrix
err = abs(M - classes);
toterr = sum(sum(err));         % renamed from "sum" to avoid shadowing the built-in
misc = (toterr/2)/40247;        % each misclassification flips two entries
Somehow the performance when the transfer function for the output layer is 'purelin' is much better than when it is 'logsig'. Is there some error in the code? Why should I get such different results?
Also, is there a specific reason for using 'logsig' in the output layer, as was previously suggested?


More Answers (1)

Greg Heath on 3 Jun 2013
% Thank you for the help. It seems to work, but I'm now stuck on another issue. Here is a simple network that I trained using feedforwardnet. My inputs and outputs are of dimension 151 x 40247. It is just a classic pattern classification problem, so we can directly compare the results with patternnet.
Patternnet calls feedforwardnet. The only differences are the default training functions (trainscg vs trainlm) and the plot functions. Therefore, this will prove nothing.
For the record, fitnet also calls feedforwardnet. The only difference is that fitnet has an extra plot. Therefore there is no reason to ever use feedforwardnet.
(Similar story for the obsolete functions newfit, newpr, and newff.)
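A quick way to see these differences for yourself (a sketch; the queried fields are standard network properties):
net1 = patternnet(10);           % default trainFcn: 'trainscg'
net2 = feedforwardnet(10);       % default trainFcn: 'trainlm'
disp(net1.trainFcn)
disp(net2.trainFcn)
disp(net1.plotFcns)              % includes pattern-recognition plots such as 'plotconfusion'
disp(net2.plotFcns)              % generic plots only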
% for i = 1:151
%     for j = 1:40247
%         if (T(i,j) == 0)
%             T(i,j) = 0.1;
%         else
%             T(i,j) = 0.9;
%         end
%     end
% end
Delete the (0,1) ==> (0.1,0.9) loop. It is totally unnecessary and leads to inferior results. In addition, the outputs will not be valid estimates of class posterior probabilities.
% net = feedforwardnet(30, 'trainscg');
This makes it equivalent to patternnet except for the plot functions.
Where did H = 30 come from? For robustness and better generalization, it is better to try to minimize H. Use a double loop as in the many codes that I have posted; search using greg Ntrials.
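A minimal sketch of such a double loop (the search bounds and trial count here are illustrative, not Greg's exact values):
Hmin = 1; dH = 5; Hmax = 31;           % candidate hidden-layer sizes
Ntrials = 10;                          % random weight initializations per size
bestPctErr = Inf;
for h = Hmin:dH:Hmax
    for trial = 1:Ntrials
        net = feedforwardnet(h, 'trainscg');   % fresh net => new random weights
        [net, tr] = train(net, P, T);
        Y = net(P);
        pctErr = 100*sum(vec2ind(Y) ~= vec2ind(T))/size(T,2);
        if pctErr < bestPctErr
            bestPctErr = pctErr; bestNet = net; bestH = h;
        end
    end
end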
% net.trainParam.epochs = 1000;
% net.trainParam.max_fail = 6;
Delete the above two commands; these are the defaults.
% net.layers{2}.transferFcn = 'logsig';
Either delete this line or remove the default {-1,1} mapminmax output transformation.
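For example, to remove all output processing (a sketch; note this also drops removeconstantrows):
net.outputs{2}.processFcns = {};   % logsig outputs in (0,1) now compared directly to the 0/1 targets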
% net = train(net,P,T);
% y = net(P);
% perf = perform(net,T,y);
% classes = vec2ind(y);
% classes = ind2vec(classes);
% err = abs(M - classes);
% toterr = sum(sum(err));
% misc = (toterr/2)/40247;
Always better to have the details of the training record tr:
[net, tr, Y] = train(net, P, T);
To see what goodies are in tr, type
tr = tr
MSE = mse(T - Y)
N = size(T,2)                     % number of samples (40247 here)
trueclass = vec2ind(T);
assignedclass = vec2ind(Y);
err = assignedclass ~= trueclass;
Nerr = sum(err)
PctErr = 100*Nerr/N
% Somehow the performance when the transfer function for the output layer is 'purelin' is much better than when it is 'logsig'. Is there some error in the code? Why should I get such different results?
You used the default {-1,1} mapminmax output transformation. The reverse mapping sends the logsig range (0,1) to only the upper half of your 0/1 target range, so the 0 targets are unreachable; purelin has no such restriction, which is why it appears to do better.
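You can verify this with mapminmax directly (a small sketch):
[~, ps] = mapminmax([0 1]);               % settings that map targets 0/1 onto [-1,1]
mapminmax('reverse', [0.01 0.99], ps)     % ans ~ [0.505 0.995]: only the upper half of [0,1] is reachable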
% Also, is there a specific reason for using 'logsig' in the output layer, as was previously suggested?
In older versions of newff there were no default normalizations. I obtained consistently good posterior probability estimates with softmax when the classes were mutually exclusive (O = 151) and with logsig otherwise (O = 151 + 11). The probability estimates provide the basis for confidence bounds. Search the newsgroup and comp.ai.neural-nets using greg softmax or heath softmax.
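In code, the two choices look like this (a sketch, using the combined 162-row target from the accepted answer):
% non-mutually-exclusive targets (162 = 151 + 11 rows, two ones per column)
net = feedforwardnet(30, 'trainscg');
net.layers{2}.transferFcn = 'logsig';
net.outputs{2}.processFcns = {};       % outputs stay in (0,1) as probability estimates

% mutually exclusive targets (151 rows, one 1 per column)
net = feedforwardnet(30, 'trainscg');
net.layers{2}.transferFcn = 'softmax';
net.outputs{2}.processFcns = {};       % outputs form a probability distribution over classes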
