Arrays have incompatible sizes for this operation.

I get the error "Arrays have incompatible sizes for this operation." at the line
n2 = W2'.*a1 + b2; a2 = sig2(n2);
My code:
for epoch = 1:N_ep
    mixup = randperm(N_train);
    for j = 1:N_train
        i = mixup(j);
        % get X_train(:,i) as an input to the network
        a1 = X_train(:,i);
        % forward prop to the next layer, activate it, repeat
        n2 = W2'.*a1 + b2; a2 = sig2(n2);
        n3 = W3'.*a2 + b3; a3 = sig3(n3);
        n4 = W4'.*a3 + b4; a4 = sig4(n4);
        n5 = W5'.*a4 + b5; a5 = sig5(n5);
        % this is then the output
        y = a5;
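The elementwise product .* is what triggers this error: W2' is 24-by-784 while a1 is 784-by-1, and those shapes cannot broadcast against each other. The net input of a layer needs the matrix product instead. A minimal sketch of the fix, assuming the data really has 784 pixel rows and that the layer widths are made consistent (the posted W3 is 26-by-37, but layer 2 produces 24 activations):

W2 = 0.5-rand(784,24); b2 = zeros(24,1);
W3 = 0.5-rand(24,37);  b3 = zeros(37,1);  % 24 rows (not 26) to match layer 2
W4 = 0.5-rand(37,31);  b4 = zeros(31,1);
W5 = 0.5-rand(31,10);  b5 = zeros(10,1);
sig2 = @(x) 1./(1+exp(-x));               % same sigmoid as in the posted code
a1 = rand(784,1);                         % stand-in for one column X_train(:,i)
n2 = W2'*a1 + b2; a2 = sig2(n2);          % matrix product *, not elementwise .*
% n2 is 24x1, so the chain continues cleanly: W3'*a2 is 37x1, and so on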
  4 Comments
Nick on 8 Aug 2022
function [Yn On Yt Ot wt] = demo(lr, N_ep, tsf, af, pichoice)
% lr is the learning rate, N_ep the number of epochs, tsf training size fraction
% af = 0 for sigmoid code, af not zero for ReLU code
% pichoice = 0 (or 1) for TSE (or XE)
% Y/O are the exact/predicted labels/targets (n=train, t=test); wt is test success
% MNIST CSV files must not be altered, and must be in the same folder as the MATLAB code.
set(0,'DefaultLineLineWidth', 2);
set(0,'DefaultLineMarkerSize', 10);
% if or(N_ep <= 0, lr <= 0 , tsf <= 0 )
% error('N_ep, lr, and/or tsf are not valid')
% end
% As a professional touch we should test the validity of our input
if or(N_ep <= 0, lr <= 0)
error('N_ep and/or lr are not valid')
end
if tsf <= 0
error('tsf choice is not valid')
end
if ~ismember(pichoice,[0,1])
error('performance index choice is not valid')
end
A_1 = readmatrix('MNIST_train_1000.csv');
% convert and NORMALIZE it into training inputs and target outputs
X_train = A_1(:,2:end)'/255; % beware - transpose, data is in columns!
N_train = size(X_train,2); % size(X_train,1/2) gives number of rows/columns
Y_train = zeros(10,N_train);
% set up the one-hot encoding - note that we have to increment by 1
for i=1:N_train
Y_train(1+A_1(i,1),i) = 1;
end
% default variables
D = 36;
Ni = 784; % number of input nodes
No = 10; % number of output nodes
% set up weights and biases
W2 = 0.5-rand(784,24);b2 = zeros(24,1);
W3 = 0.5-rand(26,37); b3 = zeros(37,1);
W4 = 0.5-rand(37,31); b4 = zeros(31,1);
W5 = 0.5-rand(31,10); b5 = zeros(10,1);
% set up a sigmoid activation function for layers 2 and 3
if af == 0
sig2 = @(x) 1./(1+exp(-x));
dsig2 = @(x) exp(-x)./((1+exp(-x)).^2);
sig3 = @(x) 1./(1+exp(-x));
dsig3 = @(x) exp(-x)./((1+exp(-x)).^2);
sig4 = @(x) 1./(1+exp(-x));
dsig4 = @(x) exp(-x)./((1+exp(-x)).^2);
elseif af == 1
sig2 = @(x) max(0,x);
dsig2 = @(x) 0*(x<0)+1*(x>=0);
sig3 = @(x) max(0,x);
dsig3 = @(x) 0*(x<0)+1*(x>=0);
sig4 = @(x) max(0,x);
dsig4 = @(x) 0*(x<0)+1*(x>=0);
else
error('af has improper value')
end
if and(pichoice == 0, af == 0) % TSE: sigmoid output layer
sig5 = @(x) 1./(1+exp(-x));
dsig5 = @(x) exp(-x)./((1+exp(-x)).^2);
elseif and(pichoice == 0, af == 1)
sig5 = @(x) 1./(1+exp(-x));
dsig5 = @(x) exp(-x)./((1+exp(-x)).^2);
elseif and(pichoice == 1, af == 0) % XE: softmax output layer
sig5 = @(x) exp(x)/sum(exp(x));
elseif and(pichoice == 1, af == 1)
sig5 = @(x) exp(x)/sum(exp(x));
end
% we'll calculate the performance index at the end of each epoch
pivec = zeros(1,N_ep); % row vector
% we now train by looping N_ep times through the training set
for epoch = 1:N_ep
mixup = randperm(N_train);
for j = 1:N_train
i = mixup(j);
% get X_train(:,i) as an input to the network
a1 = X_train(:,i);
% forward prop to the next layer, activate it, repeat
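% NOTE: the next line is the one that throws "Arrays have incompatible sizes":
% W2' is 24x784 while a1 is 784x1, and elementwise .* cannot broadcast those
% shapes; the intended net input is the matrix product, n2 = W2'*a1 + b2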
n2 = W2'.*a1 + b2; a2 = sig2(n2);
n3 = W3'.*a2 + b3; a3 = sig3(n3);
n4 = W4'.*a3 + b4; a4 = sig4(n4);
n5 = W5'.*a4 + b5; a5 = sig5(n5);
% this is then the output
y = a5;
% calculate A, the diagonal matrices of activation derivatives
A2 = diag(dsig2(n2));
A3 = diag(dsig3(n3));
A4 = diag(dsig4(n4));
% we calculate the error in this output, and get the S5 vector
e5 = Y_train(:,i) - y;
if pichoice == 0
A5 = diag(dsig5(n5)); S5 = -2*A5*e5;
elseif pichoice == 1
S5 = -e5;
end
% back prop the error
S4 = A4*W5*S5;
S3 = A3*W4*S4;
S2 = A2*W3*S3;
% and use a learning rate to update weights and biases
W2 = W2 - lr * a1*S2'; b2 = b2 - lr * S2;
W3 = W3 - lr * a2*S3'; b3 = b3 - lr * S3;
W4 = W4 - lr * a3*S4'; b4 = b4 - lr * S4;
W5 = W5 - lr * a4*S5'; b5 = b5 - lr * S5;
end
end
% calculate the sum of squared errors and store for plotting
for i=1:N_train
y = sig5(W5'*sig4(W4'*sig3(W3'*sig2(W2'*X_train(:,i)+b2)+b3)+b4)+b5);
if pichoice == 0
err = Y_train(:,i) - y;
% each error is itself a vector - hence the norm ||err||
pivec(epoch) = pivec(epoch) + norm(err,2)^2;
elseif pichoice == 1
xent = -sum(Y_train(:,i).*log(y));
pivec(epoch) = pivec(epoch) + xent;
end
end
% add a subplot; plot the performance index vs (row vector) epochs
subplot(2,2,1)
plot([1:N_ep],pivec,'b');
xlabel('epochs'); ylabel('performance index');
% loop through the training set and evaluate accuracy of prediction
wins = 0;
y_pred = zeros(10,N_train);
for i = 1:N_train
y_pred(:,i) = sig5(W5'*sig4(W4'*sig3(W3'*sig2(W2'*X_train(:,i)+b2)+b3)+b4)+b5);
[~, indx1] = max(y_pred(:,i));
[~, indx2] = max(Y_train(:,i));
barcol = 'r';
if indx1 == indx2; wins = wins+1; barcol = 'b';
end
% plot the output 10-vector at top right
subplot(2,2,2); bar(0:9,y_pred(:,i),barcol);
title('predicted output (approximate one-hot)')
% plot the MNIST image at bottom left
B = reshape(1-X_train(:,i),[28,28]);
subplot(2,2,3); imh = imshow(B','InitialMagnification','fit');
subplot(2,2,4);
b = bar(categorical({'Wins','Losses'}), [wins i-wins]);
ylim([0,N_train]);
b.FaceColor = 'flat'; b.CData(1,:) = [1 0 0]; b.CData(2,:) = [0 0 1];
a = get(gca,'XTickLabel'); set(gca,'XTickLabel',a,'fontsize',18)
drawnow;
pause(0.01)
end
fprintf('training set wins = %d/%d, %f%%\n',wins,N_train,100*wins/N_train)
% assign outputs
Yn=Y_train; On=y_pred;
A_2 = readmatrix('MNIST_test_100.csv');
% convert and NORMALIZE it into testing inputs and target outputs
X_test = A_2(:,2:end)'/255;
% the number of data points
N_test = size(X_test,2);
Y_test = zeros(10,N_test);
% set up the one-hot encoding - recall we have to increment by 1
for i=1:N_test
Y_test(1+A_2(i,1),i) = 1;
end
% loop through the test set and evaluate accuracy of prediction
wins = 0;
y_pred = zeros(10,N_test);
for i = 1:N_test
y_pred(:,i) = sig5(W5'*sig4(W4'*sig3(W3'*sig2(W2'*X_test(:,i)+b2)+b3)+b4)+b5);
[~, indx1] = max(y_pred(:,i));
[~, indx2] = max(Y_test(:,i));
barcol = 'r';
if indx1 == indx2; wins = wins+1; barcol = 'b';
end
% plot the output 10-vector at top right
subplot(2,2,2); bar(0:9,y_pred(:,i),barcol);
title('predicted output (approximate one-hot)')
% plot the MNIST image at bottom left
B = reshape(1-X_test(:,i),[28,28]);
subplot(2,2,3); imh = imshow(B','InitialMagnification','fit');
% animate the wins and losses bottom right
subplot(2,2,4);
b = bar(categorical({'Wins','Losses'}), [wins i-wins]);
ylim([0,N_test]);
b.FaceColor = 'flat'; b.CData(1,:) = [1 0 0]; b.CData(2,:) = [0 0 1];
a = get(gca,'XTickLabel'); set(gca,'XTickLabel',a,'fontsize',18)
drawnow;
pause(0.01)
end
fprintf('testing set wins = %d/%d, %f%%\n',wins,N_test,100*wins/N_test)
% assign outputs
Yt=Y_test; Ot=y_pred; wt = 100*wins/N_test; % wt is the test success percentage
end
Walter Roberson on 8 Aug 2022
A_1 = readmatrix('MNIST_train_1000.csv');
We do not know what size that returns, so we cannot work out the size of X_train, and therefore cannot work out the size of a1.
Ni = 784; % number of input nodes
No = 10; % number of output nodes
% set up weights and biases
W2 = 0.5-rand(784,24);b2 = zeros(24,1);
Is it a coincidence that the magic number 784 in the definition of W2 happens to be the same as the value of Ni?
X_train must have exactly 784 rows for this code to work, but that is never tested. Instead, N_train is calculated, which implies the size is not fixed (since the calculation is not immediately followed by a test against 784 or Ni). You should do one of the following (a sketch of the second option follows the list):
  • reject the file if X_train is not exactly (hard-coded) 784 rows
  • reject the file if it is not exactly Ni rows
  • reject the file if it has fewer than 784 or Ni rows, and use only the first 784 or Ni rows if it is larger
  • set Ni to the actual number of rows of X_train and create the matrices with Ni rows instead of 784.
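A minimal sketch of the second option, reusing the variable names from the posted code (the error message text is illustrative):

Ni = 784; % number of input nodes
A_1 = readmatrix('MNIST_train_1000.csv');
X_train = A_1(:,2:end)'/255;
% reject the file unless it supplies exactly Ni pixel rows per image
if size(X_train,1) ~= Ni
error('expected %d pixel rows per image, got %d', Ni, size(X_train,1))
end
N_train = size(X_train,2); % count the training columns only after the check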


Answers (0)
