How can I get the derivative of a complex function involving reshape and max operations?

I have a function f that I want to minimize with fmincon, and I want to supply the gradient of f to speed up the optimization.
The definition of f is
function [F] = f(x)
% a, b, c are fixed dimensions; A and C are constant matrices
% assumed available in the enclosing scope.
% size(x) = [a*b, 1]
Z = reshape(x, [a, b]); % size(Z)= [a, b]
% size(A) = [c, a]
B = A * Z; % size(B) = [c, b]
% size(C) = [c, b]
D = sum(B.*C, 2); % [c, 1]
E = abs(D).^2; % [c, 1]
F = max(E); % scalar, my objective
end
I want to apply the chain rule, but I have the following questions:
  1. I cannot figure out an easy way to get the Jacobian of B(x), which involves the reshape operation.
  2. The Jacobian of D(B).
  3. The Jacobian of the max function.
Thank you for your help!!

Accepted Answer

Matt J
Matt J on 28 Jan 2021
Edited: Matt J on 29 Jan 2021
The Jacobian of B(x) is
J1=kron(speye(b),A);
The Jacobian of D(B) is,
J2=kron(ones(1,b),speye(c));
J2(logical(J2))=C;
The max(E) function is not differentiable everywhere, so it is not clear how well fmincon will handle this problem, but at the points where it is differentiable, the Jacobian (a 1-by-N row vector, since F is a scalar) is,
N=length(E);
e=zeros(1,N);
[~,idx]=max(E);
e(idx)=1;
J3=sparse(e);
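The link between D and F that the steps above leave implicit is E = abs(D).^2; because x is real, its contribution to the chain is dE/dx = 2*real(conj(D).*(dD/dx)). A hedged sketch (with illustrative small dimensions, not from the original problem) assembling the full gradient and spot-checking it against a finite difference:

```
% Sketch: gradient of F = max(abs(D).^2) assembled from J1 (dB/dx),
% J2 (dD/dB), and the real-x rule dE/dx = 2*real(conj(D).*(dD/dx)).
a = 3; b = 4; c = 5;                    % illustrative dimensions
A = randn(c, a) + 1i*randn(c, a);
C = randn(c, b) + 1i*randn(c, b);
x = randn(a*b, 1);

Z = reshape(x, [a, b]);
D = sum((A*Z).*C, 2);
E = abs(D).^2;
[~, idx] = max(E);

J1 = kron(speye(b), A);                 % dB/dx,  (c*b)-by-(a*b)
J2 = kron(ones(1,b), speye(c));
J2(logical(J2)) = C;                    % dD/dB,  c-by-(c*b)
JD = J2*J1;                             % dD/dx,  c-by-(a*b), complex
g  = 2*real(conj(D(idx)) * JD(idx,:));  % gradient of F, 1-by-(a*b)

% finite-difference check of one component
h = 1e-6; k = 1;
xp = x; xp(k) = xp(k) + h;
Dp = sum((A*reshape(xp,[a,b])).*C, 2);
[(max(abs(Dp).^2) - max(E))/h, g(k)]    % the two numbers should agree
```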
  8 Comments
goose neuralnet
goose neuralnet on 29 Jan 2021
Thank you!
I think I fully understand your answer!
But I have the following questions:
  1. How can I derive these Jacobians on my own? I can verify your answers for the other Jacobians, and I derived J1 by myself, but I don't have a systematic way of doing this. Can you point me to some material?
  2. Although my objective F and variable x are real, the intermediate quantities can be complex. If I have complex numbers, how can I compute the Jacobian?
Thank you very much! I'm already satisfied.
Matt J
Matt J on 29 Jan 2021
Edited: Matt J on 29 Jan 2021
Thank you very much! I'm already satisfied.
I'm glad, but I think my other answer is better. As you'll see there, it reaches a much more compact and efficient expression for the final gradient.
How can I derive these Jacobians on my own? I can verify your answers for the other Jacobians, and I derived J1 by myself. Can you point me to some material?
I already did in my earlier comments. It's just a basic Kronecker product theorem that I gave you a wiki link for.
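The theorem referred to here is the vec–Kronecker identity vec(A*Z*B) = kron(B.', A)*vec(Z); with B = I it gives the J1 above, since vec(A*Z) = kron(I_b, A)*vec(Z). A minimal numerical check (dimensions chosen arbitrarily):

```
% Numerical check of the vec-Kronecker identity behind J1:
% vec(A*Z*B) = kron(B.', A) * vec(Z)
a = 3; b = 4; c = 5;             % arbitrary small dimensions
A = randn(c, a);
Z = randn(a, b);
B = randn(b, 2);
lhs = reshape(A*Z*B, [], 1);     % vec(A*Z*B)
rhs = kron(B.', A) * Z(:);       % Kronecker form acting on vec(Z)
max(abs(lhs - rhs))              % agrees up to round-off
```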
Although my objective F and variable x are real, the intermediate quantities can be complex. If I have complex numbers, how can I compute the Jacobian?
The calculus will depend on the expression.


More Answers (1)

Matt J
Matt J on 29 Jan 2021
Edited: Matt J on 29 Jan 2021
Another way to see the calculation is that if
[F,i]=max(E)
then, assuming we are at a point of differentiability, F is given locally by F = |D(i)|^2, which for real D is F = D(i)^2, or,
ai=A(i,:).';
ci=C(i,:).';
F=( ai.'*Z*ci )^2
Since this is satisfied in a local neighborhood of Z, we can readily take the gradient of the expression on the right-hand side. The result, when shaped as a size [a,b] matrix, is,
Fgradient = 2*(ai.'*Z*ci)*ai*ci.'
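Since fmincon works with the vector x = Z(:), this [a,b]-shaped gradient just needs to be vectorized; equivalently, by the identity vec(ai*ci.') = kron(ci, ai) for column vectors, it can be written directly against x. A small sketch under the same real-D assumption:

```
% Sketch: the same gradient, delivered in the vec(Z) = x layout that
% fmincon expects (assumes ai.'*Z*ci, i.e. D(i), is real).
s = ai.'*Z*ci;                  % scalar D(i)
g_matrix = 2*s*(ai*ci.');       % [a,b] gradient w.r.t. Z
g_vector = g_matrix(:);         % gradient w.r.t. x = Z(:)
% equivalently, via vec(ai*ci.') = kron(ci, ai):
g_vector2 = 2*s*kron(ci, ai);   % identical up to round-off
```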
  3 Comments
goose neuralnet
goose neuralnet on 1 Feb 2021
Edited: goose neuralnet on 1 Feb 2021
Sorry, I have come back for help again.
I am using fmincon with the option 'CheckGradients',true, and MATLAB tells me my gradient is wrong. But I have checked several times and cannot figure out where the problem is.
I explain my solution in the code below. Thank you very much!
function [F, Jacob] = f(x)
% size(x) = [a*b, 1]
Z = reshape(x, [a, b]); % size(Z)= [a, b]
% size(A) = [c, a]
B = A * Z; % size(B) = [c, b]
% size(C) = [c, b]
D = sum(B.*C, 2); % [c, 1]
E = abs(D); % [c, 1]
[F, I] = min(E); F = -F; % scalar, my objective (maximize the minimum of E)
% J1 = kron(speye(b),A);
% J2=kron(ones(1,b),speye(c));
% J2(logical(J2))=C;
% Assume F = E(I) for current points
% E(I) = sqrt(real(D(I)).^2 + imag(D(I)).^2);
% abs_square = 1/norm(E(I));
% Jacobian of E(I) w.r.t to real(D(I)) is 2*real(D(I))*abs_square
% Jacobian of E(I) w.r.t to imag(D(I)) is 2*imag(D(I))*abs_square
% D = diag(A * Z * C.')
% D(I) = A(I, :) * Z * C(I,:).'
% real(D(I)) = real(ai*Z*ci.') % my ai, ci are row vector, sorry for the confusion
% imag(D(I)) = imag(ai*Z*ci.')
% I assure my Z = reshape(x, [a, b]) is a real matrix, hence
% real(D(I)) = real(ai)*Z*real(ci.') - imag(ai)*Z*imag(ci.')
% imag(D(I)) = real(ai)*Z*imag(ci.') + imag(ai)*Z*real(ci.')
% Hence,
% Jacobian of real(D(I)) w.r.t to x is kron(real(ci), real(ai)) - kron(imag(ci), imag(ai));
% Jacobian of image (D(I)) w.r.t to x is kron(real(ci), imag(ai)) + kron(imag(ci), real(ai));
if nargout > 1 % gradient required
ai=tdlmat(I,:); % size(ai) = [1, J]
ci=A_theta(I,:); % size(ci) = [1, M]
J_real = kron(real(ci), real(ai)) - kron(imag(ci), imag(ai));
J_imag = kron(real(ci), imag(ai)) + kron(imag(ci), real(ai));
abs_square = 1/norm(E(I));
Jacob = 2*real(E(I))*abs_square*J_real + 2*imag(E(I))*abs_square*J_imag;
Jacob = Jacob.';
end
end
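When 'CheckGradients' flags a mismatch, it can help to localize the bug by comparing the analytic gradient against central differences component by component, rather than all at once. A generic sketch (check_gradient is a hypothetical helper name; fun is any objective returning [value, gradient], like the f above):

```
% Sketch: report which gradient components disagree with central
% finite differences. fun returns [value, gradient].
function check_gradient(fun, x0)
    [~, g] = fun(x0);                 % analytic gradient at x0
    h = 1e-6;
    gfd = zeros(numel(x0), 1);
    for k = 1:numel(x0)
        xp = x0; xp(k) = xp(k) + h;
        xm = x0; xm(k) = xm(k) - h;
        gfd(k) = (fun(xp) - fun(xm)) / (2*h);   % central difference
    end
    bad = find(abs(gfd - g(:)) > 1e-4*max(1, abs(g(:))));
    fprintf('%d of %d components disagree: ', numel(bad), numel(x0));
    fprintf('%d ', bad); fprintf('\n');
end
```

Seeing whether every component disagrees (suggesting a sign or scale error) or only some (suggesting an indexing error) usually narrows the search quickly.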

