quadprog and fmincon provide very different results on the same problem

Dear all,
I am trying to solve a quadratic minimization problem with quadprog, but I obtain results that are very different from those of fmincon, and I do not understand why. My problem is the following:
minimize over y: ||x - y||^2, subject to AA*y = bb,
where || || is the Euclidean norm. I have written the following code (with quantities similar to my problem):
% Setup values (just as example)
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
% options for fmincon
optionsFmincon = optimset('LargeScale', 'off', ...
'Algorithm', 'interior-point', ...
'UseParallel', 'never', ...
'MaxFunEvals', 20000, ...
'GradObj', 'on', ...
'TolX', 1e-9, ...
'TolFun', 1e-9, ...
'TolCon', 1e-9, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% options for quadprog
optionsQuadprog = optimset('Algorithm', 'interior-point-convex', ...
'TolX', 1e-12, ...
'TolFun', 1e-12, ...
'TolCon', 1e-12, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% Fmincon minimization
lowerBound = -1e15*ones(n, 1); % fictitious lower and upper bounds
upperBound = +1e15*ones(n, 1);
[out1, Jopt1, exitflag1, output1] = fmincon(@(y) cost(x, y, n), x, [], [], [], [], lowerBound, upperBound, @(y) constraints(y, AA, bb), optionsFmincon);
% Solution with quadprog
H = eye(n);
f = -2*eye(n)*x;
A = [];
b = [];
Aeq = AA;
beq = bb;
lb = [];
ub = [];
x0 = [];
[out, Jopt, exitflag, output] = quadprog(H, f, A, b, Aeq, beq, lb, ub, x0, optionsQuadprog);
% Comparison of results
[out1, out]
[out1'*H*out1-2*x'*eye(n)*out1+x'*eye(n)*x, out'*H*out-2*x'*eye(n)*out+x'*eye(n)*x]
[norm(x-out1)^2, norm(x-out)^2]
[norm(AA*out1-bb), norm(AA*out-bb)]
% cost function for fmincon
function [J, gradient] = cost(x, y, n)
J = (x-y)'*eye(n)*(x-y); % equivalent to J = y'*eye(n)*y - (2*eye(n)*x)'*y, since the constant x'*eye(n)*x can be omitted
gradient = -2*(x-y);
end
% constraints function for fmincon
function [c, ceq] = constraints(x, AA, bb)
ceq = AA*x-bb;
c = [];
end
If you execute the code, you get very different solutions. The optimal cost of the fmincon is equal to 0.23048, while the optimal cost of quadprog is 3823 (therefore, also the values of the optimal y are very different). In both cases the constraints are satisfied.
Probably I am making something wrong, but I do not understand what.
Any help from you is really appreciated. Thank you in advance
  2 Comments
John D'Errico on 25 Jan 2019
Why in the name of god and little green apples do you provide linear equality constraints to fmincon via the nonlinear constraints? Start with that. fmincon accepts linear equality constraints.
Next, you have essentially no lower or upper bounds, since they are set as numerically +/- inf.
John D'Errico
John D'Errico on 25 Jan 2019
Edited: John D'Errico on 25 Jan 2019
Please don't answer your own question just to provide additional information. Moved to a comment:
Thank you for your suggestions. You are definitely right. I have changed my code (see below) according to your suggestions. However, the results are exactly the same as before (apart from the increased execution speed, of course).
% Setup values (just as example)
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
% options for fmincon
optionsFmincon = optimset('LargeScale', 'off', ...
'Algorithm', 'interior-point', ...
'UseParallel', 'never', ...
'MaxFunEvals', 20000, ...
'DerivativeCheck', 'off', ...
'GradObj', 'on', ...
'TolX', 1e-9, ...
'TolFun', 1e-9, ...
'TolCon', 1e-9, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% options for quadprog
optionsQuadprog = optimset('Algorithm', 'interior-point-convex', ...
'TolX', 1e-12, ...
'TolFun', 1e-12, ...
'TolCon', 1e-12, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% Fmincon minimization
lowerBound = -Inf*ones(n, 1);
upperBound = +Inf*ones(n, 1);
[out1, Jopt1, exitflag1, output1] = fmincon(@(y) cost(x, y, n), x, [], [], AA, bb, lowerBound, upperBound, [], optionsFmincon);
% Solution with quadprog
H = eye(n);
f = -2*eye(n)*x;
A = [];
b = [];
lb = [];
ub = [];
x0 = [];
[out, Jopt, exitflag, output] = quadprog(H, f, A, b, AA, bb, lb, ub, x0, optionsQuadprog);
% Comparison of results
[out1, out]
[out1'*H*out1-2*x'*eye(n)*out1+x'*eye(n)*x, out'*H*out-2*x'*eye(n)*out+x'*eye(n)*x]
[norm(x-out1)^2, norm(x-out)^2]
[norm(AA*out1-bb), norm(AA*out-bb)]
% cost function for fmincon
function [J, gradient] = cost(x, y, n)
J = (x-y)'*eye(n)*(x-y); % equivalent to J = y'*eye(n)*y - (2*eye(n)*x)'*y, since the constant x'*eye(n)*x can be omitted
gradient = -2*(x-y);
end
Thus, the issues from my question still remain.


Accepted Answer

John D'Errico on 25 Jan 2019
Edited: John D'Errico on 26 Jan 2019
Why are you doing this? Really, why are you trying to solve it using that kluge of approaches? Go simple. Don't over-complicate things.
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
Let's see. Minimize norm(x-y), subject to AA*y = bb. Trivial, using the proper tool. In the Optimization Toolbox, that is lsqlin. As I show below, my own LSE (found on the File Exchange) also gives it directly.
Ylsqlin = lsqlin(eye(4),x,[],[],AA,bb)
Minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Ylsqlin =
0.834482758578957
10.0137931029475
0.549586106124114
61.0040577797767
Verification:
AA*Ylsqlin - bb
ans =
-1.77635683940025e-15
0
The sum of squares of the difference was:
norm(x - Ylsqlin)^2
ans =
0.230475348269706
Using my own LSE (from the File Exchange) for comparison:
lse(eye(4),x,AA,bb)
ans =
0.83448275862069
10.0137931034483
0.549586106151599
61.0040577828275
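As a further cross-check (my own addition, not part of the original answer), this equality-constrained least-squares problem also has a closed-form solution from the KKT conditions: y* = x - AA'*(AA*AA')^(-1)*(AA*x - bb). A minimal pure-Python sketch with the thread's data hardcoded, solving the 2x2 system by Cramer's rule:

```python
# Closed-form solution of: minimize ||x - y||^2  s.t.  AA*y = bb,
# via the KKT conditions: y* = x - AA^T (AA AA^T)^{-1} (AA x - bb).

AA = [[12.0, -1.0, 0.0, 0.0],
      [0.0, 0.0, 111.0, -1.0]]
bb = [0.0, 0.0]
x = [1.0, 10.0, 1.0, 61.0]

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

# r = AA*x - bb  (constraint residual at x)
r = [ri - bi for ri, bi in zip(matvec(AA, x), bb)]

# G = AA*AA^T  (2x2 Gram matrix of the constraint rows)
G = [[sum(a * b for a, b in zip(AA[i], AA[j])) for j in range(2)]
     for i in range(2)]

# Solve G*lam = r by Cramer's rule (fine for a 2x2 system)
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
lam = [(r[0] * G[1][1] - r[1] * G[0][1]) / det,
       (G[0][0] * r[1] - G[1][0] * r[0]) / det]

# y* = x - AA^T * lam
y = [xi - (AA[0][i] * lam[0] + AA[1][i] * lam[1]) for i, xi in enumerate(x)]
cost = sum((xi - yi) ** 2 for xi, yi in zip(x, y))

print(y)     # ~ [0.834483, 10.013793, 0.549586, 61.004058]
print(cost)  # ~ 0.230475
```

The result matches the lsqlin and LSE outputs above to within solver tolerance.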
Now, how would it be solved using quadprog? The sum of squares of differences is:
(x-y)'*eye(4)*(x-y)
Expanding, we get
x'*x + y'*y - 2*x'*y
x'*x is a constant, so irrelevant to the optimization, BUT it explains why your objectives are so different.
x'*x
ans =
3823
To convert this objective to a quadprog usable form, we want it as:
FVAL = 0.5*X'*H*X + f'*X.
So
H = 2*eye(4);
F = -2*x';
Now solve.
Yqp = quadprog(H,F,[],[],AA,bb)
Minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Yqp =
0.834482758599826
10.0137931031979
0.549586106137858
61.0040577813022
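The algebra above can be sanity-checked numerically (an illustrative sketch of my own, not from the original answer): with H = 2*eye(n) and f = -2*x, the quadprog objective 0.5*y'*H*y + f'*y equals ||x - y||^2 - x'*x at every y, so the two problems share the same minimizer and their optimal values differ by exactly the constant x'*x = 3823. In pure Python:

```python
import random

x = [1.0, 10.0, 1.0, 61.0]
xtx = sum(xi * xi for xi in x)   # x'*x = 3823, the constant offset

def qp_objective(y):
    # 0.5*y'*H*y + f'*y with H = 2*I and f = -2*x
    return sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

def sq_distance(y):
    # ||x - y||^2, the objective fmincon/lsqlin minimize
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

random.seed(0)
for _ in range(5):
    y = [random.uniform(-100, 100) for _ in range(4)]
    # identical up to the constant x'*x, for any y
    assert abs(qp_objective(y) - (sq_distance(y) - xtx)) < 1e-6

print(xtx)  # 3823.0
```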
It should be no surprise this is the same as lsqlin, but it took more thought to write. Use the proper tools. And now, suppose you want to use fmincon.
fun = @(y) norm(x-y);
x0 = rand(4,1);
Yfmincon = fmincon(fun,x0,[],[],AA,bb)
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Yfmincon =
0.834482752432331
10.013793029188
0.549586102080044
61.0040573308849
So essentially identical. And very easy to write in all cases, though lsqlin was by far the simplest.
Anyway, there was absolutely NO need to provide a gradient function to fmincon. (In fact, there is almost never a need to provide a gradient.) Certainly a waste of time in this case. Nor was there any need for an m-file cost function. The other differences in your results I would put down to simple algebra mistakes that are hardly worth debugging, because you massively over-complicated the problem.
  1 Comment
Mauro on 26 Jan 2019
Edited: Mauro on 26 Jan 2019
Thank you for your detailed answer and suggestions. The error in my code was just having written
H = eye(n);
instead of
H = 2*eye(n);
With such a change, I obtain the same results with both fmincon and quadprog.
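For anyone hitting the same factor-of-two slip: quadprog minimizes 0.5*y'*H*y + f'*y, so with H = eye(n) and f = -2*x the solver actually sees 0.5*y'*y - 2*x'*y = 0.5*||y - 2x||^2 - 2*x'*x, whose unconstrained minimizer is 2x rather than x. A quick pure-Python illustration of the mismatch (my own sketch, not from the thread):

```python
x = [1.0, 10.0, 1.0, 61.0]

def buggy_objective(y):
    # what quadprog sees with H = eye(n), f = -2*x:
    # 0.5*y'*y - 2*x'*y
    return 0.5 * sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

def intended_objective(y):
    # with the corrected H = 2*eye(n):
    # 0.5*y'*(2I)*y - 2*x'*y = ||x - y||^2 - x'*x
    return sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

# The buggy objective is 0.5*||y - 2x||^2 minus a constant, so its
# unconstrained minimizer is y = 2x, not y = x; perturbing any
# coordinate away from 2x increases it:
y_star = [2.0 * xi for xi in x]
eps = 1e-3
for i in range(4):
    bumped = list(y_star)
    bumped[i] += eps
    assert buggy_objective(bumped) > buggy_objective(y_star)

# whereas the intended objective is smaller at y = x than at y = 2x:
assert intended_objective(x) < intended_objective(y_star)
```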
I agree with you that the problem is very simple and could also be solved using constrained least squares. Toward this end, I have performed some tests using all the approaches, and I can conclude that lsqlin and quadprog require more or less the same execution time (probably because both use an interior-point method?), while fmincon needs more time (probably because it is a general-purpose function).
The only point in your reply I disagree with is that providing the gradient of the cost to fmincon is useless. I have performed several tests with and without the gradient (in this example and in others in the past), and I noticed a large reduction in computation time most of the time (in my code, the minimization is called many times, not just once). Clearly, the gradient must be computed with efficient code, otherwise its numerical evaluation is faster.
Anyway, you solved my issues, so thank you very much again for your replies!
