quadprog and fmincon provide very different results on the same problem

Dear all,
I am trying to solve a quadratic minimization problem with quadprog, but I obtain results that are very different from those of fmincon, and I do not understand why. My problem is the following:
minimize over y: ||x - y||^2, subject to AA*y = bb,
where || || is the Euclidean norm. I have written the following code (with quantities similar to my problem):
% Setup values (just as example)
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
% options for fmincon
optionsFmincon = optimset('LargeScale', 'off', ...
'Algorithm', 'interior-point', ...
'UseParallel', 'never', ...
'MaxFunEvals', 20000, ...
'GradObj', 'on', ...
'TolX', 1e-9, ...
'TolFun', 1e-9, ...
'TolCon', 1e-9, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% options for quadprog
optionsQuadprog = optimset('Algorithm', 'interior-point-convex', ...
'TolX', 1e-12, ...
'TolFun', 1e-12, ...
'TolCon', 1e-12, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% Fmincon minimization
lowerBound = -1e15*ones(n, 1); % fictitious lower and upper bounds
upperBound = +1e15*ones(n, 1);
[out1, Jopt1, exitflag1, output1] = fmincon(@(y) cost(x, y, n), x, [], [], [], [], lowerBound, upperBound, @(y) constraints(y, AA, bb), optionsFmincon);
% Solution with quadprog
H = eye(n);
f = -2*eye(n)*x;
A = [];
b = [];
Aeq = AA;
beq = bb;
lb = [];
ub = [];
x0 = [];
[out, Jopt, exitflag, output] = quadprog(H, f, A, b, Aeq, beq, lb, ub, x0, optionsQuadprog);
% Comparison of results
[out1, out]
[out1'*H*out1-2*x'*eye(n)*out1+x'*eye(n)*x, out'*H*out-2*x'*eye(n)*out+x'*eye(n)*x]
[norm(x-out1)^2, norm(x-out)^2]
[norm(AA*out1-bb), norm(AA*out-bb)]
% cost function for fmincon
function [J, gradient] = cost(x, y, n)
J = (x-y)'*eye(n)*(x-y); % equivalent to J = y'*eye(n)*y - (2*eye(n)*x)'*y, since the constant x'*eye(n)*x can be omitted
gradient = -2*(x-y);
end
% constraints function for fmincon
function [c, ceq] = constraints(x, AA, bb)
ceq = AA*x-bb;
c = [];
end
If you execute the code, you get very different solutions. The optimal cost of the fmincon is equal to 0.23048, while the optimal cost of quadprog is 3823 (therefore, also the values of the optimal y are very different). In both cases the constraints are satisfied.
Probably I am making something wrong, but I do not understand what.
Any help from you is really appreciated. Thank you in advance
  2 Comments
John D'Errico on 25 Jan 2019
Why in the name of god and little green apples do you provide linear equality constraints to fmincon via the nonlinear constraints? Start with that. fmincon accepts linear equality constraints.
Next, you have essentially no lower or upper bounds, since they are set as numerically +/- inf.
John D'Errico
John D'Errico on 25 Jan 2019
Edited: John D'Errico on 25 Jan 2019
Please don't answer your own question just to provide additional information. Moved to a comment:
Thank you for your suggestions. You are definitely right. I have changed my code (see below) according to your suggestions. However, the results are exactly the same as before (apart from the increased execution speed, of course).
% Setup values (just as example)
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
% options for fmincon
optionsFmincon = optimset('LargeScale', 'off', ...
'Algorithm', 'interior-point', ...
'UseParallel', 'never', ...
'MaxFunEvals', 20000, ...
'DerivativeCheck', 'off', ...
'GradObj', 'on', ...
'TolX', 1e-9, ...
'TolFun', 1e-9, ...
'TolCon', 1e-9, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% options for quadprog
optionsQuadprog = optimset('Algorithm', 'interior-point-convex', ...
'TolX', 1e-12, ...
'TolFun', 1e-12, ...
'TolCon', 1e-12, ...
'MaxIter', 10000, ...
'Display', 'none' ...
);
% Fmincon minimization
lowerBound = -Inf*ones(n, 1);
upperBound = +Inf*ones(n, 1);
[out1, Jopt1, exitflag1, output1] = fmincon(@(y) cost(x, y, n), x, [], [], AA, bb, lowerBound, upperBound, [], optionsFmincon);
% Solution with quadprog
H = eye(n);
f = -2*eye(n)*x;
A = [];
b = [];
lb = [];
ub = [];
x0 = [];
[out, Jopt, exitflag, output] = quadprog(H, f, A, b, AA, bb, lb, ub, x0, optionsQuadprog);
% Comparison of results
[out1, out]
[out1'*H*out1-2*x'*eye(n)*out1+x'*eye(n)*x, out'*H*out-2*x'*eye(n)*out+x'*eye(n)*x]
[norm(x-out1)^2, norm(x-out)^2]
[norm(AA*out1-bb), norm(AA*out-bb)]
% cost function for fmincon
function [J, gradient] = cost(x, y, n)
J = (x-y)'*eye(n)*(x-y); % equivalent to J = y'*eye(n)*y - (2*eye(n)*x)'*y, since the constant x'*eye(n)*x can be omitted
gradient = -2*(x-y);
end
Thus, the issues from my question still remain.


Accepted Answer

John D'Errico on 25 Jan 2019
Edited: John D'Errico on 26 Jan 2019
Why are you doing this? Really, why are you trying to solve it using that kluge of approaches? Go simple. Don't over-complicate things.
n = 4;
AA = [12, -1, 0, 0; ...
0, 0, 111, -1];
bb = [0; 0];
x = [1; 10; 1; 61];
Let's see. Minimize norm(x-y), subject to AA*y = bb. Trivial, using the proper tool. In the Optimization Toolbox, that is lsqlin. As I show below, my own LSE (found on the File Exchange) also gives it directly.
Ylsqlin = lsqlin(eye(4),x,[],[],AA,bb)
Minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Ylsqlin =
0.834482758578957
10.0137931029475
0.549586106124114
61.0040577797767
Verification:
AA*Ylsqlin - bb
ans =
-1.77635683940025e-15
0
The sum of squares of the difference was:
norm(x - Ylsqlin)^2
ans =
0.230475348269706
Using my own LSE (from the File Exchange) for comparison:
lse(eye(4),x,AA,bb)
ans =
0.83448275862069
10.0137931034483
0.549586106151599
61.0040577828275
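As a further cross-check (my own addition, not part of the original answer), this equality-constrained least-squares problem also has a closed-form solution from the KKT conditions: y* = x - AA'*(AA*AA')^(-1)*(AA*x - bb). A minimal pure-Python sketch with the thread's data hardcoded, solving the 2x2 system by Cramer's rule:

```python
# Closed-form solution of: minimize ||x - y||^2  s.t.  AA*y = bb,
# via the KKT conditions: y* = x - AA^T (AA AA^T)^{-1} (AA x - bb).

AA = [[12.0, -1.0, 0.0, 0.0],
      [0.0, 0.0, 111.0, -1.0]]
bb = [0.0, 0.0]
x = [1.0, 10.0, 1.0, 61.0]

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

# r = AA*x - bb  (constraint residual at x)
r = [ri - bi for ri, bi in zip(matvec(AA, x), bb)]

# G = AA*AA^T  (2x2 Gram matrix of the constraint rows)
G = [[sum(a * b for a, b in zip(AA[i], AA[j])) for j in range(2)]
     for i in range(2)]

# Solve G*lam = r by Cramer's rule (fine for a 2x2 system)
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
lam = [(r[0] * G[1][1] - r[1] * G[0][1]) / det,
       (G[0][0] * r[1] - G[1][0] * r[0]) / det]

# y* = x - AA^T * lam
y = [xi - (AA[0][i] * lam[0] + AA[1][i] * lam[1]) for i, xi in enumerate(x)]
cost = sum((xi - yi) ** 2 for xi, yi in zip(x, y))

print(y)     # ~ [0.834483, 10.013793, 0.549586, 61.004058]
print(cost)  # ~ 0.230475
```

The result matches the lsqlin and LSE outputs above to within solver tolerance.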
Now, how would it be solved using quadprog? The sum of squares of differences is:
(x-y)'*eye(4)*(x-y)
Expanding, we get
x'*x + y'*y - 2*x'*y
x'*x is a constant, so irrelevant to the optimization, BUT it explains why your objectives are so different.
x'*x
ans =
3823
To convert this objective to a quadprog usable form, we want it as:
FVAL = 0.5*X'*H*X + f'*X.
So
H = 2*eye(4);
F = -2*x';
Now solve.
Yqp = quadprog(H,F,[],[],AA,bb)
Minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Yqp =
0.834482758599826
10.0137931031979
0.549586106137858
61.0040577813022
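The algebra above can be sanity-checked numerically (an illustrative sketch of my own, not from the original answer): with H = 2*eye(n) and f = -2*x, the quadprog objective 0.5*y'*H*y + f'*y equals ||x - y||^2 - x'*x at every y, so the two problems share the same minimizer and their optimal values differ by exactly the constant x'*x = 3823. In pure Python:

```python
import random

x = [1.0, 10.0, 1.0, 61.0]
xtx = sum(xi * xi for xi in x)   # x'*x = 3823, the constant offset

def qp_objective(y):
    # 0.5*y'*H*y + f'*y with H = 2*I and f = -2*x
    return sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

def sq_distance(y):
    # ||x - y||^2, the objective fmincon/lsqlin minimize
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

random.seed(0)
for _ in range(5):
    y = [random.uniform(-100, 100) for _ in range(4)]
    # identical up to the constant x'*x, for any y
    assert abs(qp_objective(y) - (sq_distance(y) - xtx)) < 1e-6

print(xtx)  # 3823.0
```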
It should be no surprise this is the same as lsqlin, but it took more thought to write. Use the proper tools. And now, suppose you want to use fmincon.
fun = @(y) norm(x-y);
x0 = rand(4,1);
Yfmincon = fmincon(fun,x0,[],[],AA,bb)
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the default value of the optimality tolerance,
and constraints are satisfied to within the default value of the constraint tolerance.
<stopping criteria details>
Yfmincon =
0.834482752432331
10.013793029188
0.549586102080044
61.0040573308849
So essentially identical. And very easy to write in all cases, though lsqlin was by far the simplest.
Anyway, there was absolutely NO need to provide a gradient function to fmincon. (In fact, there is almost never a need to provide a gradient.) Certainly a waste of time in this case. Nor was there any need for an m-file cost function. The other differences in your results I would put down to simple algebra mistakes that are hardly worth debugging, because you massively over-complicated the problem.
  1 Comment
Mauro on 26 Jan 2019
Edited: Mauro on 26 Jan 2019
Thank you for your detailed answer and suggestions. The error in my code was just having written
H = eye(n);
instead of
H = 2*eye(n);
With such a change, I obtain the same results with both fmincon and quadprog.
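For anyone hitting the same factor-of-two slip: quadprog minimizes 0.5*y'*H*y + f'*y, so with H = eye(n) and f = -2*x the solver actually sees 0.5*y'*y - 2*x'*y = 0.5*||y - 2x||^2 - 2*x'*x, whose unconstrained minimizer is 2x rather than x. A quick pure-Python illustration of the mismatch (my own sketch, not from the thread):

```python
x = [1.0, 10.0, 1.0, 61.0]

def buggy_objective(y):
    # what quadprog sees with H = eye(n), f = -2*x:
    # 0.5*y'*y - 2*x'*y
    return 0.5 * sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

def intended_objective(y):
    # with the corrected H = 2*eye(n):
    # 0.5*y'*(2I)*y - 2*x'*y = ||x - y||^2 - x'*x
    return sum(yi * yi for yi in y) - 2.0 * sum(xi * yi for xi, yi in zip(x, y))

# The buggy objective is 0.5*||y - 2x||^2 minus a constant, so its
# unconstrained minimizer is y = 2x, not y = x; perturbing any
# coordinate away from 2x increases it:
y_star = [2.0 * xi for xi in x]
eps = 1e-3
for i in range(4):
    bumped = list(y_star)
    bumped[i] += eps
    assert buggy_objective(bumped) > buggy_objective(y_star)

# whereas the intended objective is smaller at y = x than at y = 2x:
assert intended_objective(x) < intended_objective(y_star)
```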
I agree with you that the problem is very simple and could also be solved using constrained least squares. Toward this end, I have performed some tests using all the approaches, and I can conclude that lsqlin and quadprog require more or less the same execution time (probably because both use an interior-point method?), while fmincon needs more time (probably because it is a general-purpose function).
The only point in your reply I disagree with is that providing the gradient of the cost to fmincon is useless. I have performed several tests with and without the gradient (in this example and in others in the past), and I noticed a large reduction in computation time most of the time (in my code, the minimization is called many times, not just once). Clearly, the gradient must be computed with efficient code, otherwise its numerical evaluation is faster.
Anyway, you solved my issues, so thank you very much again for your replies!
