How to improve the fitting result

Hello there!
I am fitting data with a fairly involved custom equation. The function is defined as follows:
function bif = bifit(A,bix,a,x,e,k,T)
fun = @(bix,b,a,x,e,k,T) exp(-b.^2./(2.*a.^2)).*heaviside(bix-x-b-e).*exp(-(bix-x-b-e)./(k.*T));
bif = integral(@(b) fun(bix,b,a,x,e,k,T),-1,1,'ArrayValued',1).*A;
end
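A side note on the function above: heaviside ships with Symbolic Math Toolbox, and exp(-(bix-x-b-e)./(k.*T)) can grow very large wherever the step is zero. The sketch below is one possible toolbox-free variant, not the original poster's code; gated_decay, fun2, and bifit2 are hypothetical names. It uses a logical comparison for the step, so it differs from heaviside only at an argument of exactly 0 (where heaviside returns 0.5 by default), which should not change the integral.

```matlab
% Hypothetical helper: step function times exponential decay.
% max(u, 0) keeps the exponent non-positive, so the exponential cannot
% overflow in the region where the step is zero anyway.
gated_decay = @(u, kT) (u > 0) .* exp(-max(u, 0) ./ kT);

% Same integrand as fun in bifit, with heaviside replaced by the gate.
fun2 = @(bix,b,a,x,e,k,T) exp(-b.^2 ./ (2 .* a.^2)) ...
    .* gated_decay(bix - x - b - e, k .* T);

% Same integration limits and scaling as bifit.
bifit2 = @(A,bix,a,x,e,k,T) A .* integral(@(b) fun2(bix,b,a,x,e,k,T), ...
    -1, 1, 'ArrayValued', true);

% Illustrative call with made-up inputs (not the poster's data).
val = bifit2(1, 6, 0.05, 3, [2.9; 3.0], 8.61773e-5, 161);
disp(val)
```

If this variant were used in the fit, only the function name inside the fittype expression would need to change.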
The fitting code is as follows:
ft = fittype('bifit(A,bix,a,x,e,k,T)','problem',{'x','k','T'},'independent','e'...
,'coefficients',{'A','bix','a'});
fo = fitoptions(ft);
fo.StartPoint = [129400 5.98470 0.01219];
fo.Lower = [120000 5.90000 0.00800];
fo.Upper = [300000 6.10000 0.13000];
bi = fit(data_x,data_y,ft,fo,'problem',{3.0169,8.61773*10^-5,161})
bi_coeff = coeffvalues(bi)
plot(bi,data_x,data_y,'-'),legend('exp','bi','location','Northeast')
'data_x' and 'data_y' (each 429x1 double) are column vectors imported from an Excel file.
'e' (429x1 double) is a numeric array also imported from an Excel file.
The problem is that the fit produces the plot shown below, which clearly does not match the data:
The fit should look like the plot below:
coeffvalues() returns the coefficient values below, but they should be 129406, 5.98480, 0.01220 for the fit to look like the plot right above.
bi_coeff =
1.0e+05 *
1.2018 0.0001 0.0000
So I was thinking that, to get a better fit, I might need to extend the significant digits or bound the x-axis values.
Any advice on improving the fit? Thanks a lot in advance!
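On the significant-digits point: the "1.0e+05 *" prefix is only how MATLAB's default short format prints a vector whose entries differ widely in magnitude; the stored coefficients keep full double precision. A minimal sketch, using made-up stand-in values c rather than the actual fit output:

```matlab
% Hypothetical stand-in for bi_coeff (not the actual fit output).
c = [120180 5.98 0.0122];

% Print each value with up to 6 significant digits, independent of the
% command-window display format.
txt = strtrim(sprintf('%.6g  ', c));
disp(txt)

% Alternatively, switch the display format for the whole session.
format long g
disp(c)
```

So the display by itself is unlikely to be what limits the fit.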
  1 Comment
Wonkyung Choi on 10 Feb 2021
I've manually removed the data points that are less relevant to the fit from the Excel file.
The resulting plot looks like the one above. However, the fit is still not accurate enough, and I have to discard data points manually for each data set. I am looking for a more automatic and efficient way to fit. Any help would be deeply appreciated!


Accepted Answer

Walter Roberson on 10 Feb 2021
Edited: Walter Roberson on 10 Feb 2021
My tests indicate that over that range of constraints, unless data is discarded, your best fit will always have A = 120000 (the lower bound of A).
The differences in fit quality near that boundary are comparatively small, though. Basically, all the minima I am finding are straight lines with slightly different offsets and angles.
Correction: I got a 0.05% improvement (2 parts in 10000) at 127570, 6.08408, 0.0780185. It was still a straight line.
  2 Comments
Walter Roberson on 10 Feb 2021
format long g
data = readmatrix('t=70.xlsx');
data_x = data(:,1);
data_y = data(:,2);
e = readmatrix('e t=70.xlsx');
x = 3.0169; k = 8.61773*10^-5; T = 161;
[sx, sxidx] = sort(data_x);
sy = data_y(sxidx);
% Signature for reference: bif = bifit(A,bix,a,x,e,k,T)
obj = @(Abixa) bifit(Abixa(1), Abixa(2), Abixa(3), x, sx, k, T);
fo.StartPoint = [129400 5.98470 0.01219];
fo.Lower = [120000 5.90000 0.00800];
fo.Upper = [300000 6.10000 0.13000];
residue = @(Abixa) sum((obj(Abixa) - sy).^2);
options = optimoptions('fmincon');
options.Display = 'iter';
Samp = 100;
StartPoint = [fo.StartPoint; rand(Samp,3).*(fo.Upper - fo.Lower) + fo.Lower];
fprintf('Seeding with %d locations, please wait.\n', size(StartPoint,1));
trial_residues = arrayfun(@(row) residue(StartPoint(row,:)), (1:size(StartPoint,1)).');
[bestresidue, bestidx] = min(trial_residues);
bestStartPoint = StartPoint(bestidx,:);
fprintf('random selection had residue %g at point [%g,%g,%g]\n', bestresidue, bestStartPoint);
fprintf('Starting fmincon from best seed, please wait\n');
[best_Abixa, fval] = fmincon(residue, bestStartPoint, [], [], [], [], fo.Lower, fo.Upper, [], options);
fprintf('fmincon from there had residue %g at point [%g,%g,%g]\n', fval, best_Abixa);
ypred = obj(best_Abixa);
plot(sx, sy, 'k', sx, ypred, 'b');
legend('exp','bi','location','Northeast')
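The script above seeds the optimizer with many random points inside the bounds and then refines the best one. The same pattern in a minimal, toolbox-free sketch, assuming a made-up two-parameter objective (toy_residue is hypothetical) and fminsearch in place of fmincon, so the bounds are used only for seeding and are not enforced during refinement:

```matlab
% Hypothetical bounds and objective, for illustration only.
lower = [0 0];
upper = [4 4];
toy_residue = @(p) (p(1) - 1).^2 + (p(2) - 3).^2;

% Draw random seeds uniformly inside the box [lower, upper].
n_seeds = 50;
seeds = rand(n_seeds, 2) .* (upper - lower) + lower;

% Evaluate the objective at every seed and keep the best one.
seed_vals = arrayfun(@(r) toy_residue(seeds(r,:)), (1:n_seeds).');
[~, best_idx] = min(seed_vals);

% Refine from the best seed (unconstrained, unlike fmincon).
[p_best, f_best] = fminsearch(toy_residue, seeds(best_idx,:));
fprintf('best point [%g, %g], residue %g\n', p_best, f_best);
```

Seeding like this reduces the chance of starting in a poor basin, but it cannot help if the true minimum sits on a bound, which is what the tests above suggest for A.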
Wonkyung Choi on 15 Feb 2021
Oh, I see. Thank you so much for examining the fit bounds.
I should do the fitting by hand in this case, then XP. Again, I really appreciate your help!


More Answers (1)

Matt J on 10 Feb 2021
Discard the data corresponding to data_x>2.98. Clearly, you do not want that to participate in the fit.
  2 Comments
Wonkyung Choi on 10 Feb 2021
Edited: Wonkyung Choi on 10 Feb 2021
I know it's kind of a silly question, but what function should I use if I want to discard part of the data? I ask because I also want to plot the data that I do not need to fit, as shown above. Or are you suggesting that I discard the data in the Excel file in the first place (and not within the code)? I have lots of data to fit, so discarding it in Excel is too cumbersome (plot the data, read off the line where the data should be cut, and then run the code, for every single data set).
Matt J on 10 Feb 2021
Discard the data from the fit only:
keep=data_x<2.98;
bi = fit(data_x(keep),data_y(keep),ft,fo,'problem',{3.0169,8.61773*10^-5,161});
plot(bi,data_x,data_y,'-'),legend('exp','bi','location','Northeast')
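To see the effect of the mask in isolation, here is a minimal sketch with synthetic data and a straight-line model fitted by polyfit, standing in for bifit and fit (both the data and the model are assumptions for illustration, not the poster's). The mask is applied only to the inputs of the fit, while the full vectors stay available for plotting.

```matlab
% Synthetic data (hypothetical): a line below x = 2.98, unrelated values above.
data_x = linspace(2.90, 3.05, 31).';
keep = data_x < 2.98;                 % logical mask, as in the code above
data_y = 2 * data_x + 1;
data_y(~keep) = 0;                    % region that should not influence the fit

% Fit once with every point and once with the kept points only.
p_all  = polyfit(data_x, data_y, 1);
p_keep = polyfit(data_x(keep), data_y(keep), 1);

% Evaluate the masked fit over the full x range, so it can be drawn
% on top of all the data, including the excluded points.
y_keep = polyval(p_keep, data_x);

fprintf('slope with all points:  %g\n', p_all(1));
fprintf('slope with kept points: %g\n', p_keep(1));
```

Since keep is computed from data_x, the cutoff can be changed per data set in one place without touching the Excel file.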

