How could I sharpen the parameters of my model?

I want to fit my data to a piecewise model, which has four parameters:
function y = mmpkf2(x,T0,K1,K2,X0)
    if x < X0
        y = T0 + K1.*(x./(1-x));
    else
        y = T0 + K1.*(X0./(1-X0)) + K2.*(x-X0);
    end
and I have tried to model the data with this script
clc;
clear all;
close all;
M = csvread('Matka-aikadata.txt');
xdata = M(:,1);
ydata = M(:,2);
ft = fittype('mmpkf2(x,T0,K1,K2,X0)','coefficients',{'T0', 'K1', 'K2', 'X0'} );
f = fit( xdata, ydata, ft, 'StartPoint', [42, 21, 60, 0.5], 'Lower', [-Inf,-Inf,-Inf,-Inf], 'Upper', [Inf,Inf,Inf,Inf], 'MaxIter', 500000);
plot( f, xdata, ydata );
with poor success (image attached to this message). How should I refine my script to get a better fit?
Thanks,
-Anton
  4 Comments
John D'Errico on 14 Sep 2018
Edited: John D'Errico on 14 Sep 2018
So it is indeed homework. And I won't do your homework.
I might question why your teacher wants to fit that model to this messy data. So let's look at the model. THINK ABOUT WHAT YOU HAVE BEEN GIVEN! Look at it.
Below X0 (one of the unknowns, the breakpoint in the model), your curve is an arc of a hyperbola.
Above that point, it is a straight line.
Really? If I were to look at that plot, I see a function that does pretty much nothing on the bottom half, and it starts to take off at the top end.
Were I to guess, I might consider whether the inequality was specified in the wrong direction, so that above X0 the curve should be an arc of a hyperbola.
In fact, if you look at your data, a hyperbolic arc would seem to fit the entire curve pretty well.
fun = @(x,T0,K1) T0 + K1.*(x./(1-x));
ezplot(@(x) fun(x,1,2))
grid on
Note that the left branch of the hyperbola looks much like your curve. The singularity is at x==1 for that hyperbolic arc, so I'd just use a model that allows the point of singularity to shift. That might be:
y = T0 + K1.*(x./(Xs-x));
so with parameters T0, K1, and Xs.
Solving for a breakpoint on such noisy data will be almost impossible to do well. But hey, what do I know?
% (x and y here are the xdata and ydata columns from your file)
ft = fittype('T0+K1.*(x./(Xs-x))','indep','x','coeff',{'T0','K1','Xs'})
mdl = fit(x,y,ft,'Start',[100,10,2])
mdl =
General model:
mdl(x) = T0+K1.*(x./(Xs-x))
Coefficients (with 95% confidence bounds):
T0 = 98.63 (98.26, 99)
K1 = 10.96 (10.16, 11.76)
Xs = 1.24 (1.22, 1.261)
plot(x,y,'y.')
hold on
grid on
plot(mdl,'b')
Essentially, you can get a very nice fit merely by throwing away that useless linear term and scrapping the piecewise fit completely. That upper linear part is a complete waste of time.
But hey, if your teacher insists on using that exact model, well teachers do silly things all the time. It is homework after all.
Personally, I think it more important that you understand what the parameters of a model mean in context. I think an important point is understanding that breakpoints between models can be difficult to fit, and almost impossible when the separate terms are actually inconsistent with the data you have. But again, hey what do I know? ;-) (Yes, I know this response is not terribly useful in showing you how to solve your homework problem. But arguably, the model you show is literally impossible to fit in any meaningful way to that data.)
Sigh. I suppose there is some tiny amount of lack of fit at the lower end. That suggests any linear segment would be best used for x<X0, NOT above that point.
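If you did want to try that flipped form, here is a minimal sketch, written with min/max so that fittype can evaluate it on the whole data vector at once. The start values are only rough guesses, and x/y stand for your xdata/ydata columns:
% Flipped version of the homework model: linear below the break X0, a
% hyperbolic arc (with a shifted singularity Xs) above it. Guessed starts.
ftf = fittype('T0 + K1*min(x,X0) + K2*(max(x,X0)./(Xs-max(x,X0)))', ...
    'indep','x','coeff',{'T0','K1','K2','X0','Xs'});
mdlf = fit(x, y, ftf, 'Start', [100, 10, 10, 0.5, 1.2]);
plot(mdlf, x, y)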
Anton Ranta on 15 Sep 2018
Thanks for your comments.
After seeing the seemingly continuous data, I was also confused as to why the model is given in piecewise form.
I double-checked the direction of the inequality operators, and they really are given as I wrote them (hyperbola -> straight line), which is very confusing.
I'll give some feedback to my teacher about this, but again, thanks for your comments. They helped a lot.


Answers (1)

John D'Errico on 15 Sep 2018
Edited: John D'Errico on 15 Sep 2018
(I suppose, as long as I am going to spend this much time on a question, I might as well put it in an answer.)
Let me expand a bit more, because this touches on the question of model building, choice of models, empirical models, etc.
When you build a model for some data, there are several ways you might choose that model. In fact, I'll argue there is an entire spectrum of model choices that I see used. I break that spectrum into a set of related families.
1. You have knowledge of the system that underlies the process whereby your data was created. For example, the growth of bacteria in a petri dish (or of humans on planet earth, for that matter). A very good model for such a process is found in differential equations, the solution to which is an exponential model. As long as the system does not undergo a famine- or disease-related crash, or anything else that changes the birth/death rates, that exponential model will do handsomely well. These physical models are the substance of many models built. Another example might be the orbit of the moon around the earth, where the path takes the form of an ellipse. So we might presume to estimate the shape of that ellipse, which will probably be pretty decent as a model, subject to various other factors, such as the presence of the sun, Jupiter, etc. Once you add other bodies to the two-body problem, the path will no longer be quite so simply predicted. For that matter, we might need to think about how relativity affects the traditional Keplerian model. A model is just a model, of course.
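Just to make that first class concrete, here is a minimal sketch with made-up data, using the Curve Fitting Toolbox library model 'exp1' (which is a*exp(b*x)):
% Made-up growth data: an exponential trend plus noise.
t = (0:0.5:10)';
n = 50*exp(0.3*t) + 20*randn(size(t));
mdlexp = fit(t, n, 'exp1');   % library model a*exp(b*x)
plot(mdlexp, t, n)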
2. A related class of models are what I call metaphorical models. Here, the system under study is a process that logically acts like another process for which we have a good physical model. One that I really like, learned many years ago, is that of product sales encouraged by word of mouth. It turns out that such sales can be modeled by the same models used to fit the spread of an infectious disease. Think of the flu. You get it by being near someone else who has the flu. They cough, sneeze, whatever, spreading germs through the air. Or you touch something they just touched, and somehow the bug gets inside of you. You get the flu, and spread it to several other people, just as the source did. If you think about it, such disease propagation models can work very nicely to predict product sales growth as well. This is a very common class of model that people use. Of course, they do not always recognize that this is what they are doing. Some of those metaphorical models are not easily recognized as such. Splines are a good example here.
3. A cubic spline can be derived as the shape of a thin flexible beam, constrained to pass through a set of points. A little work in physics/mechanics will show that the shape of that beam can be well modeled by a differential equation. The solution to that differential equation is a piecewise cubic polynomial. In fact, the name "natural" cubic spline is related to the natural boundary conditions that are found in the calculus of variations solution where we derived the differential equations for the beam shape.
But people use cubic splines for all sorts of things, because splines fit many classes of data well, NOT because those data sets arise from the shape of a physical beam. So while splines do form a class of metaphorical model, they represent one where the metaphor has been stretched to a limit.
One must be careful of course, since a metaphor is only an approximation. While a nice thing about a metaphor is it allows you to make predictions about the system, those predictions can fail. The example I like to use comes from a guy named Bill. He once wrote these lines...
But soft, what light through yonder window breaks?
It is the east, and Juliet is the sun.
Arise, fair sun, and kill the envious moon,
Who is already sick and pale with grief
That thou, her maid, art far more fair than she.
(Romeo & Juliet, William Shakespeare)
We see in there a metaphor, wherein Juliet is compared to the sun. From that, we can learn of Romeo's feelings toward her, but metaphors can go too far. Is her surface temperature around 5000 degrees C? So, within limits, we might use a model to predict some behavior of a process, but sometimes the metaphor will break down.
A good example is in the use of splines to predict the shape of a film response curve. We used these things all the time at the large photographic company that once employed me. Shine light on film, and it becomes exposed. More light, and more exposure happens. So the film gets darker (or lighter, depending on the specific chemical process involved.)
Now, typically these curves are monotonic things. More photons, and the paper/film monotonically increases its response. The problem is that some of these response curves have a very sharp toe or shoulder, so you see a rapid transition from doing nothing, to a monotonic increase, until eventually the media maxes out its response. Splines, however, are models of a beam, and beams can exhibit oscillatory behavior when you try to bend them sharply. Think of the Gibbs phenomenon.
x = -10:10;
y = double(x>=0);
xhat = -10:.01:10;
yhat = interp1(x,y,xhat,'spline');
plot(xhat,yhat,'r-')
hold on
grid on
plot(x,y,'bo')
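For comparison, a shape-preserving interpolant such as pchip carries no beam metaphor with it, so on the same data it does not overshoot:
% Same step data, but interpolated with the shape-preserving pchip method.
yhat2 = interp1(x,y,xhat,'pchip');
plot(xhat,yhat2,'g-')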
The point is that not all metaphors work perfectly, and sometimes they can introduce metaphorical baggage into their approximation of the system under study.
4. Next, we see models chosen PURELY on the basis of their fundamental shape. A classic example is the Gaussian curve.
ezplot(@(x) exp(-x.^2))
It seems every time someone sees a process with a bump in it, they decide to fit it with a Gaussian model. (Ok, the central limit theorem does imply these Gaussian curves in all sorts of unexpected places, so there may be some metaphorical logic in that thinking after all.)
But much of the time I find people wanting to fit a model just because they know a curve has vaguely the same shape as what they see, and for no better reason.
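Part of the appeal is that such a shape-only fit is trivially easy to produce; 'gauss1' is the corresponding library model (made-up bump data here):
% Made-up bump data, fit with the library model gauss1: a1*exp(-((x-b1)/c1)^2).
xg = linspace(-3,3,101)';
yg = 2*exp(-(xg-0.5).^2) + 0.1*randn(size(xg));
mdlg = fit(xg, yg, 'gauss1');
plot(mdlg, xg, yg)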
5. Piecewise models. People see that splines can work well, and splines are functions that are actually different curve segments pieced together into a continuous, differentiable whole. So why not apply the same idea to nonlinear segments?
The idea here is that the brain is very good at recognizing shapes, patterns, etc. So we might see that a part of the curve would be well fit by one general curve shape, but that shape does not apply to the entire curve. So we introduce piecewise models.
In some cases, the piecewise nature actually does apply, since we may know that some behavior suddenly shuts off, switching to another fundamental behavior.
A serious problem with piecewise models is the difficulty in estimating the break points. Especially when data is noisy, those breaks can be virtually impossible to locate accurately.
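A quick synthetic demonstration of that last point, with made-up piecewise-linear data and a break at 0.5:
% Made-up piecewise-linear data with a break at x = 0.5, plus noise.
xb = linspace(0,1,200)';
yb = 1 + 2*min(xb,0.5) + 4*max(xb-0.5,0) + randn(size(xb));
ftb = fittype('a + b*min(x,x0) + c*max(x-x0,0)','indep','x','coeff',{'a','b','c','x0'});
mdlb = fit(xb, yb, ftb, 'Start', [1, 1, 3, 0.4]);
confint(mdlb)   % watch how wide the bounds on x0 get as the noise level grows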
6. Finally, there are polynomial models, as well as a variety of series-based models. We learn about these animals early in our careers, usually from things like Taylor series. In school we learn that pretty much everything can be approximated by a truncated Taylor series, or so it seems. As well, there are Fourier series, Bessel function series, series based on a variety of orthogonal polynomials, etc. Each of these families of functions is often appropriate to specific problems. And so we learn to fit such model families to data. In fact, a linear model is often a very good curve shape to use: it is well behaved in extrapolation, monotonic, etc.
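Fitting one of those series models is as routine as it gets; for the polynomial case, polyfit and polyval do it directly (made-up data again):
% Made-up data, fit with a cubic polynomial.
xp = linspace(0,2,50)';
yp = sin(xp) + 0.05*randn(size(xp));
p = polyfit(xp, yp, 3);            % cubic coefficients, highest power first
plot(xp, yp, 'o', xp, polyval(p, xp), '-')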
Each class of model as I have described it above is appropriate SOME of the time, sometimes more so than others. In your case, you have been given instructions to use a piecewise model to fit data. But your data is noisy as hell, so much so that there is little capability to locate the break location. As well, the linear segment in that model is hardly justified by the data, because, as I showed, you can fit the entire data set quite well using only a single hyperbolic arc.
The above represents what is essentially the introduction to classes I have taught in the art of curve fitting, approximation and modeling.
What advice would I offer to you in this case, were I consulting on the problem? I would suggest that when someone poses a model with parameters that have no real meaning to them, and break points they cannot estimate anyway, they should often be using a different class of model.
Here for example, the piecewise model seems to make no real sense, and the magnitude of the noise suggests that the parameters would have little value even if you did fit that curve.
So let me start with the model I first posed for your curve.
I see some lack of fit near the bottom end, and when I plot the residual errors, I see they are a little biased above zero for small x.
plot(x,y - mdl(x),'.')
grid on
That suggests a slightly different model might be more appropriate.
ft1 = fittype('T0 + K1*x + K2.*(x./(Xs-x))','indep','x','coeff',{'T0','K1','K2','Xs'});
mdl1 = fit(x,y,ft1,'Start',[100,10, 10, 2])
mdl1 =
General model:
mdl1(x) = T0 + K1*x + K2.*(x./(Xs-x))
Coefficients (with 95% confidence bounds):
T0 = 100.7 (100, 101.3)
K1 = -25.69 (-35.38, -16)
K2 = 35.47 (22.4, 48.53)
Xs = 1.526 (1.398, 1.655)
plot(x,y,'b.')
hold on
h = plot(mdl1);
h.LineWidth = 3;
So, by adding in a linear term, the lack of fit has gone down significantly. I'd probably just stop there, unless I had some reason to go further.
Anyway, by this point, usually I would just tell people to use a least squares spline tool, like my SLM toolbox, as found on the file exchange. In fact, it yields a comparably good fit. Rather than waste network bandwidth on another plot of that fit, I'll just plot the first derivative of the estimated spline.
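If you do not have SLM, a rough stand-in for that derivative plot is a smoothing spline from csaps, differentiated with fnder. This is not the same tool, and the smoothing parameter here is just a guess:
% Smoothing spline (assumes distinct x values), then its first derivative.
pp = csaps(x, y, 0.99);   % 0.99 is a guessed smoothing parameter
dpp = fnder(pp);
fnplt(dpp), grid on
xlabel('x'), ylabel('estimated dy/dx')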
Hmm. This seems to suggest a good fit arises from a curve that is indeed linear above x=0.85 or so. In turn, that suggests that your data may well have been artificially generated by your teacher, as a piecewise function, with a break near 0.85. Sigh.
Can we return to your model? At least I have decent estimates of the parameters now to start, based on our previous fit, and based on the spline derivative plot.
The break is close to x=0.85. The slope above that point is roughly 105.
ft2 = fittype('T0 + K1.*min(x,X0)./(1-min(x,X0)) + K2*max(x-X0,0)' ...
,'indep','x','coeff',{'T0','K1','K2','X0'});
mdl2 = fit(x,y,ft2,'Start',[100, 10, 105, 0.85])
mdl2 =
General model:
mdl2(x) = T0 + K1.*min(x,X0)./(1-min(x,X0)) + K2*max(x-X0,0)
Coefficients (with 95% confidence bounds):
T0 = 100 (99.71, 100.4)
K1 = 4.921 (4.672, 5.17)
K2 = 102.8 (96.13, 109.5)
X0 = 0.7813 (-38.55, 40.12)
Ok, so how well did we do? The fit is reasonable in either case. But look at the confidence interval it finds around the break point: 0.7813, with bounds of -38.55 and 40.12. Effectively, that says the parameter is so poorly estimated that it is virtually useless.
std(y - mdl1(x))
ans =
10.05
std(y - mdl2(x))
ans =
10.033
In fact, I'd bet that your teacher created this data by adding normally distributed random noise with a standard deviation of 10 to values from a curve.
But there is effectively no difference in the quality of fit, in terms of the residual error. Chasing tiny differences like this is a good recipe for insanity. :)
Enough for now.
