A long time ago, I wrote a tool that would take an existing spline and, rather than simply evaluating the first or last segment of the spline for extrapolation, would append a new segment that had all of the desired properties: monotonicity, concavity, endpoint slope or value constraints, etc. Of course, the new segments were fully consistent with the old end points of the spline and with the shape of the curve at those points. If necessary to meet the specified constraints, I added several segments. The interface for this code was similar to that of SLM, with many possible property/value pairs to describe any possible shape.
I'll claim that this tool was fully in keeping with the SLM philosophy, in that it encouraged the user to explicitly specify information about the shape of the extrapolated curve. While I'd like to provide such a tool, more important in my opinion is to write a GUI wrapper for SLM.
To a large extent you can do that form of extrapolation already, when you first build the curve. Simply specify knots that go out as far as you need the curve to go. The knots need not stop exactly at the end points of the data. (That is the default for SLM, but you can choose your own knots.) This lets you directly apply any pertinent shape information to that extrapolated region.
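For example, here is a small sketch of that approach with made-up data (the exponential model, the noise level, and the chosen constraints are purely illustrative):

x   = rand(100,1);                        % data live on [0,1]
y   = exp(-x) + 0.05*randn(size(x));
slm = slmengine(x, y, ...
    'knots', linspace(0, 2, 12), ...      % knots reach out to x = 2
    'decreasing', 'on', ...               % shape info that should also hold beyond the data
    'minvalue', 0, ...
    'plot', 'off');
yq  = slmeval(linspace(0, 2, 200)', slm); % x = 2 is now inside the support

Since the spline's support now extends to x = 2, slmeval is happy to evaluate there, and the shape constraints control what the curve does where there is no data.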
Thanks for your rapid answer. Your reference to Mark Twain gave me new insight into his mathematical abilities :-). You are right, it is a bit of a philosophical question, and of course extrapolation may produce unexpected results. But my background is numerically/physically motivated: I am interested in the deconvolution of a given time series, where the frequency response (the susceptibility in Fourier space) is known. It is well known that extending the data set (padding) is mandatory to avoid boundary effects like ringing; for the image deconvolution (deblurring) case, see e.g. R. Liu, "REDUCING BOUNDARY ARTIFACTS IN IMAGE DECONVOLUTION" + google.

The suggestion in the book Numerical Recipes, Chapter 13.1.1, is zero padding. This is fine if the data set starts and ends with zeros, but it fails in all other cases: zero padding then leads to strong ringing at the beginning and end of the deconvolved time series, independently of the padding length, because of the discontinuity introduced into the data set before deconvolution. A better idea is padding with constants, which avoids the discontinuity in the value. Better still is an extrapolation into the (unphysical) padded region with no jump in the first and second derivatives. If I perform the deconvolution with such an extrapolation, the ringing artifacts disappear. Of course, after deconvolution, only the time span without the padding regions at the front and end of the data set has a physical interpretation. I hope this explains my physical/numerical motivation for extrapolation with the first and last spline segments.
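A rough sketch of the workflow I have in mind, with toy data (the signal, the padding length, and the well-conditioned [0.2 0.6 0.2] blur kernel are made-up stand-ins for the real measurement and frequency response; slmengine/slmeval from the SLM toolbox are assumed on the path):

n  = 256;  t = (0:n-1)';
y  = sin(2*pi*t/64) + 0.5;            % toy series that does NOT end at zero
np = 64;                              % padding length on each side

% fit over an extended knot range, so the spline itself supplies a
% smooth continuation with no jump in value or low-order derivatives
slm = slmengine(t, y, 'knots', linspace(-np, n-1+np, 15), 'plot', 'off');
te  = (-np : n-1+np)';
ye  = slmeval(te, slm);               % extended, discontinuity-free series

% simulate a blurred measurement, then deconvolve in Fourier space
m  = numel(ye);
kp = zeros(m,1);  kp([end 1 2]) = [0.2 0.6 0.2];  % kernel centered at index 1
yb = real(ifft(fft(ye) .* fft(kp)));  % blurred series
yd = real(ifft(fft(yb) ./ fft(kp)));  % naive inverse filter
yphys = yd(np+1 : np+n);              % keep only the physical time span

With the smooth continuation in place, the division in Fourier space no longer sees a jump at the boundaries, and only yphys is kept afterwards.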
BTW: i) I'm a German and my surname is John :-).
ii) I use slmengine mostly to obtain the numerical derivative of noisy data. From my point of view, slmengine gives much better control over the necessary smoothing of the noisy data set, e.g. via concaveup or integral constraints. It is not possible to implement such (physically motivated) features in more sophisticated algorithms like higher-order methods or Savitzky-Golay filters.
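A minimal sketch of that use, with toy data (the quadratic model here is made up, so the concave-up prescription is known to be correct for it):

x   = linspace(0, 3, 200)';
y   = x.^2 + 0.1*randn(size(x));             % noisy data, truly concave up
slm = slmengine(x, y, 'knots', 10, 'concaveup', 'on', 'plot', 'off');
dy  = slmeval(x, slm, 1);                    % first derivative of the smooth fit
plot(x, dy, '-', x, 2*x, '--')               % compare against the true slope 2x

The third argument of slmeval selects the derivative order, so the differentiation happens on the constrained smooth fit rather than on the noise.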
You might call it an inconsistency. I choose to call it a strongly held difference of opinion. Really, it all comes down to my philosophy about extrapolation as opposed to that embodied in PPVAL.
I don't let you extrapolate a spline outside of its support using SLMEVAL. Extrapolation does foolish things, just when you least want it to happen. Perhaps my favorite mathematical quote (that hardly anybody else ever seems to know about) is by Mark Twain, from Life on the Mississippi.
“In the space of one hundred and seventy six years the Lower Mississippi has shortened itself two hundred and forty-two miles. That is an average of a trifle over a mile and a third per year. Therefore, any calm person, who is not blind or idiotic, can see that in the Old Oölitic Silurian Period, just a million years ago next November, the Lower Mississippi was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-pole. And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo [Illinois] and New Orleans will have joined their streets together and be plodding comfortably along under a single mayor and a mutual board of aldermen. There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”
The point is, extrapolation does nasty things, so SLMEVAL does NOT allow you to do anything but extrapolate as a constant function beyond the support of the spline. PPVAL, on the other hand, will happily evaluate the terminal polynomial segments as far out as you ask. That is a problem with PPVAL, IMHO.
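To see the difference concretely, here is a toy comparison, with MATLAB's own pp-form spline standing in on the PPVAL side:

x   = linspace(0, 1, 20)';
y   = sin(2*pi*x);
pp  = spline(x, y);                  % interpolating cubic, pp form
slm = slmengine(x, y, 'knots', 6, 'plot', 'off');
xq  = linspace(-0.5, 1.5, 300)';
plot(xq, ppval(pp, xq), '--', xq, slmeval(xq, slm), '-')
% ppval blindly evaluates the end cubics far outside [0,1], while
% slmeval holds the terminal function values constant out there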
So what should you do if you truly do NEED to extrapolate? You should have done a better job fitting the spline! Use the capabilities of the SLM tool to fit a spline that goes out as far as you want it to go. Now you have total control over the shape, and you should be monitoring the results to make sure that you get something intelligent, instead of something virtually random as you would get from PPVAL.
To put it another way, if you don't know what the curve should be doing out there beyond the data, or will not choose to deal with controlling the extrapolated shape, then you should not be extrapolating your curve out into those nether regions. Again, this is my opinion, but it seems a very logical one, and it is one that is fully consistent with the philosophies of SLM.
Dear John, excellent tool; I use it extensively in my daily work with noisy data. I will give a reference in my next paper.
Maybe I found a tiny inconsistency regarding the extrapolation of a data set. Have a look at this sample code:
% aim: find a good extrapolation of noisy data, green line in final plot
clear all; close all;
x = linspace(0, 1, 100)';  y = sin(2*pi*x) + 0.1*randn(size(x));  % toy noisy data
xinterp = linspace(-0.25, 1.25, 300)';   % query points beyond the data support
% matlab interp1 is not designed for noisy data, fails
% first try with slm: beyond the support, only the first and last values are padded
slm = slmengine(x, y, 'knots', 8, 'plot', 'off');
yslmeval = slmeval(xinterp, slm);
figure(1); hold on; plot(xinterp, yslmeval, '-r'); ylim([-3 3]);
% second try, this works: evaluate the fitted polynomial in the extrapolated regions
% (assumes this SLM release can hand back a pp form via the 'result' property)
pp = slmengine(x, y, 'knots', 8, 'result', 'pp', 'plot', 'off');
yppeval = ppval(pp, xinterp);
figure(1); hold on; plot(xinterp, yppeval, '-g'); ylim([-3 3]);
Thank you, John. Following your suggestion, I have fitted my data again, but unfortunately the result is worse than fitting the two parts of the curve separately. I made a simple simulation and still got a similar result. Would you please take a few minutes to have a look at my simulation data? I will send it to you by email. Thank you.
I found the code written by Meyer on this web site, "http://www.stat.colostate.edu/~meyer/srrs.htm", but it is written in R, not MATLAB. I think I should learn R one of these days. Meyer also did not show how to fit data that is half convex and half concave, so it seems I still have a long way to go. Would you please give me some suggestions? Thank you.
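In case it is useful, here is my current guess at how such a fit might look in slmengine; the interval form of the curvature properties is an assumption on my part that should be checked against the SLM documentation:

x   = linspace(0, 1, 100)';
y   = (x - 0.5).^3 + 0.01*randn(size(x));  % toy curve: concave down, then concave up
slm = slmengine(x, y, 'knots', 10, ...
    'concavedown', [0 0.5], ...            % assumed syntax: concave on the left half
    'concaveup',   [0.5 1], ...            % assumed syntax: convex on the right half
    'plot', 'off');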
I am looking forward to your reply. Thank you again.