How to remove February 29th for leap years in a daily time series over 43 years?
I have an array of values (x) representing daily data for 43 years (1979-2021), where numel(x) = 15706.
I rearranged the data to have a matrix with years as rows and days as columns.
% trim x to a whole multiple of 365 days (drops the last 11 days)
x = x(1:365*43).';
% reshape x to have years as rows and days as columns
x_matrix = reshape(x,365,[]).'
However, this is problematic for leap years, since the columns of x_matrix are no longer aligned with a given calendar day. Indeed, x_matrix(year n, 1) is not always January 1: each leap day that has passed moves the start of the row one day earlier, into late December of the previous year.
My data cover the years 1979-2021, of which 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008, 2012, 2016 and 2020 are leap years.
How can I remove the last day of February for each leap year to have x_matrix align for each year with a given day where x_matrix(year n, 1) is always January 1?
Is there a useful function for that somewhere?
Thank you
5 Comments
You could, but why create an artificial structure that doesn't represent reality and throws away perfectly good data?
Create/use a timetable instead; you can then process/access the data however you choose with grouping variables, one of which can easily be the month, day, etc., and having to access a day by indexing into a column becomes a thing of the past. There's also a way to reference the day of the week instead of the numeric day of the month, if that's another way you want to look at the data.
IOW, keep all the info, just slice 'n dice as needed from the data values themselves, not by an artificial rearrangement.
You may want to look into "splitapply workflow" and grouping variables and in particular with the timetable class the retime and rowfun functions. They're amazingly powerful at such data to answer all kinds of questions in very compact code. Virtually all the addressing issues just melt away...
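For illustration, here is a minimal sketch of that timetable approach. The random x is only a placeholder for the real series, and the names tt, yearlyMean and monthlyClimatology are made up for this example; the point is just that the dates travel with the data, so no leap-day surgery is needed.

```matlab
% Placeholder daily series for 1979-2021; substitute the real data for x.
t = (datetime(1979,1,1):caldays(1):datetime(2021,12,31)).';
x = randn(numel(t), 1);

% Store the values together with their dates; nothing is discarded or reshaped.
tt = timetable(x, 'RowTimes', t, 'VariableNames', {'x'});

% Annual means: one row per calendar year, leap years included as they are.
yearlyMean = retime(tt, 'yearly', 'mean');

% Mean for each calendar month across all years, using a grouping variable.
[g, monthNumber] = findgroups(month(tt.Properties.RowTimes));
monthlyClimatology = splitapply(@mean, tt.x, g);
```

Swapping month for day(..., 'dayofyear') or weekday in the findgroups call would group by those instead.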
As for the Q? actually asked, although it's not the approach I would recommend, my utility function isleapyr may be of value (although I hope you'll not use it as you've outlined above).
function is=isleapyr(yr)
% returns T for input year being a leapyear
% dpozarth -- from dark ages; modified to handle new datetime class
if isdatetime(yr), yr=year(yr); end
is=(eomday(yr,2)==29);
end
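As a quick sanity check, the same eomday test can be applied to the whole span of years at once. This sketch inlines the test rather than calling isleapyr so that it runs on its own; the variable names are illustrative only.

```matlab
yrs = 1979:2021;

% A year is a leap year when February ends on the 29th (same test as isleapyr).
isLeap = eomday(yrs, 2) == 29;
leapYrs = yrs(isLeap);

% Total number of daily samples expected for the span.
nDays = 365*numel(yrs) + nnz(isLeap);
```

For 1979-2021 this should pick out the eleven leap years listed in the question and give a total that matches the 15706 samples.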
A LL
on 1 Jun 2022
dpb
on 1 Jun 2022
OK, that's one reason I hadn't thought of for a rectangular array...if you don't want to regularize with missing data for the non-leap years, then simply
t=datetime(1979,1,1):days(1):datetime(2021,12,31);
isNotLeapDay=~(month(t)==2 & day(t)==29);
data=data(isNotLeapDay);
That's assuming a vector of the data before trying to reshape into an array.
I think I'd still keep the full data set and just extract as above for the special purpose...
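Putting the mask together with the reshape from the question, a sketch might look like the following. The placeholder x stands in for the real data, and reshaping the dates alongside the values (t_matrix) is only there to confirm the alignment.

```matlab
t = datetime(1979,1,1):days(1):datetime(2021,12,31);
x = 1:numel(t);                        % placeholder for the real daily data

% Drop every Feb 29 from both the values and the dates.
keep = ~(month(t)==2 & day(t)==29);
x365 = x(keep);
t365 = t(keep);

% Years as rows, days as columns; every row now has exactly 365 entries.
x_matrix = reshape(x365, 365, []).';
t_matrix = reshape(t365, 365, []).';

% Column 1 should be January 1 in every row.
firstDayOK = all(month(t_matrix(:,1))==1 & day(t_matrix(:,1))==1);
```

Note that x itself is left untouched here, so the full data set stays available for anything that needs the leap days.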
A LL
on 2 Jun 2022
Accepted Answer
More Answers (1)
James Tursa
on 1 Jun 2022
Edited: James Tursa
on 1 Jun 2022
To get rid of the leap days, you can use evenly spaced indexing, since the number of days between leap days is constant for your time span (2000 is a leap year, so no century exception interrupts the four-year cycle between 1979 and 2021). Since index 1 corresponds to Jan 1, 1979, the first leap day is Feb 29, 1980, which is index 365+31+29 = 425. Each subsequent leap day is 4*365+1 = 1461 days later. So the code to get rid of the leap days for your data is (do this prior to reshaping):
x(425:1461:end) = [];
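If you want to convince yourself that the stride lands on the right days, a small check against a datetime vector (built here purely for verification, and assuming the data start on Jan 1, 1979 with no gaps) could look like this:

```matlab
t = datetime(1979,1,1):days(1):datetime(2021,12,31);

% Indices removed by the evenly spaced indexing.
idx = 425:1461:numel(t);

% Indices of the actual leap days according to the calendar.
leapIdx = find(month(t)==2 & day(t)==29);

sameDays = isequal(idx, leapIdx);
```

The spacing trick relies on the span containing no skipped leap year, so it would need revisiting for data that cross 1900 or 2100.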