SOLVED: Summing values together until certain value is reached - then repeat - and plot

Let's get right into it:
I have yearly data for water level increase in my water tank. I need to find out that on which days the water level reaches tank capacity (lets say 500 liters), then plotting these days on the same graph as the 'water per day'. When the maximum capacity is reached (= tank is emptied), the "countdown" should then start again, and so on.
Experimenting with cumsum has gotten me nowhere and I have not managed to create a sufficient loop to do this.
% Time of year % Water per day [liters]
'01-Jan-2018 00:00:00' 50.2106332550196
'02-Jan-2018 00:00:00' 69.1231162561486
'03-Jan-2018 00:00:00' 39.2133461192996
'04-Jan-2018 00:00:00' 35.1911633369915
'05-Jan-2018 00:00:00' 70.7909634228800
'06-Jan-2018 00:00:00' 45.2770948623210
'07-Jan-2018 00:00:00' 103.742125124495
'08-Jan-2018 00:00:00' 104.090498552586
'09-Jan-2018 00:00:00' 49.9451044989908
'10-Jan-2018 00:00:00' 9.75987421081660
'11-Jan-2018 00:00:00' 8.65371484802794
'12-Jan-2018 00:00:00' 2.47031260215833
'13-Jan-2018 00:00:00' 28.3093648636441
'14-Jan-2018 00:00:00' 113.381792987146
I've tried scrolling through the forums and stackexchange but no luck so far. I found the following answer to a similar problem, but the thread was 6 years old and I did not get it to work in MATLAB 2019b (with the $ and everything).
df$difference_sum <- c(0, head(cumsum(df$difference), -1))
# get length of 0's (first keep value gives the actual length)
len <- sum(df$difference_sum %/% 500 == 0)
df$keep <- (seq_len(nrow(df))-1) %/% len
df <- transform(df, difference_sum = ave(difference, keep,
FUN=function(x) c(0, head(cumsum(x), -1))))
I also tried to following loop, to no avail.
A= H2O_perday;
sum1=0;
k=1;
s = [];
while sum1 < 500
s =sum1;
sum1=sum1+A(k);
k=k+1;
end
disp(s);
disp(k-1);
Does anyone here have experience with a similar issue that could help me or show me some tips on how to proceed in this?

 Accepted Answer

OK, a working script...I set the setpoint to 250 so would have more than one condition satisfied for testing...
tt.Level=zeros(height(tt),1);
C = 250;
tt.Level=cumsum(tt.Fill);
i1 = 1;
i2 = find(tt.Level>=C,1);
while i2 <= height(tt)
i1=i2+1;
%tt.Level(i1:end) = tt.Level(i1:end) - tt.Level(i1);
tt.Level(i1:end)=cumsum(tt.Fill(i1:end));
i2=find(cumsum(tt.Fill(i1:end))>=C,1) + i1-1;
if isempty(i2), break, end
end
produced:
>> tt
tt =
14×2 timetable
Time Fill Level
___________ ______ ______
01-Jan-2018 50.21 50.21
02-Jan-2018 69.12 119.33
03-Jan-2018 39.21 158.55
04-Jan-2018 35.19 193.74
05-Jan-2018 70.79 264.53
06-Jan-2018 45.28 45.28
07-Jan-2018 103.74 149.02
08-Jan-2018 104.09 253.11
09-Jan-2018 49.95 49.95
10-Jan-2018 9.76 59.70
11-Jan-2018 8.65 68.36
12-Jan-2018 2.47 70.83
13-Jan-2018 28.31 99.14
14-Jan-2018 113.38 212.52
>>

1 Comment

Perfect! It works like a charm also with the hourly data I have stored. Thank you for all your help and I wish you a good rest of the week!

Sign in to comment.

More Answers (1)

Not fully coded, but the idea should work to build loop...
t=readtable('tarmo.dat','HeaderLines',1);
t.Var1=strrep(t.Var1,'''',''); % remove extra ' in the date strings
t.Var1=datetime(t.Var1); % convert to date
t=timetable(t.Var3,'rowtimes',t.Var1); % make into time table
t.Properties.VariableNames={'DailyFill (l)'};
C=500; % Capacity level
i1=1;
i2=find(cumsum(t.("DailyFill (l)"))>=C,1); % First location exceed capacity
while i2<=height(t) % iterate
i1=i2+1; % next start
t.Level(i1:end)=t.Level(i1:end)-tt.Level(i1); % reset
t.(Level,i1:end)=cumsum(t.("DailyFill (l)")(i1:end)); % new cumulative flow (fill)
i2=find(cumsum(t.("DailyFill (l)")(i:end)>=C,1)+i1; % First location exceed capacity beginning next
end
WARNING: Air code, only minimally tested for first/second cases sequentially at command line; not in loop. Debugging undoubtedly required!

10 Comments

Thank you for a quick response, but I am having trouble with your code.
How should I act, when I already have an existing 365x3 timetable where the 'X3_daily' is my above mentioned 'Water per day'? X1_daily, X2_daily and X3_daily are all 365x1 doubles.
TT_bat_KG = timetable(timeD',X1_daily,X2_daily,X3_daily) ;
timeD is a 1x365 datetime row, transformed here into a column.
I tried these edits, but the following error message occurs "Dot indexing is not supported for variables of this type."
t=TT_bat_KG.X3_daily;
% t=X3_daily; - same error message
t.Var1=strrep(t.Var1,'''',''); % remove extra ' in the date strings
t.Var1=datetime(t.Var1); % convert to date
...
I also tried to circumvate this by:
writematrix(X3_daily,'tarmo.dat') % writetable does not support 'double' so used writematrix
t = readtable('tarmo.dat','HeadLines',1) % readmatrix causes same 'Dot indexing is not..' error
t=strrep(t,'''','');
But this caused two errors, one being that for some reason t is now 363x1 table, instead of 365x1, and giving a message 'Error using strrep - conversion from table to double is not possible'.
Sorry if these questions sound silly, but could you help me solve these issues?
  1. If you already have the timetable, use it instead of creating another. Use the appropriate variable(s) in it instead of my names.
  2. When I just copied your sample data into a file, readtable returned the date column as a string for which the strrep code works. If you already have a valid datetime, then you don't have the problem of the extra tic mark one gets if copy/paste the data from original post (so don't need to correct for what isn't there).
Okay this is where I've gotten but I have a new error message that I havent been able to solve with google.
C = 500; % Capacity level
i1 = 1;
i2 = find(cumsum(t.("Total[kg/d]"))>=C,1);
while i2 <= height(t)
i1 = i2+1;
t.Level(i1:end) = t.Level(i1:end) - t.Level(i1);
t.Level(i1:end) = cumsum((t.("Total[kg/d]")) - (i1:end));
i2 = find(cumsum(t.("Total[kg/d]"), i1:end) >= C,1) + (i1);
end
The error message states "The end operator must be used within an array index expression."
Now also, I am not sure how to create this t.Level vector into an already existing timetable. If Im right, t.Level should be the vector where the results are saved, no?
Thank you for your help so far!
  1. Oh. I cut and pasted from commandhistory of what had experimented with on command line to build the code snippet. Inadvertently delefed the line that created the Level variable. Before the loop begins, add:
t.Level=zeros(height(t),1);
Variables are automagically appended to a table or timetable on reference to a new variable name.
2. You messed up the syntax in the cumsum() line--you've got a minus operator between the first (row) indexing expression and the second (column). The line I had was:
t.(Level,i1:end)=cumsum(t.("DailyFill (l)")(i1:end)); % new cumulative flow (fill)
NB: there are two sets of parens enclosed by the parens that enclose the argument to cumsum that taken together are the combined indexing expression for the column of fill rate data and the row subscript within that array.
Play around at the command line with the table and look at the examples on addressing table data elements in the documentation for tables. It's sorta' convoluted so you really need to study the doc and the examples and then just play at it until you finally see the pattern.
The following line as you provided gives me the error "Invalid expression. When calling a function or indexing a variable, use parentheses. Otherwise, check for mismatched delimiters."
t.(Level,i1:end) = cumsum(t.("Total[kg/d]")(i1:end));
Now switching it to
t.(Level(i1:end)) = ...
Errors the same "The end operator must be used within an array index expression."
The code I posted above (t.Level(i1:end)) (without the minus operator you mentioned) works to a certain extent. It does fill the values in t.Level with something, but it doesn't seem to be the correct ones. Here's an example:
% Week # 1, cumsum doesnt work
t.("Total[kg/d]") t.Level
'01-Jan-2018 00:00:00' 50.2106332550196 0
'02-Jan-2018 00:00:00' 70.4280059515844 0
'03-Jan-2018 00:00:00' 39.8507080856674 0
'04-Jan-2018 00:00:00' 35.6173469297880 0
'05-Jan-2018 00:00:00' 70.7730355982859 0
'06-Jan-2018 00:00:00' 45.2770948623210 0
Then the cumulative sum starts working on 10th of January, but it exceeds the limit (500) two times, before resetting and starting again. The limit is also exceeded later in the year.
% Week # 2, cumsum starts working
'10-Jan-2018 00:00:00' 10.1218998731437 10.1218998731437
'11-Jan-2018 00:00:00' 8.65899650655459 18.7808963796983
'12-Jan-2018 00:00:00' 3.34140798514306 22.1223043648413
'13-Jan-2018 00:00:00' 28.3093648636441 50.4316692284854
'14-Jan-2018 00:00:00' 94.9403458146407 145.372015043126
'15-Jan-2018 00:00:00' 96.8458584802590 242.217873523385
'16-Jan-2018 00:00:00' 95.7154391552638 337.933312678649
'17-Jan-2018 00:00:00' 67.0081632500268 404.941475928676
'18-Jan-2018 00:00:00' 30.2795717950927 435.221047723769
'19-Jan-2018 00:00:00' 27.5242620958731 462.745309819642
'20-Jan-2018 00:00:00' 79.4413207315795 542.186630551221 % Value exceeded
'21-Jan-2018 00:00:00' 12.2946983208011 554.481328872022 % Value exceeded
'22-Jan-2018 00:00:00' 3.05851109465591 3.05851109465591
I tried changing the cumsum line from "..>= C,1.." into "..<=C,1.." but this gives me an error "Index exceeds the number of array elements (365) & Error in tabular/dotParenReference, b = b(rowIndiced)" so that doesnt seem to be the solution.
With "..==C,1.." the cumsum again stops working, and all values in t.Level will be 0.
Perhaps it is possible that when cumsum reaches <= 500, it will just set t.Level = 1? Then all other days would be 0 except the days when I have to empty the tank?
Thank you for everything so far and I will continue to try to solve this.
OK, I flubbed...as noted, I created the posted code by pasting the whole commandhistory of the interactive session and then tried to edit it to create the looping construct outline and saved the wrong line...the cumsum calculation does have to reference the subscript of the array to use the keyword end and the above is a table reference. Sorry, I'm very tied up w/ some other time-critical tasks and didn't have the time to look carefully.
The command window sequence I used to think through the logic first is as follows:
>> tt.Level=cumsum(tt.("Daily Fill (l)")) % fill the Level with the cumulative fill totals
tt =
14×2 timetable
Time Daily Fill (l) Level
___________ ______________ ______
01-Jan-2018 50.21 50.21
02-Jan-2018 69.12 119.33
03-Jan-2018 39.21 158.55
04-Jan-2018 35.19 193.74
05-Jan-2018 70.79 264.53
06-Jan-2018 45.28 309.81
07-Jan-2018 103.74 413.55
08-Jan-2018 104.09 517.64
09-Jan-2018 49.95 567.58
10-Jan-2018 9.76 577.34
11-Jan-2018 8.65 586.00
12-Jan-2018 2.47 588.47
13-Jan-2018 28.31 616.78
14-Jan-2018 113.38 730.16
>> i1=1;i2=find(tt.Level>=C,1); % initial loop counters beginning, first location > C
>> i1=i2+1; % beginning loop counter next iteration
>> tt.Level(i1:end)=tt.Level(i1:end)-tt.Level(i1) % reset from next point to end (back to zero'ed cumsum level)
tt =
14×2 timetable
Time Daily Fill (l) Level
___________ ______________ ______
01-Jan-2018 50.21 50.21
02-Jan-2018 69.12 119.33
03-Jan-2018 39.21 158.55
04-Jan-2018 35.19 193.74
05-Jan-2018 70.79 264.53
06-Jan-2018 45.28 309.81
07-Jan-2018 103.74 413.55
08-Jan-2018 104.09 517.64
09-Jan-2018 49.95 0.00
10-Jan-2018 9.76 9.76
11-Jan-2018 8.65 18.41
12-Jan-2018 2.47 20.88
13-Jan-2018 28.31 49.19
14-Jan-2018 113.38 162.58
>>
Above is for illustration of intermediate result...what you do in the loop is reset the new level for those locations, again not worrying about the total level exceeding the limit at this point:
>> tt.Level(i1:end)=cumsum(tt.("Daily Fill (l)")(i1:end))
tt =
14×2 timetable
Time Daily Fill (l) Level
___________ ______________ ______
01-Jan-2018 50.21 50.21
02-Jan-2018 69.12 119.33
03-Jan-2018 39.21 158.55
04-Jan-2018 35.19 193.74
05-Jan-2018 70.79 264.53
06-Jan-2018 45.28 309.81
07-Jan-2018 103.74 413.55
08-Jan-2018 104.09 517.64
09-Jan-2018 49.95 49.95
10-Jan-2018 9.76 59.70
11-Jan-2018 8.65 68.36
12-Jan-2018 2.47 70.83
13-Jan-2018 28.31 99.14
14-Jan-2018 113.38 212.52
>>
Then you repeat the same lookup as before:
>> i2=find(tt.Level(i1:end))>=C,1)+i1 % now find the next end point; compensate to account for start location offset
i2 =
0×1 empty double column vector
>>
That's the general idea -- start with the cumulative totals from the beginning; don't worry about the setpoint.
Then, find that first break location and reset from the next point to the end; the loop indices are incremented to begin at the next entry after the found location and the end point recomputed. The offset of the total at the previous end point is subtracted from the first location of the next pass to zero it out
As noted, you don't have enough data to reach the second limit so that another break loop is that isempty(i2) is true.
The three lines of the two indices and the resetting of the Level array line are all needed to write a loop construct to reproduce the above result for whatever size table you start with.
No need to apologize for anything, if you need to be elsewhere completely understandable! Thank you so much for the help you have provided.
With your latest help the code seems to be working as intended minus one question. I added the line you also had earlier in between,
t.Level(i1:end)=cumsum(t.("Total[kg/d]")(i1:end));
so instead of 0 the new level is the corresponding days intake.
But now a new problem arises. For some reason MATLAB continues the calculation for one, sometimes two, extra steps after the limit C has been achieved.
t.("Total[kg/d]") t.Level
'09-Feb-2018 00:00:00' 60.1675605993075 441.719040044621
'10-Feb-2018 00:00:00' 75.4001053361856 517.119145380807 % Limit surpassed
'11-Feb-2018 00:00:00' 75.9144483751560 593.033593755962 %% One extra step before restart
'12-Feb-2018 00:00:00' 90.3032989595419 90.3032989595419
'13-Feb-2018 00:00:00' 31.4822180781400 121.785517037682
'14-Feb-2018 00:00:00' 38.9737463992117 160.759263436894
'15-Feb-2018 00:00:00' 81.3700463998947 242.129309836788
'16-Feb-2018 00:00:00' 113.826935289976 355.956245126765
'17-Feb-2018 00:00:00' 84.0557641723095 440.012009299074
'18-Feb-2018 00:00:00' 110.031383075503 550.043392374577 % Limit surpassed
'19-Feb-2018 00:00:00' 82.9451997037377 632.988592078315 %% One extra step
'20-Feb-2018 00:00:00' 105.933775656499 105.933775656499
It's not much of a problem that a limit is surprassed once, since I can just lower C = 400 or similar to counter this. But that one extra step might cause trouble down the road, do you know is it just a problem with MATLAB or can that extra step be "undone"?
" and didn't have the time to look carefully."
More accurately, I didn't take the time...
Would have to see the code as you implemented it in its entirety; the problem isn't MATLAB, it does precisely what you tell it to! :) You've got a logic error in your indexing, probably.
I'd again have to see the actual code; NB in the above the line with the zero resetting is there only to show the result of the indexing positioning; it's made more obvious by being able to see the introduced zero to show which element in the time series is the one that i1 is then pointing to--the next entry past the one that satisfies the criterion.
The actual code would then use the following line
>> tt.Level(i1:end)=cumsum(tt.("Daily Fill (l)")(i1:end))
tt =
14×2 timetable
Time Daily Fill (l) Level
___________ ______________ ______
01-Jan-2018 50.21 50.21
02-Jan-2018 69.12 119.33
03-Jan-2018 39.21 158.55
04-Jan-2018 35.19 193.74
05-Jan-2018 70.79 264.53
06-Jan-2018 45.28 309.81
07-Jan-2018 103.74 413.55
08-Jan-2018 104.09 517.64
09-Jan-2018 49.95 49.95
10-Jan-2018 9.76 59.70
11-Jan-2018 8.65 68.36
12-Jan-2018 2.47 70.83
13-Jan-2018 28.31 99.14
14-Jan-2018 113.38 212.52
>>
that refers back to the daily inputs and so the first element is the daily intake, not zero. If one were to sum the Level variable column instead, then you would actually want to normalize back to the timestep earlier so it (the level) did reflect the nonzero input.
Which way you choose to do it is simply a matter of preference. Perhaps I confused the issue by trying to show the former that really doesn't enter into the final solution.
You definitely should get a robust solution that only finds the limit once each iteration and starts over immediately the next timestep; anything else is a logic error, sorry! :)
Ah yes that's similar to what I had in mind!
My complete code is the following and the above shown results are with this.
t = TT_bat_KG ;
t.Level=zeros(height(t),1);
C = 500; % Capacity level
t.Level=cumsum(t.("Total[kg/d]")) ;
i1 = 1;
i2 = find(t.Level >= C,1);
while i2 <= height(t)
i1 = i2 + 1;
t.Level(i1:end) = t.Level(i1:end) - t.Level(i1);
t.Level(i1:end)=cumsum(t.("Total[kg/d]")(i1:end));
i2 = find(cumsum(t.("Total[kg/d]")(i1:end))>=C,1) + i1;
end
In my eyes it should indeed restart immediately. I have tried to switch the >= sign into both == and <= signs, but then I get an error regarding indexing (Index exceeds the number of array elements (365) & Error in tabular/dotParenReference, b = b(rowIndiced).
Any idea what might be wrong with the code?
Yeah, there's an "off by one" logic error in the adjustment for the offset from origin in the calculation of i2; use
i2 = find(cumsum(t.("Total[kg/d]")(i1:end))>=C,1) + i1-1;
instead. i2 is from origin of the subset vector which is one-based so the offset is/needs be 0-based.

Sign in to comment.

Categories

Products

Release

R2019b

Asked:

on 24 Jan 2020

Edited:

on 29 Jan 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!