find same dates in two cell array

2 views (last 30 days)
Alex
Alex on 3 Oct 2013
Commented: Cedric on 8 Oct 2013
Hi everyone, i got some troubles in matlab I have one cell array containing 3 columns and 4000 rows. The first colum contain date in order, the second column contain a value min of a sensor and the third column contain a value max of a sensor. I would like to re-synchronize my data in a new tab. My new tab contain date in the first column, i want to find the same date in the old tab and put the min et max value in my newt tab.
I try to do like this:
for numDate=1:4000
for j=1:4000
date=newtab{numDate,1};
date2=oldtab(j,1);
if find(ismember(date,date2))
newtab(numDate,2)=avg(j,2);
break;
end
end
end
but make a loop 4000 in a loop 4000 is juste not possible knowing that i have in real 38 sensors, this brings me o a time of execution around 75minutes! How can i do more faster?
Thank you, Regards,
Alexis
  2 Comments
dpb
dpb on 3 Oct 2013
Edited: dpb on 3 Oct 2013
I don't quite follow--are you saying you simply want the locations in the first array that are in the second? Ignoring that they're dates but just using integers as shorthand example if
a=[ [1:5]' rand(5,2)];
b=[2 3]'
you want the end result
c=[b a(2:3,2:3)];
?
That is the values of the 2nd/3rd columns of a for the indices in column 1 of b that are in a?
I particularly don't get where the avg came from in your example code?
Cedric
Cedric on 3 Oct 2013
How are your dates coded? And how is the second order defined?

Sign in to comment.

Accepted Answer

Cedric
Cedric on 3 Oct 2013
Edited: Cedric on 3 Oct 2013
At this point I don't know how your dates are coded, but here is a simple illustration. Assume that dates are 28, 29, 30, and data associated with these dates are 100, 101, 102 for min's, and 200, 201, 2002 for max's (in the same order):
>> table1 = {28, 100, 200; 29, 101, 201; 30, 102, 202}
table1 =
[28] [100] [200]
[29] [101] [201]
[30] [102] [202]
Assume as well that you have table2 with the same dates in another order and no data (0's):
>> table2 = {30, 0, 0; 28, 0, 0; 29, 0, 0}
table2 =
[30] [0] [0]
[28] [0] [0]
[29] [0] [0]
Now you want to fill table2 with the data of table1 in the correct order with respect to dates of table2. One way to achieve that is to sort dates from table2 and get indices of sorted elements in the original table2:
[~,ix] = sort([table2{:,1}]) ;
and then use these indices as a "look up table" for "distributing" values of table1 in the correct order. This is done as follows:
>> table2(ix,2:3) = table1(:,2:3)
table2 =
[30] [102] [202]
[28] [100] [200]
[29] [101] [201]
  4 Comments
Alex
Alex on 4 Oct 2013
Edited: Alex on 4 Oct 2013
oh god, it works! you rock! :) Thank you very much It took 7 minutes :D
Cedric
Cedric on 4 Oct 2013
Edited: Cedric on 4 Oct 2013
You're welcome! But 7 minutes seems a lot actually for datasets with 4000-ish entries; if you don't mind sending me your files, I can checkout what takes that much time.

Sign in to comment.

More Answers (5)

Alex
Alex on 4 Oct 2013
Edited: Alex on 4 Oct 2013
Yep I think it's normal because in real, i have 38 sensors with 4 tab each, min max avg et std which are generated by a server where the request is long, and ater that I should copy the 4 values in my new table for each date. But I attach my file if you want to see Thanks
edit: I hope you can download it, it don't see any file attached?
  1 Comment
Cedric
Cedric on 4 Oct 2013
Edited: Cedric on 4 Oct 2013
It says file corrupt. Did you load all tables and do something like save('infra.mat') ?
Also, if arrays contain 5 columns, the first one being the date and the others the 4 values that you mentioned, you can copy all values at the same time:
corresp = bsxfun( @eq, datenum( tableMax(:,1) ).', ...
datenum( tableSynchro(:,1) )) ;
[r,c] = find( corresp ) ;
tableSynchro(r,2:end) = tableMax(c,2:end) ;

Sign in to comment.


Alex
Alex on 4 Oct 2013
Nop I have in fact 4 arrays: max,min,avg and std which contain 3 columns: date, value and a string So i just put 4 time your code for each array and i do it or my 38 sensors
  1 Comment
Cedric
Cedric on 4 Oct 2013
Edited: Cedric on 4 Oct 2013
All 4 arrays have different time stamps? If so, then 7 minutes is reasonable, because calling DATE2NUM on 4000 date strings will take about 1s on a "normal" computer, so ~2s per sensor and per value, which sums up to about 5 minutes for 4 values per sensor and 38 sensors.
If all 4 values share the same time stamps, then you can reuse r and c.
Actually .. if I understand well, you can gain a little time by DATE2NUM-ing the first column of tableSynchro only once:
dateNumSynchro = datenum( tableSynchro(:,1) ) ; % Common to all values.
corresp = bsxfun( @eq, datenum( tableMin(:,1) ).', dateNumSynchro ) ;
[r,c] = find( corresp ) ;
tableSynchro(r,2) = tableMin(c,2) ;
corresp = bsxfun( @eq, datenum( tableMax(:,1) ).', dateNumSynchro ) ;
[r,c] = find( corresp ) ;
tableSynchro(r,3) = tableMax(c,2) ;
corresp = bsxfun( @eq, datenum( tableAvg(:,1) ).', dateNumSynchro ) ;
[r,c] = find( corresp ) ;
tableSynchro(r,4) = tableAvg(c,2) ;
corresp = bsxfun( @eq, datenum( tableStd(:,1) ).', dateNumSynchro ) ;
[r,c] = find( corresp ) ;
tableSynchro(r,5) = tableStd(c,2) ;

Sign in to comment.


Alex
Alex on 7 Oct 2013
Hum thank you, indeed, my 4 arrays have same time stamps, but they have not the same length, so if I do what you wrote I obtain "Index exceeds matrix dimensions."
  1 Comment
Cedric
Cedric on 7 Oct 2013
How can they have same time stamps and not same length?

Sign in to comment.


Alex
Alex on 7 Oct 2013
But you right, I should DATE2NUM-ing the first column only once, I gain some seconds thanks :D

Alex
Alex on 8 Oct 2013
And another question, do you know how can i erase the end of my dates? I mean I want only yyyy-mm-dd HH:MM and erase secondes in my first column I try :
for j=1:1008
s=mytable(:,1);
s1 = s(1:end-2);
end
and at the end s contains 138cell and s1 136 why???
  1 Comment
Cedric
Cedric on 8 Oct 2013
Edited: Cedric on 8 Oct 2013
As there is a '0' padding for days and months, you can take the first 10 characters of your dates..
>> d = '2013-03-10 03:30:00' ;
>> d(1:10)
ans =
2013-03-10
but you cannot do this before having done the match, because there are multiple entries for each day, which would all have the same numeric code if you were truncating the time.
You can apply the truncation to each cell content of a cell array as follows:
>> d = {'2013-03-10 03:30:00', '2013-03-10 03:30:00'}
>> truncated = cellfun( @(s)s(1:10), d, 'UniformOutput', false )
truncated =
'2013-03-10' '2013-03-10'
No idea for your last question, you'll have to debug.

Sign in to comment.

Categories

Find more on Dates and Time in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!