Problem in extracting data from netcdf

Zhou Ci
Zhou Ci on 28 Feb 2022
Edited: Cris LaPierre on 1 Mar 2022
I am trying to collocate two datasets. The variable in my netCDF file is 113x101x24, which means I have data for 24 hours. There is also a text file that contains data, in which the time is stored in two columns (HH MM). I have combined those two columns into a single column vector so that, for example, 4 hours 25 minutes is now 425.
Below is the time variable from the netCDF file. Its values are 0, 60, 120, 180, 240, ...
time
Size: 24x1
Dimensions: time
Datatype: int32
Attributes:
long_name = 'time'
units = 'minutes since 2017-06-05 00:30:00'
time_increment = 10000
begin_date = 20170605
begin_time = 3000
vmax = 999999986991104
vmin = -999999986991104
valid_range = [-999999986991104 999999986991104]
origname = 'time'
fullnamepath = '/time'
My question is: how can I extract the data from the netCDF file at 4 hours? I am doing a collocation, so if, for example, the time in the text file is 425 (4 hrs 25 minutes), how do I extract the 4-hour time step from the netCDF file? Or, in a second case, if the text file time is 435, should it be matched to 5 hours in the netCDF file?
Any help is highly appreciated.

Answers (1)

Cris LaPierre
Cris LaPierre on 28 Feb 2022
You haven't shared your file or a link to the source, so here's my best guess.
Based on what I see in the time attributes, the first sheet of your data corresponds to 2017/06/05 00:30:00. The begin_time is listed as 3000 and the time_increment as 10000. This must mean 00:30:00 and 1:00:00 (1 hour). There is no data, then, that corresponds to 4:00:00. You can find data for 3:30:00 and 4:30:00.
To extract a portion of the data, specify the startLoc and count (see this example).
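So to extract the data for 3:30:00, I could do this (untested). MATLAB indexing starts at 1, so the first sheet (00:30:00) has index 1 and 3:30:00 is the 4th sheet.
data330 = double(ncread(ncFileName,'varNm',[1 1 4],[inf,inf,1]))
and for 4:30:00, the 5th sheet, I could do this
data430 = double(ncread(ncFileName,'varNm',[1 1 5],[inf,inf,1]))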
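A quick way to double-check which sheet index belongs to a given time of day is sketched below. It assumes the first sheet is at 00:30:00 and the sheets are one hour apart, as the attributes suggest. The variable names are made up for illustration, and ncFileName and 'varNm' are placeholders as above.

```matlab
% Assumed from the time attributes: first sheet at 00:30, one sheet per hour.
firstSheetMin = 30;   % minutes after midnight of sheet 1
stepMin       = 60;   % spacing between sheets, in minutes

% Target time of day, here 03:30, expressed in minutes after midnight.
targetMin = 3*60 + 30;

% MATLAB indices start at 1, so sheet 1 is the one at firstSheetMin.
k = (targetMin - firstSheetMin)/stepMin + 1;

% Hypothetical read of that single sheet (placeholders, not run here):
% sheet = double(ncread(ncFileName,'varNm',[1 1 k],[Inf Inf 1]));
```

If k does not come out as a whole number, there is no sheet at exactly that time and the nearest one has to be chosen instead.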
  2 Comments
Zhou Ci
Zhou Ci on 1 Mar 2022
@Cris LaPierre I tried the line of code above and it gave me this error:
Error using internal.matlab.imagesci.nc/read (line 613)
START has incorrect number of elements (3). The variable has 1 dimensions.
Error in ncread (line 66)
vardata = ncObj.read(varName, varargin{:});
So, I changed it to
data330 = double(ncread(fname_MERRA2,'time',[3],[1]))
data330 =
120
My file is too large, which is why I didn't attach it. Secondly, my question is about comparing the netCDF time with the text file time. It's fine if there is a difference of a few minutes. Lastly, I have a large number of netCDF files whose times I want to compare with the times in the text file (which I extracted into a column vector). 500 means 5 hrs and 00 minutes; 535 means 5 hrs and 35 minutes.
I read netcdf time:
24×1 int32 column vector
0
60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
Cris LaPierre
Cris LaPierre on 1 Mar 2022
Edited: Cris LaPierre on 1 Mar 2022
It all depends on what variable you are trying to load with the code I shared. It was intended for loading the 113x101x24 data, not 'time'. It works on a netcdf file I have access to, but you may have to adjust it a little for it to work with your specific file.
You need to specify a start and count for each dimension of your variable (see the links above). Your code is starting at the 3rd row of 'time' and reading in 1 number (the [3],[1] inputs). To extract all the times, I would do one of the following, depending on what I needed
% as datetimes
t = datetime(2017,06,05,0,30,0) + minutes(double(ncread(fname_MERRA2,'time')));
% as elapsed time
t = duration(0,30,0) + minutes(double(ncread(fname_MERRA2,'time')));
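The offset can be checked without the file by standing in for the ncread output with the 24 values listed in the comment above (0, 60, ..., 1380). This is only a sketch; timeVals is a made-up name for that stand-in.

```matlab
% Stand-in for ncread(fname_MERRA2,'time'): 24 hourly offsets in minutes.
timeVals = int32(0:60:1380)';

% Same conversion as above: add the offsets to the reference time
% given in the units attribute ('minutes since 2017-06-05 00:30:00').
t = datetime(2017,6,5,0,30,0) + minutes(double(timeVals));

% The 5th entry is the 04:30 sheet and the last one is 23:30.
t5    = t(5);
tLast = t(end);
```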
The challenge is converting your text file time to the same format. You can find other posts here about it. I would probably read the text file using readmatrix and then convert to a duration doing something like this.
tft = readmatrix('filename.txt');
h = floor(tft/100);
m = rem(tft,100);
t_dur = duration(h,m,0)
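As a small worked example of that conversion, here it is applied to a few HHMM values of the kind mentioned in the question. The sample values are typed in by hand instead of being read from a file, and t_min is a name introduced here for illustration.

```matlab
% Sample HHMM values (4:25, 4:35, 5:00, 5:35), entered by hand.
tft = [425; 435; 500; 535];

% Split each value into hours and minutes.
h = floor(tft/100);   % 4 4 5 5
m = rem(tft,100);     % 25 35 0 35

% Build durations, then express them as minutes since midnight (double).
t_dur = duration(h,m,0);
t_min = minutes(t_dur);   % 265 275 300 335
```

If the text file still has the hours and minutes in two separate columns, those columns can presumably be passed to duration directly, without combining them into HHMM first.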
Once you have both times captured, you might be able to find the corresponding sheet index in your data for each text file time using ismembertol with the following syntax
[LIA,LocB] = ismembertol(A,B,tol)
Here, LocB is an array of the index location in B for each element in A that is a member of B. You just have to find the correct tolerance to use.
EDIT: ismembertol only accepts single and double datatypes. That means any duration or datetime datatype will need to be converted to one of those. I would use the minutes function to convert durations to equivalent minutes (a double). That would allow you to specify the tolerance using meaningful values.
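Putting the pieces together, a minimal sketch of the matching step might look like this. It assumes both time vectors have already been converted to minutes since midnight (doubles), and the sample values and the 10-minute tolerance are made up for illustration. By default ismembertol scales tol by the largest magnitude in the inputs, so 'DataScale',1 is passed here to make the tolerance an absolute number of minutes.

```matlab
% netCDF sheet times in minutes since midnight: 00:30, 01:30, ..., 23:30.
ncMin = 30 + (0:60:1380)';

% Text file times in minutes since midnight: 4:25, 4:35, 5:00, 5:35.
txtMin = [265; 275; 300; 335];

% Accept a match when the two times differ by at most tolMin minutes.
tolMin = 10;
[isMatch, sheetIdx] = ismembertol(txtMin, ncMin, tolMin, 'DataScale', 1);

% sheetIdx(i) is the sheet index for txtMin(i), or 0 when no sheet is
% within the tolerance (here 5:00, which is 30 minutes from both neighbours).
```

For the rows where isMatch is true, sheetIdx can then be used as the third element of the start argument when reading the 113x101x24 variable.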
