How to create average of values within a grid box??

25 views (last 30 days)
Hello everyone,
I have data which looks like this:
3127 x 254 x 365 which represents the satellite data (index (3127) along arcs (254) for each day of the year).
I want to split this data into grid squares before averaging (see picture at bottom of the page - bottom right picture where the red lines are the satellite arcs). Once I have the data selected into grid squares I will then average all values within each grid square. Does anyone know how I can do this? I'm more after which commands to use. Should I interpolate the data onto a regular grid?
One idea I have is to use areaquad to create the size of the grid squares and then repeat the dimension to create a matrix. However, I don't know how I can then average all values within each square.
Thanks, Michael

Accepted Answer

Cedric
Cedric on 9 Jun 2014
Edited: Cedric on 10 Jun 2014
You are trying to perform what is called a Zonal Statistics in the "GIS world". Do you really need to do it with MATLAB, or could you perform it as a one time operation in e.g. ArcGIS?
If you need to do it in MATLAB, one way to go is to define a set of compartment IDs for your data points, and to use ACCUMARRAY to get whatever statistics you need by compartment. This is comparable to what is explained here. Compartment IDs can be computed based on coordinates, e.g
compartmentID = colID + (rowID-1) * nRows ;
where compartmentID, rowID, and colID have the same size as your data set (for one year), rowID is computed based on latitudes, and colID is computed based on longitudes.
Once you have compartment IDs (1,2,..), you can get a statistics as follows
data_yr = data(:,:,1) ; % Example for year 1.
compMean = accumarray( compartmentID(:), data_yr(:), [], @mean ) ;
Note that it is sometimes more efficient to compute a mean by dividing sums..
compSum = accumarray( compartmentID(:), data_yr(:) ) ;
compCount = accumarray( compartmentID(:), ones( numel( compartmentID ), 1 )) ;
compMean = compSum ./ compCount ;
=== Illustration =====================
We use a 4 by 6 data set to experiment with:
>> data_yr = 20 + randi( 10, 4, 6 )
data_yr =
27 27 23 27 25 22
28 22 21 24 24 25
28 28 21 30 28 25
24 21 29 21 28 27
Assume that the first dimension corresponds to latitudes (1 to 4) and that the second dimension corresponds to longitudes (1 to 6). We want to get a statistics on 2 by 2 blocks/compartments, so the upper left
27 27
28 22
are in compartment 1 and so on. If we had the following array (which defines the compartment ID for each data point) available
>> compartmentID
compartmentID =
1 1 2 2 3 3
1 1 2 2 3 3
4 4 5 5 6 6
4 4 5 5 6 6
we could easily get a zonal statistics using ACCUMARRAY (I let you read the doc and experiment a bit with this function):
>> compMean = accumarray( compartmentID(:), data_yr(:), [], @mean )
compMean =
26.0000
23.7500
24.0000
25.2500
25.2500
27.0000
and you can check e.g. that the first element is the mean for compartment 1. Now the only thing left is to find a way to build compartmentID. One way to achieve this is as follows (and again, I let you experiment with it to understand):
>> rowId = ceil( (1 : size(data_yr, 1)) / 2 )
rowId =
1 1 2 2
where denom. 2 is the block size along dim 1.
>> colId = ceil( (1 : size(data_yr, 2)) / 2 )
colId =
1 1 2 2 3 3
where denom. 2 is the block size along dim 2.
>> [colID, rowID] = meshgrid( colId, rowId )
colID =
1 1 2 2 3 3
1 1 2 2 3 3
1 1 2 2 3 3
1 1 2 2 3 3
rowID =
1 1 1 1 1 1
1 1 1 1 1 1
2 2 2 2 2 2
2 2 2 2 2 2
And finally
>> compartmentID = colID + (rowID - 1) * max(colId)
compartmentID =
1 1 2 2 3 3
1 1 2 2 3 3
4 4 5 5 6 6
4 4 5 5 6 6
  7 Comments
Cedric
Cedric on 9 Jun 2014
Edited: Cedric on 9 Jun 2014
Awesome. This problem made my own life complicated almost a decade ago to be honest, and I had to use a GIS (namely ArcGIS and its Python Geoprocessor). My grids were/are a bit more complicated though, as they are 3D and multi-scale. My problem is hence to develop efficient enough algorithms for building the aforementioned array of compartment IDs, in a context where I cannot use regularity in the grids' structures.
Engdaw Chane
Engdaw Chane on 7 Jun 2018
Edited: Engdaw Chane on 7 Jun 2018
Hello Cedric and Michael,
How can I do this correctly to all time-series maps please? I am using the wonderful solution. I have 83 x 92 x 480 time-series matrix. It is working nicely when I apply the solution for each matrix separately. But, I have to do the zonal statistics for all maps. I am getting below the expected number of elements in the output file. I tried it this way:
% code
data=reshape(permute(precip,[1,3,2]), [],size(CRU_precip_EA,2)); % convert the data from 3D (83 x 92 480)to 2D (480x92)
rowId = ceil( (1 : size(data, 1)) / 2 );
colId = ceil( (1 : size(data, 2)) / 2 );
[colID, rowID] = meshgrid( colId, rowId );
compartmentID = colID + (rowID - 1) * max(colId);
output=accumarray( compartmentID(:), data(:), [], @mean );
twobytwo=reshape(output,[42,46,480]); % restoring dimension
Here is the problem; I am expecting 927360 numbers of elements. However, I am getting only 916321 number of elements.
I thank you for the valuable help.
Kindly,

Sign in to comment.

More Answers (1)

Reema Alhassan
Reema Alhassan on 4 Jun 2018
hello, I'm trying to do this process on geotiff image so in this line : compartmentID = colID + (rowID-1) * nRows ; how could I get the colID and rowID and nRow from the image ? could you explain more please ?
thank you,

Categories

Find more on Creating and Concatenating Matrices in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!