Get all positions (row,column) that have a certain value in a matrix and get the values of these same positions in a new matrix

Hi,
I have two 60x60 matrices, D and cos. D stores the Euclidean distance between each pair of vectors, and cos stores the cosine similarity between each pair of the same vectors.
I want to plot the cosine similarity as a function of the distance between two vectors. Since the same distance occurs several times in D, I took the unique values of the matrix. Then I found all the positions (rows and columns) that hold each unique value and stored them in a matrix M.
Now I want to go to the matrix that contains the cosine similarity values and get the values at the same positions stored in M. In the end, I want to average all of those values so I can plot them as a function of the distance.
This is my code:
D =xlsread('C:\Users\Joana\Desktop\PIV_dist_cosSim\dist\dist_CoordVectorPIV.xlsx');
uD = unique(D(:,1));
source_dir = 'C:\Users\Joana\Desktop\PIV_dist_cosSim\cosSim';
source_files = dir(fullfile(source_dir, '*.xls'));
c={source_files.name};
s = sort_nat(c);
struct_files = cell2struct(s,'name',1);
for i=1:length(uD)
    v = uD(i);
    [x,y] = find (D == v);
    for j = 1:length(struct_files)
        cos = xlsread(fullfile(source_dir, struct_files(j).name));
        szCos = size(cos);
        M = [x,y];
        cm{i} = num2cell(M,1);
        %idx = sub2ind(szCos,M(:,1),M(:,2));
        %resCos = cos(idx);
    end
end
When I run this with the last two lines active, I get the following error (which is why they are commented out in the code I posted):
Error using sub2ind (line 43)
Out of range subscript.
Error in Dist_cosSim (line 33)
idx = sub2ind(szCos,M(:,1),M(:,2));
This does seem to be related to the row and column subscripts being checked against the size of cos (which is 60x60). When I store every matrix M from the loop in a cell array with cm{i} = num2cell(M,1); I see that some M matrices list more than 60 row and column subscripts. This is due to repeated values: the same row appears twice, for example, and the same happens for the column, which is strange.
If anyone has an idea on how to solve this or a better way to solve the problem, it will be much appreciated!
Thanks a lot,
Joana
  3 Comments
Joana Leite on 15 Mar 2018
Yes, I have more; it depends on the number of files in the source directory. But the loop is not my problem at the moment. For each cos matrix I want to get all the values at the positions stored in M. For each i, I will have several rows and columns in D, because the same distance value (v) appears several times in D. So I collected all of them in the matrix M. Now I want to look up all of those rows and columns in each cos matrix and average the values. My problem is that error, because I have checked and all the cos matrices are indeed 60x60. However, in M the same row sometimes appears repeated, and I get an M matrix that is, say, 68x2. I really don't know how to solve this.
About the inefficiency, you are absolutely right: I should have read the files beforehand, not inside the other loop.
Thanks a lot
Guillaume on 15 Mar 2018
So for a single cos, you want the average of the cos values at identical D values? That's easy to do without a loop.
However, the only possible cause of an 'Out of range subscript' error when you do
sub2ind([nrows, ncols], x, y)
is that some x is greater than nrows or some y is greater than ncols (or that some x or y is less than 1). Again, the number of elements in x and y is irrelevant. M could be 5000000x2 and the call would still succeed, as long as all M(:, 1) <= szCos(1) and all M(:, 2) <= szCos(2).
To find the problem with your current code you could add, before the sub2ind:
assert(all(max(M, [], 1) <= size(cos)), 'Size of cos is %d x %d yet wanting to access row %d or column %d', size(cos, 1), size(cos, 2), max(M(:, 1)), max(M(:, 2)))
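To illustrate the point about sub2ind, here is a minimal sketch on a toy matrix. The 4x4 matrix from magic(4) and the names cosmat, rows and cols are stand-ins chosen for illustration, not taken from the original data. It shows that repeated subscripts are harmless and that one subscript outside the matrix is enough to raise the error.

```matlab
% Toy stand-in for a cos matrix (illustrative values only).
cosmat = magic(4);
sz = size(cosmat);

% Ten (row, col) pairs with many repeats, all inside the 4x4 range.
rows = [1 1 2 2 3 3 4 4 1 1].';
cols = [1 1 2 2 3 3 4 4 4 4].';
idx = sub2ind(sz, rows, cols);   % succeeds despite the repeats
vals = cosmat(idx);              % one value per (row, col) pair

% A single subscript beyond the matrix size triggers the error.
outOfRangeCaught = false;
try
    sub2ind(sz, 5, 1);           % row 5 does not exist in a 4x4 matrix
catch err
    outOfRangeCaught = true;
    disp(err.message);
end
```

So a 68x2 M is not a problem in itself; what matters is the largest value in each column of M compared with size(cos).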


Accepted Answer

Guillaume on 15 Mar 2018
D = xlsread('C:\Users\Joana\Desktop\PIV_dist_cosSim\dist\dist_CoordVectorPIV.xlsx');
[uD, ~, uid] = unique(D(:)); %uid maps every element of D to its row in uD, so it matches cos(:) in length
source_dir = 'C:\Users\Joana\Desktop\PIV_dist_cosSim\cosSim';
source_files = dir(fullfile(source_dir, '*.xls'));
source_files = sort_nat({source_files.name}); %no need to convert back to structure
meancos = zeros(numel(uD), numel(source_files));
for fileidx = 1:numel(source_files)
    cos = xlsread(fullfile(source_dir, source_files{fileidx}));
    assert(isequal(size(cos), size(D)), 'Size of cos is not equal to size of D for file %s', source_files{fileidx});
    meancos(:, fileidx) = accumarray(uid, cos(:), [], @mean); %average cos over each group of equal distances
end
After the loop above, meancos(r, c) is the mean of all the cos values corresponding to unique value uD(r) for source_files{c}.
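As a small worked example of the unique plus accumarray grouping, the sketch below uses two made-up 3x3 matrices in place of the real 60x60 data. The values, and the name cosmat (used instead of cos to avoid shadowing the built-in function), are assumptions for illustration. It takes unique over all elements of D so that uid has one entry per element of the similarity matrix.

```matlab
% Toy distance matrix: three distinct distances 0, 1 and 2.
D = [0 1 2;
     1 0 1;
     2 1 0];

% Toy similarity matrix of the same size (made-up values).
cosmat = [1.0 0.8 0.2;
          0.6 1.0 0.4;
          0.0 0.2 1.0];

% uD holds the sorted unique distances; uid(k) is the row of uD
% that the k-th element of D(:) belongs to.
[uD, ~, uid] = unique(D(:));

% Average the similarity values within each group of equal distances.
meancos = accumarray(uid, cosmat(:), [], @mean);

% Here uD is [0; 1; 2] and meancos is approximately [1; 0.5; 0.1].
```

With the real data, something like plot(uD, meancos(:, c)) would then give the averaged similarity as a function of distance for file c.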
  4 Comments
Guillaume on 15 Mar 2018
Ah, so it's an error coming from Excel. You'll probably get the same error with your own code. In any case, it's a problem with reading the file, not with the code logic.
What is source_files{fileidx} when the error occurs? It may give you a clue.
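One way to find out which file is responsible is to wrap the read in try/catch and print the file name when it fails. This is only a sketch: the file names below are hypothetical placeholders that are assumed not to exist, and the names cosmat and failed are introduced here for illustration.

```matlab
% Hypothetical inputs, for illustration only; substitute the real
% folder and the real sorted list of file names.
source_dir = tempdir;
source_files = {'does_not_exist_1.xls', 'does_not_exist_2.xls'};

failed = {};   % collects the full paths of files that could not be read
for fileidx = 1:numel(source_files)
    fname = fullfile(source_dir, source_files{fileidx});
    try
        cosmat = xlsread(fname);
    catch err
        % Report which file triggered the error, then carry on.
        fprintf('Failed on file %d (%s): %s\n', fileidx, fname, err.message);
        failed{end+1} = fname; %#ok<AGROW>
    end
end
```

Any file listed in failed can then be opened by hand to see what differs from the ones that read correctly.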
