Extract identical values from a data set

Hi everyone,
I have a data set for polar coordinates from this command:
[TH,R,Z] = cart2pol(xdata,ydata,zdata);
I then want to find the repeated values of R and the values of Z that correspond to them. I have tried these commands:
u=unique(R);
n=histc(R,u);
u(n>1);
find(R==u(n>1))
which gives me an error:
Error using ==
Matrix dimensions must agree.
Error in NSMatlabNPC (line 108)
find(R==u(n>1))
I also tried
find(diff(R)==0)
which returned an empty 1-by-0 matrix.
The problem is probably that I have several groups of identical R values, each group with a different value, and these functions might not be suitable for identifying the groups separately. Is there another way to access the repeated R values?

Accepted Answer

Geoff Hayes on 2 Jul 2014
The error "Error using == Matrix dimensions must agree" is raised because u(n>1) contains only a subset of the elements of R, so the two vectors have different lengths. The == operator compares element by element, so both operands must be the same size (unless one of them is a scalar).
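One way around the size mismatch is to compare R against a single repeated value at a time, since comparing a vector with a scalar is always allowed. The following is only a sketch on made-up data: the sample R and the names dupVals and idxPerValue are assumptions for illustration, and accumarray is swapped in for histc to count occurrences.

```matlab
% Hypothetical sample data standing in for the R returned by cart2pol
R = [0.5; 1.2; 0.5; 3.0; 1.2; 1.2];

% Unique values, plus for each element of R the index of its unique value
[u, ~, ic] = unique(R);

% Count how many times each unique value occurs
n = accumarray(ic, 1);

% Values that occur more than once
dupVals = u(n > 1);

% Compare R against one scalar at a time, so the sizes always agree
idxPerValue = cell(numel(dupVals), 1);
for k = 1:numel(dupVals)
    idxPerValue{k} = find(R == dupVals(k));
end
```

Here idxPerValue{k} holds the positions in R of the k-th repeated value, so the matching Z values would be Z(idxPerValue{k}). Note that this only catches exactly equal values.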
Since R is a distance value (see http://www.mathworks.com/help/matlab/ref/cart2pol.html), you may have values that should be considered the same but that differ slightly because they are stored as doubles. Exact-match functions such as unique will not group these as identical. For example,
1.1234567
1.1234568
Should the above two numbers be considered the same? If so, then you can group like values in R that should be considered identical, given a tolerance.
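As a small illustration of the difference between an exact comparison and a comparison within a tolerance, using the two numbers above (the tolerance value and the variable names are assumptions chosen for the example):

```matlab
a = 1.1234567;
b = 1.1234568;
myTol = 0.00001;   % assumed tolerance; pick one that suits your data

% Exact comparison: the two doubles are different numbers
exactlyEqual = (a == b);

% Tolerance comparison: the difference is about 1e-7, well below myTol
closeEnough = abs(a - b) < myTol;
```

With an exact test the two values are distinct, while the tolerance test treats them as the same.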
We sort the data first, and then do the grouping. If we assume that TH, R, and Z are each column vectors of the same length (as opposed to matrices) we can do
data = [R Z TH];
Now sort the data as
sortedData = sortrows(data,1);
In the above, we just sort on the first column of data which corresponds to our R data. We can then iterate over the data and find all those elements that should be considered part of the same group given a tolerance
myTol = 0.00001;   % use myTol to determine if 2 values should be considered same
grpStartIdx = 1;   % start index of group
grpStopIdx = 1;    % stop index of group
groupedData = {};  % a cell array for the grouped data
atGroupIdx = 0;    % number of groups stored so far
% iterate over each row of sortedData starting at second element
for k=2:size(sortedData,1)
    % compare the R value of this row with that of the previous row
    if abs(sortedData(k,1)-sortedData(k-1,1)) < myTol
        % the two consecutive elements are considered to be identical
        % so must be part of same group
        grpStopIdx = k;
    else
        % the two consecutive elements are not identical since their difference
        % is at least myTol so we have come to a new group
        grpStopIdx = k-1;
        % copy all data (R,Z,TH) as a group to the cell array
        atGroupIdx = atGroupIdx + 1;
        groupedData{atGroupIdx} = sortedData(grpStartIdx:grpStopIdx,:);
        % update next group indices
        grpStartIdx = k;
        grpStopIdx = k;
    end
end
% copy the last group
atGroupIdx = atGroupIdx + 1;
groupedData{atGroupIdx} = sortedData(grpStartIdx:grpStopIdx,:);
The above may contain a little more code than you wish (and is not as neat as using the built-in MATLAB functions), but each group of near-identical R data (as determined by myTol) is combined together in the cell array groupedData for further processing.
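For comparison, the same consecutive-difference grouping can be sketched more compactly with diff and cumsum, followed by picking out only the groups that contain more than one row. This is a sketch under assumptions, not a replacement for the loop above: the sample data and the names startsGroup, groupId, isRepeated and repeatedZ are made up for illustration, and the data is assumed to be non-empty.

```matlab
% Hypothetical sample data (column vectors of equal length)
R  = [2.000001; 1.0; 2.0; 3.5];
Z  = [10; 20; 30; 40];
TH = [0.1; 0.2; 0.3; 0.4];
myTol = 0.00001;

% Sort rows by R, as in the answer above
sortedData = sortrows([R Z TH], 1);

% A row starts a new group when its R exceeds the previous R by at least myTol
% (the data is sorted ascending, so the differences are non-negative)
startsGroup = [true; diff(sortedData(:,1)) >= myTol];

% Running count of group starts gives a group number for every row
groupId = cumsum(startsGroup);
numGroups = groupId(end);

% Collect the rows of each group into a cell array
groupedData = cell(1, numGroups);
for g = 1:numGroups
    groupedData{g} = sortedData(groupId == g, :);
end

% Keep only groups with repeated R, and pull out their Z values (column 2)
isRepeated = cellfun(@(grp) size(grp,1) > 1, groupedData);
repeatedZ = cellfun(@(grp) grp(:,2), groupedData(isRepeated), 'UniformOutput', false);
```

As with the loop, neighbours are compared pairwise, so a long run of slowly increasing values can end up in one group even if its first and last members differ by more than myTol.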
Comments
Sofya on 24 Jul 2014
Hi Geoff, Thanks for your quick reply!
If I try to extract data with the indices idcs, I think idcs(idcs==k) = []; doesn't help, because at the end length(new)=length(sortedData), so it repeats data with the same indices.
for k=1:length(sortedData)
    % get indices of all elements in first column that satisfy
    % the absolute difference being less than the tolerance
    idcs = find(abs(sortedData(k,1)-sortedData(:,1)) < myTol);
    % idcs has at least one index, k, so we have to remove it
    idcs(idcs==k) = [];
    % if idcs is non-empty, then we have some work to do!
    if ~isempty(idcs)
        % do something for each index in idcs
        new{k}=sortedData(idcs,1);
    end
end
Do you know where the problem is?
Geoff Hayes on 24 Jul 2014
Sofya - where (and why) in the code are you comparing length(new)==length(sortedData)? (With == and not =.) Have you stepped through the code and made sure that idcs(idcs==k) = []; does reduce the idcs vector by one with the removal of the k index?
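One possible reading of the symptom, offered only as a guess: because the results are stored at new{k}, the cell array grows to the largest k that was assigned, with empty cells left for the rows that had no match. Its length can therefore equal the number of rows even though the removal of k worked. A small sketch on made-up data (the sample sortedData and the names hasMatch and matches are assumptions):

```matlab
% Hypothetical sorted data: column 1 is R, column 2 is Z
sortedData = [1.0 20; 2.0 30; 2.0 10];
myTol = 0.00001;

new = {};
for k = 1:size(sortedData,1)
    % rows whose R is within myTol of row k
    idcs = find(abs(sortedData(k,1) - sortedData(:,1)) < myTol);
    % drop row k itself
    idcs(idcs == k) = [];
    if ~isempty(idcs)
        new{k} = sortedData(idcs,1);
    end
end

% Row 1 has no match, so new{1} was never assigned and is empty,
% but new still has one cell per row because new{3} was assigned
hasMatch = ~cellfun(@isempty, new);
matches = new(hasMatch);
```

Each member of a group also reports the others, so a pair of equal values shows up twice in matches, once from each side.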


More Answers (1)

the cyclist on 2 Jul 2014
I think you probably want to use the ismember() command.
Comments
Sofya on 2 Jul 2014
Following your kind advice I have tried this command:
u=unique(R);
[counts, bins] = histc(R, u);
[u counts];
[R bins];
R(ismember(bins, find(counts > 1)))
This correctly returns the repeated values, eliminating all the unique ones. But how can I then get the positions of these values in R?
the cyclist on 2 Jul 2014
I have to admit I am not digging into the details here (sorry!), but note that the ismember command has a second output argument, which is an index that might help you. See
doc ismember
for details.
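As a sketch of how the positions could be obtained from the commands above: the logical mask returned by ismember can be passed to find, and the second output of ismember says which repeated value each element matches. The sample R and the names isRepeated, positions, dupVals and loc are assumptions for illustration.

```matlab
% Hypothetical sample data standing in for R
R = [0.5; 1.2; 0.5; 3.0; 1.2; 1.2];

u = unique(R);
[counts, bins] = histc(R, u);

% Logical mask: true where the element of R occurs more than once
isRepeated = ismember(bins, find(counts > 1));

% Positions of the repeated values in R
positions = find(isRepeated);

% Second output of ismember: for each element of R, the index of the
% matching entry in dupVals (0 where the element is not repeated)
dupVals = u(counts > 1);
[~, loc] = ismember(R, dupVals);
```

With positions in hand, the corresponding Z values would be Z(positions), and loc can be used to separate them by repeated value.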
