Extract identical values from a data set

Question

Sofya on 2 Jul 2014

https://www.mathworks.com/matlabcentral/answers/139914-extract-identical-values-from-a-data-set

Commented: Geoff Hayes on 24 Jul 2014

Accepted Answer: Geoff Hayes

Hi everyone,

I have a data set for polar coordinates from this command:

[TH,R,Z] = cart2pol(xdata,ydata,zdata);

I then want to find the repeated values of R and the values of Z that correspond to them. I have tried these commands:

u=unique(R);
n=histc(R,u);
u(n>1);
find(R==u(n>1))

which gives me an error:

Error using ==
Matrix dimensions must agree.

Error in NSMatlabNPC (line 108)
find(R==u(n>1))

I also tried

find(diff(R)==0)

which returned an empty 1-by-0 matrix.

The problem is probably that R contains several different values, each of which is repeated, and these functions might not be suitable for identifying each group separately. Is there another way to access the repeated R values?

Answer 1

Geoff Hayes on 2 Jul 2014

The error Error using == Matrix dimensions must agree is raised because u(n>1) contains only a subset of the elements of R, so the two vectors have different lengths. The == operator compares element by element and cannot pair up two vectors of different lengths, so the comparison fails.
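A minimal sketch of the size mismatch and one way around it. The sample values for R are made up for illustration and are not the poster's data; ismember is used here because it tests each element of R against a whole set rather than element by element.

```matlab
% Hypothetical sample data standing in for R (a column vector)
R = [1; 2; 2; 3; 3; 3; 4];

u = unique(R);            % distinct values: [1; 2; 3; 4]
n = histc(R, u);          % occurrences of each distinct value: [1; 2; 3; 1]
repeated = u(n > 1);      % [2; 3], which is shorter than R

% R == repeated would compare a 7-by-1 vector with a 2-by-1 vector and fail.
% ismember checks every element of R against the whole set instead.
mask = ismember(R, repeated);   % logical vector, same size as R
idx  = find(mask);              % positions of the repeated values in R
```

With this sample data, idx holds rows 2 through 6, so R(idx) returns only the values that occur more than once.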

Since R is a distance (see http://www.mathworks.com/help/matlab/ref/cart2pol.html), it is computed as a double in floating point. Two values that you would consider the same may therefore differ in their last digits, and unique (say) will not group them as identical. For example,

 1.1234567
 1.1234568

Should the above two numbers be considered the same? If so, then you can group like values in R that should be considered identical, given a tolerance.
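To make the question concrete, here is a small sketch comparing the two numbers exactly and within a tolerance. The tolerance value tol is an assumption chosen for illustration; the right value depends on the data.

```matlab
a   = 1.1234567;
b   = 1.1234568;
tol = 1e-5;                        % hypothetical tolerance

exactlyEqual = (a == b);           % false: the values differ in the 7th decimal
nearlyEqual  = abs(a - b) < tol;   % true: the difference is about 1e-7

% The same effect appears with computed values, which is the case for R
c = 0.1 + 0.2;
sumIsExact = (c == 0.3);           % false because of floating-point rounding
sumIsClose = abs(c - 0.3) < tol;   % true
```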

We sort the data first, and then do the grouping. If we assume that TH, R, and Z are column vectors (as opposed to matrices), we can combine them as

data = [R Z TH];

Now sort the data as

sortedData = sortrows(data,1);

In the above, we just sort on the first column of data which corresponds to our R data. We can then iterate over the data and find all those elements that should be considered part of the same group given a tolerance

 myTol       = 0.00001; % use myTol to determine if 2 values should be considered the same
 grpStartIdx = 1;       % start index of group
 grpStopIdx  = 1;       % stop index of group
 groupedData = {};      % a cell array for the grouped data
 atGroupIdx  = 0;
 % iterate over each row of sortedData starting at second element
 for k=2:size(sortedData,1)
   % compare the R value (first column) of this row with the previous row
   if abs(sortedData(k,1)-sortedData(k-1,1)) < myTol
      % the two consecutive elements are considered to be identical
      % so must be part of same group
      grpStopIdx = k;
   else
      % the two consecutive elements are not identical since their difference
      % is greater than myTol so we have come to a new group
      grpStopIdx = k-1;
      % copy all data (R,Z,TH) as a group to the cell array
      atGroupIdx = atGroupIdx + 1;
      groupedData{atGroupIdx} = sortedData(grpStartIdx:grpStopIdx,:);
      % update next group indices
      grpStartIdx = k;
      grpStopIdx  = k;
   end
 end
 % copy the last group
 atGroupIdx = atGroupIdx + 1;
 groupedData{atGroupIdx} = sortedData(grpStartIdx:grpStopIdx,:);

The above may contain a little more code than you wish (and is not as neat as using the built-in MATLAB functions), but each group of near-identical R data (as determined by myTol) is combined in the cell array groupedData for further processing.
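For comparison, the same grouping can be sketched without an explicit loop. This is an alternative technique using diff, cumsum, accumarray, and mat2cell, not the code from the answer, and the sample data and the names isNewGroup, groupId, groupSize, and repeatedGroups are made up for illustration.

```matlab
% Hypothetical sample data (column vectors), not the poster's data set
R  = [2.000001; 1.0; 3.5; 2.0; 1.000002; 5.0];
Z  = [10; 20; 30; 40; 50; 60];
TH = [0.1; 0.2; 0.3; 0.4; 0.5; 0.6];
myTol = 0.00001;

sortedData = sortrows([R Z TH], 1);

% A new group starts wherever the gap to the previous R is not below myTol
isNewGroup = [true; abs(diff(sortedData(:,1))) >= myTol];
groupId    = cumsum(isNewGroup);       % group number for every row
groupSize  = accumarray(groupId, 1);   % number of rows in each group

% One cell per group, laid out like groupedData in the loop version
groupedData = mat2cell(sortedData, groupSize, size(sortedData,2)).';

% Keep only the groups in which an R value is repeated
repeatedGroups = groupedData(groupSize > 1);
zOfFirstGroup  = repeatedGroups{1}(:,2);   % Z values of the first such group
```

As in the loop, rows are chained by comparing consecutive sorted values, so a long run of closely spaced values can end up in one group even if its first and last R differ by more than myTol.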

Comments (the 2 most recent of 8)

Sofya on 24 Jul 2014

Hi Geoff, Thanks for your quick reply!

If I try to extract data with the indices idcs, I think idcs(idcs==k) = []; doesn't help, as at the end length(new)=length(sortedData), so it repeats data with the same indices.

for k=1:length(sortedData)
    % get indices of all elements in first column that satisfy
    % the absolute difference being less than the tolerance
    idcs = find(abs(sortedData(k,1)-sortedData(:,1)) < myTol);
    % idcs has at least one index, k, so we have to remove it
    idcs(idcs==k) = [];
    % if idcs is non-empty, then we have some work to do!
    if ~isempty(idcs)
        % do something for each index in idcs
        new{k}=sortedData(idcs,1);
    end
end

Do you know where the problem is?

Geoff Hayes on 24 Jul 2014

Sofya - where (and why) in the code are you comparing length(new)==length(sortedData)? (With == and not =.) Have you stepped through the code and made sure that idcs(idcs==k) = []; does reduce the idcs vector by one with the removal of the k index?
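One possible explanation, sketched on made-up data rather than the poster's: because each result is stored at position k, the cell array grows to the largest k that had a match, whatever idcs(idcs==k) = []; removes. The vector sortedR and the variable emptyCells are hypothetical names used only for this illustration.

```matlab
sortedR = [1.0; 1.000002; 2.0; 3.0; 3.000001];   % hypothetical sorted R values
myTol   = 0.00001;
new     = {};

for k = 1:size(sortedR,1)
    idcs = find(abs(sortedR(k) - sortedR) < myTol);
    idcs(idcs == k) = [];          % drop the row's own index
    if ~isempty(idcs)
        new{k} = sortedR(idcs);    % stored at position k, not appended
    end
end

nCells     = numel(new);                 % largest k that was assigned
emptyCells = cellfun(@isempty, new);     % true where row k had no near match
```

Here nCells equals the number of rows because the last row has a near match, and only the third cell is empty. Each pair also appears twice, once from each of its members, which may be the repetition described above.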

Answer 2

the cyclist on 2 Jul 2014

I think you probably want to use the ismember() command.

Comments (the 2 most recent of 3)

Sofya on 2 Jul 2014

Following your kind advice I have tried this command:

u=unique(R);
[counts, bins] = histc(R, u);
[u counts];
[R bins];
R(ismember(bins, find(counts > 1)))

This correctly returns the values of R with all unique (non-repeated) values eliminated. But how can I then get the positions of these values in R?

the cyclist on 2 Jul 2014

I have to admit I am not digging into the details here (sorry!), but note that the ismember command has a second output argument, which is an index that might help you. See

doc ismember

for details.
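A hedged sketch of how the positions might be obtained, using find on a logical mask together with the second output of ismember. The sample R and Z values are made up, and counting with the third output of unique plus accumarray is a substitution for the histc call in the comment above, not the poster's code.

```matlab
% Hypothetical sample data (column vectors)
R = [0.5; 1.2; 0.5; 3.0; 1.2; 0.5];
Z = [10; 20; 30; 40; 50; 60];

[u, ~, ic] = unique(R);         % ic(k) is the row of u that equals R(k)
counts = accumarray(ic, 1);     % how many times each distinct value occurs
mask   = counts(ic) > 1;        % true where R(k) is a repeated value
pos    = find(mask);            % positions of the repeated values in R

% Second output of ismember: which repeated value each element matches
% (0 where the element is not a repeated value)
[tf, loc] = ismember(R, u(counts > 1));

% Positions and Z values for one particular repeated value, here 0.5
posHalf = find(R == 0.5);
zHalf   = Z(posHalf);
```

Note that this relies on exact equality, so it only finds values that are bit-for-bit identical; values that merely agree to several decimals need a tolerance-based grouping instead.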
