How to remove duplicate NaN values from an X Y data set

5 views (last 30 days)
I have a large data set a simplified version is data=[1,2;3,4;nan,nan;5,6;nan,nan;nan,nan;10,11]. I need to remove any rows of NaN that are repeated more than once so the above example would become data=[1,2;3,4;nan,nan;5,6;nan,nan;10,11]. The sequence is important
1 2
3 4
NaN NaN
5 6
NaN NaN
NaN NaN
10 11
should become
1 2
3 4
NaN NaN
5 6
NaN NaN
10 11

Accepted Answer

Andrei Bobrov
Andrei Bobrov on 20 Oct 2014
Edited: Andrei Bobrov on 20 Oct 2014
data=[1,2;3,4;nan,nan;5,6;nan,nan;nan,nan;7,8];
d = all(isnan(data),2);
n = [0;diff(d)];
k = find(n == -1);
m = find(n == 1);
d = d + 0;
d(k) = m(1:numel(k)) - k;
out = data(cumsum(d) <= 1,:)
other way
data=[1,2;3,4;nan,nan;5,6;nan,nan;nan,nan;7,8];
d = all(isnan(data),2);
data(strfind(d(:)',[1,1]),:) = [];

More Answers (1)

Hikaru
Hikaru on 20 Oct 2014
There might be more efficient ways to do this, but the following code works.
data=[1,2;3,4;nan,nan;5,6;nan,nan;nan,nan;10,11]
data(isnan(sum(data,2)),:)=[]
  2 Comments
Evan
Evan on 20 Oct 2014
Thanks Hikaru, this is not the outcome I need. The data set must still contain NaN but no sequential duplicates, the output I require is in my original post labeled 'should become'. Thanks again.
Hikaru
Hikaru on 20 Oct 2014
Sorry, didn't catch that earlier. Maybe you can use this approach.
a=[1,2;3,4;nan,nan;5,6;nan,nan;nan,nan;10,11];
[r,c] = size(a)
b = reshape(a,1,r*c)
ind = isnan(a(:))'
repeatednan = strfind([0, ind], [1 1])
b(repeatednan)=[]
reshape(b,r-length(repeatednan)/2,c)

Sign in to comment.

Categories

Find more on Characters and Strings in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!