nanmean returns -Inf

balsip on 5 Sep 2017
Edited: John BG on 6 Sep 2017
Vector A is a 7627x1 vector with 150 numeric values, 10 of which are negative; the remaining elements are NaNs.
Calling nanmedian(A) returns a finite value (2.7462). Calling nanmean(A) returns -Inf. Why does nanmedian return a finite value when nanmean does not? How do I get a finite value from nanmean?
Of interest may be how I am arriving at vector A:
A=(X1./Y1)./(X2./Y2);
I also tried a for loop unsuccessfully. It returned the same results as the non-for loop version above:
for i=1:length(X1)
A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));
end
X1 and X2, are 7627x1, mostly negative real numbers. Y1 and Y2, are 7627x1, all positive real numbers.
  2 Comments
balsip on 5 Sep 2017
Semi-solved:
One X2 value was 0 on the nose, exploding the denominator for that instance of the equation.
This still doesn't tell me why nanmedian returned a real number when nanmean returned (-Inf).
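A minimal sketch of how a single exact zero in X2 produces -Inf in A. The three-element vectors below are made-up illustrative values, not the original data:

```matlab
% Illustrative values only; not the poster's data.
X1 = [-3; -4; -6];   % negative, as most of X1 is in the question
Y1 = [ 2;  2;  3];   % all positive
X2 = [-1;  0; -2];   % second element is exactly zero
Y2 = [ 5;  5;  4];   % all positive

% X2(2)./Y2(2) is 0, so A(2) = (-2)/0, which IEEE arithmetic evaluates to -Inf
A = (X1./Y1)./(X2./Y2);

infMask = isinf(A);  % logical mask marking the infinite element

disp(A.')
disp(infMask.')
```

Only the element whose X2 value is zero becomes infinite; the other elements stay finite.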
Walter Roberson on 5 Sep 2017
nanmean for a vector x is the same as
t = x(~isnan(x));
result = mean(t)
so any +/-Inf in the data is unaffected by the removal of the NaNs. mean() of data that includes +/-Inf is +/-Inf if all of the Inf values have the same sign, and NaN otherwise. A single +/-Inf is enough to make the sum, and therefore the mean, infinite.
nanmedian of the vector x is the same as
t1 = x(~isnan(x));
t2 = sort(t1);
if mod(length(t2), 2) == 0
result = 1/2 * (t2(end/2)+t2(end/2+1))
else
result = t2( ceil(end/2) );
end
This will produce +/- inf only if at least half of the values are +/- inf . In the case of a vector with the same number of +inf and -inf and no other values, the result could be nan due to the attempt to average the -inf and +inf that would then be the two central elements. At least half infinite is needed in order for the middle elements after sort to end up being infinite for an infinite result.
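The two equivalents above can be tried on a small vector. This is only a sketch: the vector x is made up for illustration, and the last two lines assume a MATLAB release in which mean and median accept the 'omitnan' flag.

```matlab
% Illustrative vector: four finite values, one -Inf, two NaNs.
x = [NaN 2 -Inf 3 NaN 1 4];

t = x(~isnan(x));            % drop the NaNs; the -Inf stays in the data

meanNoNan   = mean(t);       % the sum includes -Inf, so the mean is -Inf
medianNoNan = median(t);     % sorted t is [-Inf 1 2 3 4]; the middle element is 2

% Assumed available in the release in use: flag that skips NaNs in base MATLAB
meanOmit   = mean(x, 'omitnan');
medianOmit = median(x, 'omitnan');

fprintf('mean = %g, median = %g\n', meanNoNan, medianNoNan);
```

The single -Inf drags the mean to -Inf but only shifts the median by one position in the sorted data.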


Accepted Answer

John BG on 5 Sep 2017
Edited: John BG on 6 Sep 2017
Hi Balsip
1.
With only NaNs and finite values in A:
A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN NaN NaN -20 NaN NaN -3];
mean(A)
median(A)
nanmean(A)
nanmedian(A)
ans =
NaN
ans =
NaN
ans =
-8.1667
ans =
-6.5000
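as expected, mean and median return NaN because A contains NaNs, whereas nanmean and nanmedian ignore the NaNs and match the results for the finite values alone: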
mean([-12 -1 -5 -8 -20 -3])
=
-8.1667
median([-12 -1 -5 -8 -20 -3])
=
-6.5000
.
2.
However, when 1/0 (which evaluates to Inf) is an element of A:
A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN 1/0 NaN -20 NaN NaN -3];
mean(A)
median(A)
nanmean(A)
nanmedian(A)
ans =
NaN
ans =
NaN
ans =
Inf
ans =
-5
The result is the same when Inf is entered directly as an element of A:
A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN Inf NaN -20 NaN NaN -3];
mean(A)
median(A)
nanmean(A)
nanmedian(A)
ans =
NaN
ans =
NaN
ans =
Inf
ans =
-5
.
3.
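nanmean is dominated by Inf values: a single Inf makes the mean infinite. nanmedian also sees the Inf, but one Inf only shifts the middle position of the sorted data by one element, so the result stays finite. The Inf elements in A appear because Y1 and/or X2 contain one or more zeros, so that a division by zero occurs in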
A=(X1./Y1)./(X2./Y2);
.
4.
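To avoid this, either replace the infinite elements of A with NaN directly (isinf catches both +Inf and -Inf, whereas A==Inf would miss the -Inf in your case)
A(isinf(A))=NaN;
or remove the zeros from Y1 and X2
tol=.000001;
Y1(Y1==0)=tol;
X2(X2==0)=tol;
where tol is a small positive number. Be aware that dividing by a very small tol produces a very large quotient, which can still dominate the mean, so replacing with NaN is usually the safer option.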
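One way to handle NaN, +Inf and -Inf in a single step is to keep only the finite elements with isfinite. A minimal sketch, reusing the finite values from the example vector in part 1 with a +Inf and a -Inf added for illustration:

```matlab
% Illustrative vector with NaN, +Inf and -Inf entries.
A = [NaN -12 -Inf -1 -5 NaN Inf -8 -20 -3];

% isfinite is false for NaN, +Inf and -Inf, so only ordinary numbers remain
finiteVals = A(isfinite(A));

m   = mean(finiteVals);      % mean of [-12 -1 -5 -8 -20 -3]
med = median(finiteVals);    % median of the same six values

fprintf('mean = %.4f, median = %.4f\n', m, med);
```

This leaves A itself untouched, which can be useful if the positions of the infinite elements still need to be inspected afterwards.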
5.
It could also be that elements of Y2 and/or X1 are Inf, but I assumed from the question that this is not the case, so that only one or more elements of Y1 and X2 are zero.
6.
Balsip, please note that although
A=(X1./Y1)./(X2./Y2);
and
for i=1:length(X1)
A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));
end
are mathematically the same, the compact expression is considerably faster than the for loop (roughly six times faster in the timing below)
L=1e7;
X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);
X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);
tic
A=(X1./Y1)./(X2./Y2);
toc
Elapsed time is 0.038204 seconds.
tic
for i=1:length(X1)
A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));
end
toc
Elapsed time is 0.236449 seconds.
.
The vectorised ./ operator is optimised and outperforms the for loop you tried as a possible solution.
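A short sketch that checks the two forms agree and times them in one script. The problem size is reduced for illustration, the variable names Avec, Aloop, tVec and tLoop are introduced here, and the loop output is preallocated explicitly, which is an assumption about how the loop would normally be written. Absolute timings will vary by machine and release.

```matlab
% Smaller illustrative problem size than in the timing above.
L  = 1e5;
X1 = randi([1 1e4], 1, L);  Y1 = randi([1 1e4], 1, L);
X2 = randi([1 1e4], 1, L);  Y2 = randi([1 1e4], 1, L);

% Vectorised form
tic
Avec = (X1./Y1)./(X2./Y2);
tVec = toc;

% Loop form, with the output preallocated so the array does not grow per iteration
Aloop = zeros(1, L);
tic
for i = 1:L
    Aloop(i) = (X1(i)/Y1(i))/(X2(i)/Y2(i));
end
tLoop = toc;

% Largest element-wise difference; expected to be zero or at rounding level
maxDiff = max(abs(Avec - Aloop));

fprintf('vectorised: %.6f s, loop: %.6f s, max difference: %g\n', tVec, tLoop, maxDiff);
```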
.
Balsip
if you find this answer useful would you please be so kind to consider marking my answer as Accepted Answer?
To any other reader, if you find this answer useful please consider clicking on the thumbs-up vote link
thanks in advance
John BG
  2 Comments
balsip on 5 Sep 2017
Thank you, John. I appreciate the detail in your answer.
The for loop was just a shot in the dark to rule out any odd behavior in the ./ operator, but I take your point about how expensive it would have been.
I've eliminated my initial problem with the following code:
for i=1:length(A)
if X2(i)==0
A(i)=NaN;
end
end
John BG on 6 Sep 2017
Edited: John BG on 6 Sep 2017
happy to help.
Again, consider using the following
A(X2==0)=NaN;
instead of the for loop you have built involving X2 and A, the reason being
L=5e7;
X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);
X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);
A=(X1./Y1)./(X2./Y2);
tic
for i=1:length(A)
if X2(i)==0
A(i)=NaN;
end
end
toc
Elapsed time is 0.277053 seconds.
L=5e7;
X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);
X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);
A=(X1./Y1)./(X2./Y2);
tic
A(X2==0)=NaN;
toc
Elapsed time is 0.072252 seconds.
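Note that randi([1 1e4],1,L) never generates a 0, so in the two timings above neither version actually assigns a NaN. A small sketch with zeros deliberately placed in X2 (made-up values, and variable names Aloop and Amask introduced here) confirms that the loop and the logical-indexing form give the same result:

```matlab
% Illustrative values with two exact zeros in X2.
X1 = [-3 -4 -6 -2];  Y1 = [2 2 3 1];
X2 = [-1  0 -2  0];  Y2 = [5 5 4 2];
A  = (X1./Y1)./(X2./Y2);         % elements 2 and 4 are -Inf

% Loop version from the comment above
Aloop = A;
for i = 1:numel(Aloop)
    if X2(i) == 0
        Aloop(i) = NaN;
    end
end

% Logical-indexing version
Amask = A;
Amask(X2 == 0) = NaN;

same = isequaln(Aloop, Amask);   % isequaln treats NaN as equal to NaN

% Mean of the remaining finite elements, skipping the NaNs
result = mean(Amask(~isnan(Amask)));

fprintf('same = %d, mean = %g\n', same, result);
```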
