Inf resulted when calculating mean

I am calculating the mean P/E ratio for each year from 1993 to 2021. I have rounded my data set, drdata, to 2 decimal places. However, stockpe comes out as Inf for every year. I don't understand why this happens, since I have already rounded the data.
%drdata column 2 is date
%drdata column 4 is the pe ratio data
[year,~,~] = datevec(728110)
%storing the years in y (months and dates are not necessary)
%dates are stored in the 2nd column
[y,~,~] = datevec(drdata(:,2));
%calculating mean pe for each year, pe is in the 4th column
unique(y)
mask = y >= 1993 & y <= 2021;
nnz(mask)
nnz(isnan(drdata(mask,4)))
for k=1993:2021
stockpe(k-1992) = round(mean(drdata(y==k,4)),2);
%1st element will correspond to 1993, 2nd - 1994 and so on
end

4 Comments

unique(y)
nnz(mask)
nnz(isnan(drdata(mask,4)))
Those three lines are debugging steps, not commands needed by the working code. Please tell us what the output of each of those three lines was.
You would get NaN (not Inf) if no entries in y were in the range 1993 to 2021, because mean() of an empty selection is NaN.
unique(y) shows each year 1993-2021.
nnz(mask) shows 1529482
nnz(isnan(drdata(mask,4))) shows 0
Thanks!!
temp = drdata(y==1993,4);
size(temp)
min(temp), max(temp)
Please show the output of these debugging commands
temp = 33479 x 1 double
size(temp) = 33479 1
min(temp) = -2950
max(temp) = Inf

 Accepted Answer

You have Inf in your data. mean() of the data is sum() of the data divided by the number of elements. A sum() that includes Inf is going to be Inf (unless the data also includes NaN or -Inf, in which case the result is NaN), and Inf divided by a finite number is Inf. Rounding does not change that.
If the Inf values represent missing data, delete those entries before computing the mean().
Also consider using grpstats() or splitapply()
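A minimal sketch of the point above, using a small made-up vector (the values and the names pe, yr, keep are illustrative, not the asker's data). It shows that a single Inf drives mean() to Inf, that rounding does not change that, and that filtering with isfinite() before averaging gives a finite result. The last lines sketch the splitapply() route for per-year means, assuming findgroups() is available (base MATLAB R2015b or later).

```matlab
% Illustrative data only: four pe values, one of which is Inf
pe = [10; 20; Inf; 30];
yr = [1993; 1993; 1993; 1994];

m_all = mean(pe);              % Inf, because sum(pe) is Inf
m_rounded = round(m_all, 2);   % still Inf; rounding does not remove Inf

% Keep only finite entries before averaging
keep = isfinite(pe);
m_finite = mean(pe(keep));     % (10 + 20 + 30) / 3 = 20

% Per-year means without a loop, on the finite entries only
[g, years] = findgroups(yr(keep));
yearly = splitapply(@mean, pe(keep), g);   % one mean per entry of years
```

Here years lists the distinct years that remain after filtering, and yearly holds the matching means in the same order.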

6 Comments

Thank you for your reply!!
However, I don't really get it. Do you mean that there is empty space in the pe column?
Or do you mean that the sum divided by the number of data points results in an infinite number (with many decimal places)? But I have rounded the answer:
round(mean(drdata(y==k,4)),2);
Please attach your data in a .mat file so that we have something to work with. Make it easy for us to help you, not hard.
I saved the data to a .mat file, but the data set is too large; it exceeded the 5 MB limit.
Do you mean that there is empty space in the pe column?
No, if there was empty space in the pe column and you use readmatrix() or readtable(), the empty space would show up as NaN.
You must have actual entries of "inf" in the data.
Or do you mean that the sum/no. of data results in infinite number
No, you have actual inf in the data.
% rows of drdata from 1993 whose pe value is Inf, -Inf, or NaN
row_number = find(y == 1993 & ~isfinite(drdata(:,4)));
if isempty(row_number)
    fprintf('congratulations, all 1993 data is finite\n');
else
    value1993 = drdata(row_number, 4);
    bad_locations_1993 = table(row_number, value1993);
    fprintf('oops, some 1993 data is not finite!\n');
    disp(bad_locations_1993)
end
Thank you very much! With this code I was able to find the Inf values and replace them with NaN. The code works now. Thanks a lot!
I would suggest
mask = ~isfinite(drdata(:,4));
drdata(mask,4) = nan;
You do not need to loop.
Or you could use
drdata = standardizemissing(drdata, inf);
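As a sketch of how these suggestions could fit into the yearly calculation, the example below sets the non-finite entries to NaN and then skips them with the 'omitnan' flag of mean(). The small matrix is made up and only mimics the layout described in the question (date in column 2, pe in column 4). As far as I can tell, standardizemissing(drdata, inf) matches only +Inf, so the indicator [Inf -Inf] is used here on the assumption that -Inf might also occur.

```matlab
% Made-up stand-in for drdata: column 2 = datenum, column 4 = pe
drdata = zeros(4, 4);
drdata(:, 2) = datenum([1993 1 4; 1993 6 1; 1994 1 3; 1994 6 1]);
drdata(:, 4) = [10; Inf; 12.344; -Inf];

% Replace Inf and -Inf in the pe column with NaN
drdata(:, 4) = standardizemissing(drdata(:, 4), [Inf -Inf]);

% Extract the year of each row from the date column
[y, ~, ~] = datevec(drdata(:, 2));

years = (1993:1994).';
stockpe = nan(numel(years), 1);   % preallocate; NaN marks "no data"
for k = 1:numel(years)
    % 'omitnan' skips the NaN entries when averaging
    stockpe(k) = round(mean(drdata(y == years(k), 4), 'omitnan'), 2);
end
```

With the illustrative values above, each year has exactly one finite pe value left, so stockpe ends up finite for both years. A year with no finite values at all would still come out as NaN.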
