Is there an easy way to remove outliers from a graph and report which values were deleted?
I am working on correlations in some data to see whether there is a linear relationship. I want to remove some outliers, but I don't know the best way to go about it. I have seen functions such as deleteoutliers (<http://www.mathworks.com/matlabcentral/fileexchange/3961-deleteoutliers>), but I want to find outliers with respect to the correlation between two variables, not just within a single set of numbers. I need code that tells me which points are outliers and then helps me remove them, so that I end up with a new data set without outliers. Is there an easy way to do this given x and y values?
I have been using Cook's distance to identify outliers, but I don't know how to remove them and make a new table of values.
% Cook's distance
X=x;
Y=y;
% Use regstats to calculate Cook's Distance
stats = regstats(Y,X,'linear');
% Cook's Distance > 4/n (n = number of observations) is a typical threshold
% used to suggest the presence of an outlier
potential_outlier = stats.cookd > 4/length(X);
% Display the index of potential outliers and graph the results
X(potential_outlier)
plot(X,Y, 'b.')
hold on
plot(X(potential_outlier),Y(potential_outlier), 'r.')
pause
Answers (1)
John Knollmeyer
on 26 Jul 2016
If you just need to use potential_outlier to filter the outliers out of your X and Y points, index with its logical negation (~potential_outlier) to get the non-outlying points:
non_outlier_x = X(~potential_outlier);
non_outlier_y = Y(~potential_outlier);
plot(non_outlier_x, non_outlier_y, 'b*');
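To also see which values were removed and to collect the remaining points in a new table, something along these lines might work. This is only a sketch: the sample data, the injected outlier, and the names removed and cleaned are made up for illustration, and it assumes the Statistics and Machine Learning Toolbox (for regstats) and a MATLAB release that has the table type.

```matlab
% Hypothetical sample data: an exact linear trend with one outlier injected.
X = (1:20)';
Y = 2*X + 1;
Y(10) = 80;                       % illustrative outlier, not real data

% Cook's distance from a linear fit, using the same 4/n threshold as the question.
stats = regstats(Y, X, 'linear', 'cookd');
potential_outlier = stats.cookd > 4/length(X);

% Table of the removed points, keeping their original row indices.
removed = table(find(potential_outlier), X(potential_outlier), Y(potential_outlier), ...
    'VariableNames', {'Index', 'X', 'Y'});
disp(removed)

% Table of the remaining (non-outlying) points.
cleaned = table(X(~potential_outlier), Y(~potential_outlier), ...
    'VariableNames', {'X', 'Y'});
```

With your own data, replace the first three statements with your x and y vectors (as columns). Note that removing points changes the fit, so the Cook's distances of the remaining points will differ if you recompute them on the cleaned data.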