How to determine the height of a normal distribution fit from e.g. histfit

I have produced a histogram with normal distribution fit using the histfit function.
I want to overplot a different normal distribution whose mean and standard deviation I know.
This is easy enough, but the scaling is all wrong. How can I adjust the height of this second normal distribution so that its height equals that of the normal distribution produced by histfit?

Accepted Answer

Lukas on 15 Jul 2014
Edited: Lukas on 15 Jul 2014
Hello Daniel,
The integral of a normal probability density (call it dist1) from minus infinity to infinity is 1. The curve drawn by histfit (call it dist2) is that density scaled to the histogram, so its integral from minus infinity to infinity equals the number of data points times the width of one bin. That product is the scaling factor. There are two options: scale dist1 up, or scale dist2 down.
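As a quick numerical check of this argument, here is a minimal sketch. It assumes that histfit bins the data the same way hist does for a scalar bin count, and it writes the normal density out by hand so that no toolbox is needed; the variable names are only illustrative.

```matlab
n = 1000;                                   % number of data points
data = randn(1, n);                         % random data for the showcase
nbins = 20;                                 % number of bins
[counts, centers] = hist(data, nbins);      % bin counts and bin centers
binWidth = centers(2) - centers(1);         % all bins have the same width
histArea = sum(counts) * binWidth;          % area covered by the histogram bars
scalingFactor = n * binWidth;               % factor that maps a pdf onto the count scale

mu = 0;
sigma = 1;
xx = linspace(-10, 10, 2001);               % wide enough that the tails are negligible
yy = exp(-0.5*((xx - mu)/sigma).^2) / (sigma*sqrt(2*pi));   % normal pdf by hand
curveArea = trapz(xx, scalingFactor * yy);  % area under the scaled density

fprintf('histogram area: %.4f, scaled pdf area: %.4f\n', histArea, curveArea);
```

Both areas should agree with n times the bin width, which is why multiplying a density by that factor puts it on the same scale as the histogram.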
Solution 1: scale dist1 up
n = 1000; %number of data points
data = randn(1,n); %generate random data for the showcase
nbins = 20; %number of bins
h = histfit(data,nbins); %do the histfit and save the handle
hold on;
mu = 0;
sigma = 1;
xx = linspace(min(data),max(data),500);
yy = normpdf(xx,mu,sigma);
X = get(h(1),'XData'); %bar vertices; this workaround assumes XData is a matrix with one column per bar
lengthBin = X(3,1)-X(2,1); %width of the first bar, i.e. the width of one hist bin
scalingFactor = n * lengthBin; %This is the needed scaling factor
plot(xx,scalingFactor*yy,'g','Linewidth',get(h(2),'LineWidth')); %plot the second distribution in green with a good linewidth
legend('histplot','fitted distribution','real distribution'); %make legend
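The bin width in Solution 1 is read from the bar's XData, which depends on how the bar object stores its data. A possible alternative, sketched here under the assumption that histfit uses nbins equal-width bins spanning the minimum to the maximum of the data (as hist does for a scalar bin count), is to compute the width from the data directly:

```matlab
n = 1000;                                       % number of data points
data = randn(1, n);                             % random data for the showcase
nbins = 20;                                     % number of bins
binWidth = (max(data) - min(data)) / nbins;     % width of one bin, from the data range
scalingFactor = n * binWidth;                   % same scaling factor as in Solution 1

% Cross-check against the bin centers that hist reports for the same input
[~, centers] = hist(data, nbins);
widthFromCenters = centers(2) - centers(1);
fprintf('bin width from range: %.6f, from centers: %.6f\n', binWidth, widthFromCenters);
```

With this scalingFactor, the plot call from Solution 1 can be used unchanged.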
Solution 2: scale dist2 down
Sorry, the code posted here is something of a workaround, because what I really wanted to do didn't work. I will post the non-working solution below (Solution 3); it would be very kind if someone could tell me why it doesn't work.
n = 1000; %number of data points
data = randn(1,n); %generate random data for the showcase
nbins = 20; %number of bins
h = histfit(data,nbins); %do the histfit and save the handle
mu = 0;
sigma = 1;
xx = linspace(min(data),max(data),500);
yy = normpdf(xx,mu,sigma);
X = get(h(1),'XData'); %this is a workaround to get the length of one hist bin
Y = get(h(1),'YData');
xFitDist = get(h(2),'XData'); %get x and y data of the fitted distribution
yFitDist = get(h(2),'YData');
centers = unique(mean(X,1)); %calculate the centers of the bars
counts = Y(2,:);%calculate the height of the bars
lengthBin = X(3,1)-X(2,1);
scalingFactor = n * lengthBin; %This is the needed scaling factor
bar(centers,counts/scalingFactor,1) %plots the normalized bars
hold on;
plot(xFitDist,yFitDist/scalingFactor,'r','Linewidth',3); %plot the fitted and normalized distribution
plot(xx,yy,'g','LineWidth',3); %plot your distribution on top
legend('histplot','fitted distribution','real distribution'); %make legend
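The normalization behind Solution 2 can also be sketched without touching any graphics handles. This version assumes the same binning as hist with a scalar bin count; the names density and totalArea are only illustrative.

```matlab
n = 1000;                                   % number of data points
data = randn(1, n);                         % random data for the showcase
nbins = 20;                                 % number of bins
[counts, centers] = hist(data, nbins);      % bin counts and bin centers
binWidth = centers(2) - centers(1);         % width of one bin
density = counts / (n * binWidth);          % bar heights on the probability-density scale
totalArea = sum(density) * binWidth;        % equals 1 up to rounding
fprintf('area under the normalized histogram: %.6f\n', totalArea);
```

Calling bar(centers, density, 1) then draws the normalized bars, and an unscaled normal density can be plotted on top. In newer MATLAB releases, histogram(data, nbins, 'Normalization', 'pdf') draws a density-scaled histogram directly, although its bin edges may differ from those histfit chooses.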
Solution 3: not working
n = 1000; %number of data points
data = randn(1,n); %generate random data for the showcase
nbins = 20; %number of bins
h = histfit(data,nbins); %do the histfit and save the handle
mu = 0;
sigma = 1;
xx = linspace(min(data),max(data),500);
yy = normpdf(xx,mu,sigma);
X = get(h(1),'XData'); %this is a workaround to get the length of one hist bin
lengthBin = X(3,1)-X(2,1);
scalingFactor = n * lengthBin; %This is the needed scaling factor
% scale current plot. The mistake must be here because it always destroys
% the bar graph
set(h(1),'YData',get(h(1),'YData')/scalingFactor); %take the old YData, scale it and put it back
set(h(2),'YData',get(h(2),'YData')/scalingFactor); %same as above
hold on;
plot(xx,yy,'g','LineWidth',get(h(2),'LineWidth')); %plot your distribution on top. no scaling, as the others are scaled
legend('histplot','fitted distribution','real distribution'); %make legend
The resulting figure is obviously wrong.
I hope I was able to help you and that my answer is correct, as this is my first answer here. I would be really glad if someone could tell me why the third code does not work (the plot is ugly).
Have a nice day, Lukas
  3 Comments
Lukas on 15 Jul 2014
You are welcome.
Could you be a bit more precise? I am not able to fully understand your question.
Daniel on 15 Jul 2014
What I mean is: can I take a Gaussian distribution with fixed mean and standard deviation and use least-squares fitting (or something similar) to fit it to my histogram, independently of histfit? In this way, I would be determining the height that minimises a chi-squared value, instead of scaling it to match a different distribution.
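One way to read this question, sketched here as an assumption rather than as a method taken from the thread: keep mu and sigma fixed, model the bin counts as A times the normal density at the bin centers, and choose the single factor A by ordinary (unweighted) least squares. For a one-parameter linear model this has a closed form, so no iterative fit is needed. The binning again assumes hist with a scalar bin count.

```matlab
n = 1000;                                   % number of data points
nbins = 20;                                 % number of bins
mu = 0;                                     % fixed, known mean
sigma = 1;                                  % fixed, known standard deviation
data = mu + sigma * randn(1, n);            % random data for the showcase
[counts, centers] = hist(data, nbins);      % bin counts and bin centers

% Normal density with the fixed parameters, evaluated at the bin centers
g = exp(-0.5*((centers - mu)/sigma).^2) / (sigma*sqrt(2*pi));

% Minimise sum((counts - A*g).^2) over A; setting the derivative to zero gives
A = sum(counts .* g) / sum(g.^2);

sse = @(a) sum((counts - a*g).^2);          % sum of squared residuals for a given factor
peakHeight = A / (sigma*sqrt(2*pi));        % height of the fitted curve at x = mu
binWidth = centers(2) - centers(1);
fprintf('fitted factor: %.2f, n*binWidth: %.2f\n', A, n*binWidth);
```

The curve to draw is then A times the density, for example plot(xx, A*yy) with xx and yy as in the answer above. A chi-squared version would weight each bin by an assumed variance v, giving A = sum(counts.*g./v) / sum(g.^2./v); the choice of v is an extra modelling assumption.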
