Calculation of probability using beta, lognormal and Weibull distributions

Question

hina Shakir on 31 Mar 2018

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/391836-calculation-of-probability-using-beta-lognormal-and-weibull-distribution

Answered: Jeff Miller on 5 Apr 2018

Hello,

I have computed three different properties of sample objects and stored the discrete data values in three vectors. I have fit Weibull, lognormal, and beta distributions to these three vectors. Now how can I find the individual probability of each of the data values using their respective fitted distributions? I want to multiply these probabilities for each element to find the joint probability.

Thanks

Answer 1

Jeff Miller on 5 Apr 2018

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/391836-calculation-of-probability-using-beta-lognormal-and-weibull-distribution#answer_313542

"if I want to obtain the likelihood of getting a particular combination of values which represent these 3 properties of another cancerous tumor (not from the dataset), I use my fitted distributions for this purpose and I multiply the computed likelihoods to obtain the joint likelihood."

Several comments on this.

First, multiplication is only appropriate here if the 3 properties are independent. If they are at all correlated with one another, the joint likelihood for any given combination of three is not simply the product of the individual ("marginal") likelihoods. You can check this by looking at the pairwise correlations among the three properties across the 200 tumors.
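As a rough sketch of that check, assuming the three properties are stored as the columns of a 200-by-3 matrix: the variable name props and the synthetic data below are placeholders, not the asker's data, and the choice of Spearman rank correlation is an assumption made here because the properties are not normally distributed. Keep in mind that near-zero correlations are consistent with independence but do not prove it.

```matlab
% Placeholder data: replace with the real 200-by-3 matrix of properties.
rng(1);                                  % reproducible synthetic example
props = [wblrnd(2, 1.5, 200, 1), ...     % property 1 (Weibull-like)
         lognrnd(0, 0.5, 200, 1), ...    % property 2 (lognormal-like)
         betarnd(2, 5, 200, 1)];         % property 3 (beta-like)

% Pairwise rank correlations and p-values for H0: no correlation.
[R, P] = corr(props, 'Type', 'Spearman');
disp(R);   % 3-by-3 correlation matrix, ones on the diagonal
disp(P);   % small off-diagonal p-values argue against independence
```

If any off-diagonal entry of R is clearly nonzero, the product of the three marginal values should not be read as a joint likelihood.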

Second, it sounds like you really do want probabilities rather than likelihoods. For that purpose, you would do best to consider each property as falling within a certain numerical range or "bin". For a first pass, I suggest you classify each tumor as falling above vs. below the median with respect to each of the three properties. This will give you 8 categories of tumors: 2x2x2, and you can count how many (out of 200) you have in each category. With a sample of only 200, I doubt that you have enough data to get usable estimates of the probabilities in more than 8 bins, but you might try 3x3x3 if you are a daredevil.
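One possible way to do the 2x2x2 count, again assuming a 200-by-3 matrix called props (a hypothetical name, filled with synthetic placeholder data here):

```matlab
% Placeholder data: replace with the real 200-by-3 matrix of properties.
rng(1);
props = [wblrnd(2, 1.5, 200, 1), lognrnd(0, 0.5, 200, 1), betarnd(2, 5, 200, 1)];

% 1 = at or below the median, 2 = above the median, for each property.
above = props > median(props, 1);    % 200-by-3 logical (implicit expansion, R2016b+)
idx   = above + 1;                   % 200-by-3 matrix of 1s and 2s

% Count tumors in each of the 2x2x2 cells and convert counts to proportions.
counts = accumarray(idx, 1, [2 2 2]);
probs  = counts / size(props, 1);    % empirical bin probabilities, sum to 1
disp(probs);
```

Here probs(i,j,k) is the observed proportion of tumors in the cell where property 1 is in half i, property 2 in half j, and property 3 in half k. No independence assumption is needed, because the cells are counted jointly.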

Third, it still doesn't sound like you have quite appreciated the difference between likelihoods and probabilities. To maybe make that distinction more meaningful to you, I suggest you do the following: Rescore all of your properties into a different unit of measurement--say, for example, that you divide each property value by 10. I hope you agree that logically this changes nothing. Likewise, if you look at the probabilities in the 2x2x2 bins, you will see that the rescoring has also changed nothing. But if you refit your distributions to the rescored property values, you will see that all of the PDF values (i.e., likelihoods) have increased by a factor of 10. The reason is that the PDF values (likelihoods) must integrate to 1 across the whole range of X, so the PDF values--unlike the corresponding probabilities--depend on the units of measurement for X.
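The rescoring experiment can be tried on a single property with a few lines. This is only a sketch on synthetic lognormal data (the sample, the evaluation point x0, and the interval [2, 4] are arbitrary choices made for illustration):

```matlab
rng(1);
x = lognrnd(1, 0.4, 200, 1);         % placeholder data in the original units
y = x / 10;                          % same data in rescaled units

px = lognfit(x);                     % [mu, sigma] fitted to the original data
py = lognfit(y);                     % [mu, sigma] fitted to the rescaled data

x0 = 3;                              % an arbitrary value in the original units
fx = lognpdf(x0,      px(1), px(2)); % density at x0
fy = lognpdf(x0 / 10, py(1), py(2)); % density at the same point after rescaling
fprintf('density ratio fy/fx = %.4f\n', fy / fx);

% Probability of the interval [2, 4], which becomes [0.2, 0.4] after rescaling.
Px = logncdf(4,   px(1), px(2)) - logncdf(2,   px(1), px(2));
Py = logncdf(0.4, py(1), py(2)) - logncdf(0.2, py(1), py(2));
fprintf('interval probability: %.6f vs %.6f\n', Px, Py);
```

The density ratio should come out as 10 (up to rounding), while the two interval probabilities agree: the density depends on the units, the probability does not.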

Answer 2

John D'Errico on 31 Mar 2018

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/391836-calculation-of-probability-using-beta-lognormal-and-weibull-distribution#answer_312886

Edited: John D'Errico on 31 Mar 2018

I think you are making what is a fairly common mistake about continuous probability distributions.

Suppose you have a 6-sided die, with the numbers 1-6 on the faces. Assuming a fair die, each number will come up equally often. So if you see the number 2, you know the probability of that event was 1/6. This works because the distribution is a discrete one.

But a continuous distribution does not work that way. Consider a normal distribution, for example.

randn
ans =
     0.69551

The probability that 0.69551 would result from a normal distribution is zero. You might say, but we just got that number! How can it have probability zero? But a continuous distribution has probability zero for ANY single event. We can talk about the probability that we will see a value in the interval [0.6,0.7]. That is given by the integral of the PDF over that interval, or we can use the CDF.

normcdf(.7) - normcdf(0.6)
ans =
     0.032289

So the probability that we would have seen an event that lies in the interval [.6,.7] is 0.032289... But the probability of the exact event that we saw is zero.

You might think you can use the pdf.

normpdf(0.69551)
ans =
      0.31323

I'm sorry. That is NOT a probability. Even though it comes from the probability density function, normpdf does not compute a probability.
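Two quick checks may help make this concrete. The sketch below uses arbitrary illustrative numbers (a standard deviation of 0.1 and an interval width of 0.001): a density value can be larger than 1, and it only turns into a probability once it is multiplied by (or integrated over) an interval width.

```matlab
% A narrow normal distribution: its density at the mean exceeds 1,
% so a PDF value cannot be a probability.
sigma = 0.1;
peak  = normpdf(0, 0, sigma);            % about 3.99

% For a narrow interval around x, density * width approximates the
% probability of landing in that interval.
x  = 0.69551;
dx = 0.001;
exact  = normcdf(x + dx/2) - normcdf(x - dx/2);
approx = normpdf(x) * dx;
fprintf('peak density = %.4f\n', peak);
fprintf('exact = %.8f, approx = %.8f\n', exact, approx);
```

As dx shrinks toward zero, both exact and approx shrink toward zero as well, which is the sense in which any single value has probability zero.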

Perhaps you understand all of this. Your question suggests that you do not. I would suggest you should probably read this:

https://en.wikipedia.org/wiki/Probability_density_function

Usually, when people want to do what you are doing, they end up wanting to use the CDF of the respective distribution in some way. It might be for an MLE computation, or whatever.

So IF you do understand the difference, and how to compute probability over some interval from a CDF, then you can use tools from the stats toolbox, such as normcdf, logncdf, betacdf, and wblcdf. (I never get the last name right. I always want to type weibcdf.) If you do not have the stats toolbox, then there are still ways to compute the CDF for these distributions, via an appropriate transformation from one of several special functions.
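For the three distributions in the question, a minimal sketch might look like the following. It assumes the Statistics and Machine Learning Toolbox; the samples a, b, c and the interval endpoints are made-up placeholders, not values from the question.

```matlab
rng(1);
% Placeholder samples standing in for the three measured properties.
a = wblrnd(2, 1.5, 200, 1);
b = lognrnd(0, 0.5, 200, 1);
c = betarnd(2, 5, 200, 1);

% Fit each distribution by maximum likelihood.
pw = wblfit(a);      % [scale, shape]
pl = lognfit(b);     % [mu, sigma]
pb = betafit(c);     % [a, b]

% Probability that each property falls in a chosen interval [lo, hi].
Pa = wblcdf(2.0, pw(1), pw(2))  - wblcdf(1.0, pw(1), pw(2));
Pb = logncdf(1.5, pl(1), pl(2)) - logncdf(0.5, pl(1), pl(2));
Pc = betacdf(0.4, pb(1), pb(2)) - betacdf(0.2, pb(1), pb(2));

% The product is a joint probability ONLY if the properties are independent.
Pjoint = Pa * Pb * Pc;
fprintf('Pa = %.4f, Pb = %.4f, Pc = %.4f, product = %.4f\n', Pa, Pb, Pc, Pjoint);
```

Each of Pa, Pb, and Pc is a genuine probability because it is a difference of CDF values over an interval, not a PDF value at a point.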

6 Comments

Jeff Miller on 3 Apr 2018

I think that in order to get any correct advice, you will need to give us an answer to John's questions: "exactly why are you trying to compute the probability of any given sample? What are you trying to do?"

hina Shakir on 4 Apr 2018

I have computed 3 properties of 200 cancer tumors in 3 data sets. Each dataset of 200 numerical values indicates a trend that the particular property follows in the tumors. Now if I want to obtain the likelihood of getting a particular combination of values which represent these 3 properties of another cancerous tumor (not from the dataset), I use my fitted distributions for this purpose and I multiply the computed likelihoods to obtain the joint likelihood. I hope I explained my problem. Thanks
