I resolved this myself, so I am answering my own question for anyone else who comes across it.
I realized the data I was using was incorrect, which is why there is no bell curve for the thermal noise above. With new data, I arrived at the following intensity histograms:
First, "Johnston noise" wasn't really the right name for that haze. I should be calling it thermal noise, which has been tested multiple times and shows very strong correlation.
Second, I thought about removing the noise from the red layer only, but I found that the noise came in as ABOUT 60% red, 35% blue, and 5% green. I also noticed that the thermal noise got gradually stronger around the edge of the image, especially in the corners. For now I decided to crop the outer 10% of the image; later I will replace the crop with a filter so I don't have to discard that part.
Third, my efforts to spread the peaks out, so that I would have more data to work with, never got very far.
- for n ∈ {2.0, 3.0, 4.0, ...}, I tried creating a new matrix as: Data.*n
- for n ∈ {1.5, 2.0, 2.5, ...}, I tried creating a new matrix as: Data.^n
- for n ∈ {1.5, 1.7, 2.0, ...}, I tried creating a new matrix as: n.^(Data)
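A quick way to compare the three transforms above is to check where a histogram peak ends up relative to the full intensity range after each one. The sketch below is in Python rather than MATLAB and rests on assumptions that are not in my data: intensities normalized to [0, 1], a second peak at 0.33, and the example values n = 3 and n = 2. The helper names are made up for illustration.

```python
def relative_position(value, lo, hi, transform):
    """Where `value` sits between `lo` and `hi` (0..1) after applying `transform`."""
    t_lo, t_hi = transform(lo), transform(hi)
    return (transform(value) - t_lo) / (t_hi - t_lo)

# Hypothetical stand-ins for the three element-wise MATLAB expressions.
TRANSFORMS = {
    "Data.*3": lambda x: x * 3.0,    # multiply by n
    "Data.^2": lambda x: x ** 2.0,   # raise the data to the power n
    "2.^Data": lambda x: 2.0 ** x,   # raise n to the power of the data
}

def second_peak_positions(second_peak=0.33, lo=0.0, hi=1.0):
    """Relative position of the (assumed) second peak under each transform."""
    return {
        name: relative_position(second_peak, lo, hi, f)
        for name, f in TRANSFORMS.items()
    }

if __name__ == "__main__":
    for name, pos in second_peak_positions().items():
        print(f"{name}: second peak at {pos:.3f} of the transformed range")
```

Under these assumptions, multiplication leaves the peak at the same fraction of the range, while the two nonlinear transforms push it toward the low end, so the histogram bins between the peaks and the maximum do not gain any extra detail.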
The problem was that as I spread the peaks apart, I was also moving the second peak away from the max value by the same scale. This was disastrous when using the data as an exponent, or even when raising the data to a power. Eventually I ended up with a graph that peaked at 3/1000 and only had about 50 pixels between 5 and 1000 on the histogram.
The best solution I came up with was to add the colour intensities in quadrature, and to average each picture with the rest normally. The layers should be added in quadrature because the thermal noise curve spreads to about 30% before it is about the same scale as the bad pixel noise. When you add in quadrature, the thermal noise stays in the first part of the graph, but the bad pixels are grouped together more and the second peak moves from about 33% of the max intensity to about 50%. Theoretically it looks like this:
Fourth, with the new pictures the histogram fit the normal distribution curve nicely: one very high bell curve for the thermal noise, overlapping a second, very small bell curve for the bad pixels. Rather than finding the minimum in my data as I mentioned above, I realized there is a single correct way to find the point where I should place my threshold.
- If you have two overlapping bell curves: f(x) & g(x)
- To remove as many volume elements of the first as possible: maximize ∫ f(x) dx over the removed range
- But leave as many from the second as possible: minimize ∫ g(x) dx over the same range,
- A good approximation is to remove x ∈ (-∞, x₀], where x₀ is the point at which f(x₀) = g(x₀)
I am currently working on this approach, so I don't have an overlay of the fitted curves to show you. I'm not sure how close I can get the match to be because of all the noise, but I'm trying a few peak-preserving smoothing algorithms before I try to fit the bell curves. The intersection is only an approximation because the higher curve will normally be decaying much more slowly at the intersection than the smaller one, and the result also varies with the standard deviations and the distance between the means. I assume some exact mathematical solution exists, or that some loop could be written to find the optimal threshold, but I couldn't find one with a brief search, so I will revisit this later.
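Once the two bell curves have been fitted, the intersection itself can be found in closed form: taking the logarithm of both curves and setting them equal gives a quadratic in x. The sketch below is in Python, assumes each curve is an amplitude-scaled Gaussian, and uses made-up function names and parameter values purely for illustration. It is not fitted to my data.

```python
import math

def gaussian(x, amplitude, mean, sigma):
    """Amplitude-scaled bell curve."""
    return amplitude * math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def crossing_between_means(amp_f, mean_f, sigma_f, amp_g, mean_g, sigma_g):
    """Return the x between the two means where f(x) == g(x).

    Setting ln f(x) = ln g(x) gives a*x^2 + b*x + c = 0.
    """
    a = 1.0 / (2.0 * sigma_g ** 2) - 1.0 / (2.0 * sigma_f ** 2)
    b = mean_f / sigma_f ** 2 - mean_g / sigma_g ** 2
    c = (mean_g ** 2 / (2.0 * sigma_g ** 2)
         - mean_f ** 2 / (2.0 * sigma_f ** 2)
         + math.log(amp_f / amp_g))

    if sigma_f == sigma_g:
        # Equal widths: the quadratic term vanishes and the equation is linear.
        if b == 0.0:
            raise ValueError("curves share the same mean and width")
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            raise ValueError("curves do not cross")
        root = math.sqrt(disc)
        roots = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]

    lo, hi = sorted((mean_f, mean_g))
    inside = [x for x in roots if lo < x < hi]
    if not inside:
        raise ValueError("no crossing between the two means")
    return inside[0]

if __name__ == "__main__":
    # Hypothetical fit: tall narrow thermal-noise curve, small wide bad-pixel curve.
    t = crossing_between_means(100.0, 50.0, 20.0, 5.0, 500.0, 60.0)
    print("threshold candidate:", t)
```

If the goal is taken to be minimizing the total count of misclassified pixels (thermal-noise pixels kept plus bad pixels removed, weighted equally), setting the derivative of that count with respect to the threshold to zero gives f(t) = g(t), so the crossing is the natural candidate under that particular criterion. A different weighting of the two kinds of error would shift the threshold.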
The curves above were generated with the following algorithm:
- The images were imported
- The outer 10% was cropped out of the image
- The images were converted to doubles
- The red, green, and blue layers were added in quadrature. This is the most important part.
- The mean of each individual pixel intensity was taken across all the images
- The intensities were scaled to 1000 and rounded up
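The steps above can be sketched roughly as follows. This is a plain-Python illustration, not the code I actually ran: images are assumed to be already loaded as nested lists of (r, g, b) tuples, "outer 10%" is interpreted as trimming 10% of the height and width from each side, and "scaled to 1000" is interpreted as mapping the largest averaged value to 1000. All function names are hypothetical.

```python
import math

def crop_border(image, border_fraction=0.10):
    """Trim `border_fraction` of the rows/columns from every side (assumed meaning of 'outer 10%')."""
    rows, cols = len(image), len(image[0])
    dr, dc = int(rows * border_fraction), int(cols * border_fraction)
    return [row[dc:cols - dc] for row in image[dr:rows - dr]]

def quadrature(image):
    """Convert to floats and combine the colour layers as sqrt(r^2 + g^2 + b^2)."""
    return [
        [math.sqrt(float(r) ** 2 + float(g) ** 2 + float(b) ** 2) for (r, g, b) in row]
        for row in image
    ]

def mean_image(layers):
    """Per-pixel mean across a list of equally sized single-layer images."""
    n = len(layers)
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(layer[i][j] for layer in layers) / n for j in range(cols)]
        for i in range(rows)
    ]

def scale_and_round_up(image, top=1000):
    """Scale so the largest value maps to `top`, then round every value up."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [[0 for _ in row] for row in image]
    return [[math.ceil(v / peak * top) for v in row] for row in image]

def process(images, border_fraction=0.10, top=1000):
    """Crop, combine in quadrature, average across images, then scale."""
    layers = [quadrature(crop_border(img, border_fraction)) for img in images]
    return scale_and_round_up(mean_image(layers), top)
```

The histogram is then just a count of how many pixels fall on each integer from 0 to 1000 in the result of `process`.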
If this isn't clear enough, or if my logic has failed me in any way, comment on this answer or send me an email and I will be happy to help out or revise my solution. For the time being, I will mark this as the answer to the question.