The lognormal distribution has two parameters, μ and σ. These are not the mean and standard deviation of the data (the subject of another post); they are the mean and standard deviation of the natural logarithm of the data. Together they fully describe the distribution, including the reliability function.
$$ \large\displaystyle R(t)=1-\Phi \left( \frac{\ln (t)-\mu }{\sigma } \right)$$
where Φ is the standard normal cumulative distribution function and t is time.
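As a rough illustration, the reliability function can be sketched in Python with only the standard library. The function name `lognormal_reliability` and the sample parameter values are assumptions made for this sketch, not part of the original article.

```python
import math
from statistics import NormalDist


def lognormal_reliability(t, mu, sigma):
    """Return R(t) = 1 - Phi((ln(t) - mu) / sigma) for t > 0 and sigma > 0."""
    if t <= 0 or sigma <= 0:
        raise ValueError("t and sigma must be positive")
    z = (math.log(t) - mu) / sigma
    # NormalDist() with no arguments is the standard normal (mean 0, sd 1).
    return 1.0 - NormalDist().cdf(z)


# Sanity check with made-up parameters: at t = exp(mu) the argument of Phi
# is zero, so half the population is expected to survive.
median_reliability = lognormal_reliability(math.exp(1.0), mu=1.0, sigma=0.5)
print(median_reliability)
```

The check at t = exp(μ) works for any σ, because exp(μ) is the median of a lognormal distribution.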
A nice feature of the lognormal distribution is that estimating its parameters is much like estimating the mean and standard deviation of the data, using the same functions on a calculator or spreadsheet. There is one difference: first, calculate the natural logarithm of each data value.
Let’s say we have the times to failure for four heater elements. We know from previous work that the time to failure distribution is lognormal. We want to estimate the lognormal parameters and then the reliability of this type of heater element at 365 days.
Time to Fail | ln(Time to Fail) |
385 | 5.9532 |
427 | 6.0568 |
490 | 6.1944 |
705 | 6.5582 |
Calculate μ
The table lists the time to failure data and the natural log of each reading. To estimate μ, calculate the mean (average) of the four ln(time to failure) values.
$$ \large\displaystyle \mu =\frac{5.9532+6.0568+6.1944+6.5582}{4}=6.1907$$
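The same estimate can be reproduced in a few lines of Python. This is a minimal sketch using the four readings from the table; the variable names (`times_to_fail`, `log_times`, `mu_hat`) are chosen here for illustration.

```python
import math
from statistics import mean

# Times to failure for the four heater elements, from the table.
times_to_fail = [385, 427, 490, 705]

# Step 1: take the natural log of each reading.
log_times = [math.log(t) for t in times_to_fail]

# Step 2: mu is the ordinary average of the logged values.
mu_hat = mean(log_times)
print(round(mu_hat, 4))
```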
Calculate σ
The calculation of σ requires a little more math. The formula for the sample standard deviation, s, which serves as our estimate of σ, uses two sums: the sum of the squared values and the square of the sum of the values. Because we are fitting a lognormal distribution, the values are the natural logs of the times to failure, ln(t_i).
$$ \large\displaystyle s=\sqrt{\frac{n\sum\limits_{i=1}^{n}{{{\left( \ln {{t}_{i}} \right)}^{2}}}-{{\left( \sum\limits_{i=1}^{n}{\ln {{t}_{i}}} \right)}^{2}}}{n(n-1)}}$$
The first summation term needs the sum of the squared ln(time to failure) values, and the second needs their plain sum. Expanding the table gives both results.
Time to Fail | ln(Time to Fail) | ln(Time to Fail) Squared |
385 | 5.9532 | 35.4411 |
427 | 6.0568 | 36.6846 |
490 | 6.1944 | 38.3707 |
705 | 6.5582 | 43.0100 |
Sum | 24.7626 | 153.5064 |
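In this example n equals four, as we have four readings. Inserting the sums and n into the formula gives the estimate of σ, the second parameter of the lognormal distribution. The numerator is a small difference between two large numbers, so it is sensitive to rounding: the result below carries the logarithms at full precision, while working strictly from the four-decimal sums shown gives about 0.2645.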
$$ \large\displaystyle s=\sqrt{\frac{4(153.5064)-{{24.7626}^{2}}}{4(4-1)}}=0.2642$$
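A short Python sketch can confirm the arithmetic. It evaluates the sum-based formula directly and compares it with the standard library's `statistics.stdev`, which uses the same n − 1 divisor. The variable names are assumptions made for this illustration.

```python
import math
from statistics import stdev

times_to_fail = [385, 427, 490, 705]
log_times = [math.log(t) for t in times_to_fail]

n = len(log_times)
sum_x = sum(log_times)                   # sum of ln(t_i)
sum_x2 = sum(x * x for x in log_times)   # sum of ln(t_i) squared

# Sum-based form of the sample standard deviation (n - 1 divisor).
s_formula = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

# Library equivalent, also the sample (n - 1) standard deviation.
s_builtin = stdev(log_times)

print(round(s_formula, 4), round(s_builtin, 4))
```

In a spreadsheet, the equivalent is the sample standard deviation function applied to the column of logged values.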
Determine reliability at one year
Now that we have the two parameters of the lognormal distribution that describes the life of the heater elements, based on the four readings, we can estimate the probability of successfully operating for one year. Using the reliability function of the lognormal distribution, insert 365 for t, 6.1907 for μ, and 0.2642 for σ to find the reliability at one year.
$$ \large\displaystyle R(t)=1-\Phi \left( \frac{\ln (365)-6.1907}{0.2642} \right)=1-\Phi \left( -1.1007 \right)$$
The standard normal cumulative distribution function gives the probability of failure by time t, given the lognormal parameters. Use the Excel function =NORMSDIST(-1.1007) or, for the CRE exam, a standard normal cumulative distribution table. Φ(-1.1007) = 0.1355.
Completing the calculation of the reliability function, we have
$$ \large\displaystyle R(365)=1-0.1355=0.8645$$
Thus, given the data, we can expect approximately 86.45% of heater elements to survive for 365 days.
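For readers without Excel or a table at hand, the one-year calculation can be sketched in Python with the standard library's `NormalDist`. The parameter values are the rounded estimates from the example; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu, sigma = 6.1907, 0.2642   # rounded estimates from the worked example
t = 365                      # time of interest, in days

# Standardize ln(t), then look up the standard normal CDF.
z = (math.log(t) - mu) / sigma
prob_fail = NormalDist().cdf(z)   # probability of failure by time t
reliability = 1.0 - prob_fail     # probability of surviving past time t

print(f"z = {z:.4f}, F(t) = {prob_fail:.4f}, R(t) = {reliability:.4f}")
```

The printed values should agree with the hand calculation to about four decimal places; small differences in the last digit can come from rounding μ and σ.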
Michael says
Thanks so much for providing a thorough model starting with the data and walking through the practical steps.
Fred Schenkelberg says
Hi Michael,
Thanks for the kind words. It helps to know someone appreciates the work, and hopefully it is useful for you too. Is there anything else for which you would like worked-out examples?
cheers,
Fred
M K Loganathan says
Dearest Mr. Fred, This is most precious item at no cost:-). Really worth of it. Your work is really exceptional.
Amit Chand says
The data analysis is an eye opener.
We have two wire rod mills & each mill has 10 Stands. I want to know the reliability of the stands. We have observed that Mill 1 is more reliable than Mill2 based on No of Failures of stands . If we want to do the log normal analysis for knowing reliability of stands should we take one year or six month data?
Fred Schenkelberg says
Hi Amit,
Use as much data as you have available and is relevant to the current production process.
You don’t have to use lognormal, I would first check on which distribution fits well or use a non-parametric method, if appropriate.
Cheers,
Fred
Mai says
Hi
I am performing a probabilistic sensitivity (Monte Carlo simulation) analysis on different variables using Treeage software and I was wondering if you could help me with this issue, I have a variable with a value of -0.091 and the range is (-0.063–0.0119) and the distribution of that variable is lognormal, how to calculate or estimate the mean and the SD (sigma) value for that parameter?
Thanks
Fred Schenkelberg says
As you know, logarithms do not like negative numbers. If the numbers you are gathering are negative, you may consider transforming them by a fixed offset for example by adding say 0.07 to each reading, so all the values are positive. Just remember to remove the offset at the end when needing the actual values again.
Cheers,
Fred
jack says
Finally! An understandable presentation of what some would consider a difficult subject. Thanks for the straightforward explanation. I wish all my ME professors used your methods.
Fred Schenkelberg says
Thanks John, we do try to make this stuff understandable – tends to make these tools useful that way. cheers, Fred
Kamran says
Thanks, simple and elegant explanation that I can understand.
Carlos says
Dear Fred,
This application is truly amazing! Congrats. Although the use of the Cumulative Distribution Function is very clear to me now, I still struggle with the practical meaning of the Probability Density Function. For the same 365 days we would have P(365) = 0.00226. What does 0.00226 mean in your example? That we have a chance of 0.23% of a failure on the 365th day? Also, the maximum of the PDF is (roughly) between 400 and 500. So, is that when I should expect the most frequent maintenance calls?
Thanks a lot.
Carlos.
Fred Schenkelberg says
Hi Carlos, in this example, I didn’t talk about the PDF, yet if you do calculate the PDF value for a time, say 365, you get the failure density at that time, which for a short interval such as one day approximates the chance of failure on that day. Think of the PDF as a histogram, where the total area under the PDF curve must sum to 1. Check out Chris Jackson’s webinar on PDFs, CDFs etc… https://fred-schenkelberg-project.prev01.rmkr.net/accendo-webinars/accendo-reliability-webinar-series/pdfs-cdfs-and-other-fs/ for a much better description of what a PDF is and what it means. cheers, Fred
Andrew says
Thank you for putting time into giving such a great explanation.
After reading, I was curious to know what you would do if you were calculating lognormal distribution parameters with occurrences involving zero days, since x > 0?
Best
Fred Schenkelberg says
Zero does cause problems with such calculations, so you could set those events to 1 day, hour, or similar. The problem is they probably failed or were non functional, yet not turned or noticed as failed till that point we are calling point zero. Keeping the info in the dataset is a preference, yet I would evaluate with and without the zero failure events to determine if it materially changes the results – if not – leave them out, if it does cause a difference, then it’s time to sort out root causes and model based on individual mechanisms, if possible. cheers, Fred
Rolin says
thanks a lot for your explanations, if we had a problem like this,
a person arrives at a certain point morning. The waiting time, in minutes, to arrive is log normally distributed with 5 min mean and 1 min standard deviation. what is the probability of arriving within 3 min?
Fred Schenkelberg says
Hi Rolin,
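The probability of arriving within 3 min is the cumulative distribution function, 1 − R(t), of the lognormal distribution, so set t to the duration of interest, here 3 min. Note that the 5 and 1 are the mean and standard deviation of the waiting time itself, not μ and σ, so first convert them to the lognormal parameters, then solve.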
cheers,
Fred
Andrew Ghattas says
If you have a point-coordinate on the CDF and the scale parameter (or ln standard deviation), how do you calculate the location parameter (or ln mean)?