Accendo Reliability

Your Reliability Engineering Professional Development Site


by Dennis Craggs

Lognormal Probability Plots

Introduction

In general, a statistical analysis of univariate data starts with a histogram. If the histogram doesn’t show a bell shape, the data probably does not follow a normal distribution. If the histogram of the logarithm of the data does show a bell shape, then the data can be treated as lognormally distributed. Statistical projections and parameter estimates are then based on the normal distribution of the log of the data. This article focuses on the lognormal distribution and the lognormal probability plot.

Applications:

A normal probability distribution is defined over the range $-(-\infty,\infty)-$ for any value of $-\mu-$ and $-\sigma-$. This means that there is a nonzero probability of data in the range $-(-\infty,0]-$. However, sometimes physical constraints allow only positive data. Then a non-normal distribution is required for reasonable probability predictions.

Some examples of lognormal data are:

  • Customer usage data like vehicle trips/day, refrigerator open/close cycles, …
  • Time to repair of a maintainable system.
  • Particle size distributions in chemistry
  • Rainfall distributions
  • Biological measurements, e.g., blood pressures of males or females, size of appendages like teeth, hair, claws, …
  • Human behavior, e.g., dwell time on articles/jokes, length of chess games, …

Fundamentals

In the histogram, figure 1, the x-values are right skewed. For this data, $-\bar{x}=2.99-$ and $-s=1.51-$. The average is about 2 standard deviations larger than 0, so a normal model would predict $-Pr(x\le 0)\approx 0.024-$. The data does not appear to be normally distributed because there are no values $-\leq 0-$ and the histogram is right skewed. The lognormal should be checked.
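The tail probability that a normal model would assign to non-positive values can be checked directly. The sketch below uses only the sample statistics quoted above and Python's standard library; treating the sample mean and standard deviation as the normal parameters is an assumption made for illustration.

```python
from statistics import NormalDist

# Sample statistics quoted in the text for the untransformed x data.
x_bar = 2.99
s = 1.51

# Hypothetical normal model fitted to x (the model the text argues against).
normal_model = NormalDist(mu=x_bar, sigma=s)

# Probability the normal model places on physically impossible values x <= 0.
p_nonpositive = normal_model.cdf(0.0)
print(f"Pr(x <= 0) under the normal model: {p_nonpositive:.4f}")
```

A probability of this size on impossible values, together with the visible skew, is the reason to try the log transformation.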

Figure 1

Using $-y=\ln(x)-$ to transform the data, the histogram of y is

Figure 2

The y histogram shows the symmetric (bell) shape that is characteristic of a normal distribution. With this information, one can assume the y-data is normally distributed. Statistical tests for normality can be used to verify normality at a specified confidence. 

Distributions

A quick analysis of the y data shows $-\bar{y}=0.9757-$ and $-s=0.4976-$. Assuming $-\mu=1-$ and $-\sigma=0.5-$, then the probability density plot of y is

Figure 3

The lognormal probability density for the X data with lognormal parameters $-\mu=1-$ and $-\sigma=0.5-$ is:

Figure 4

This plot approximates the shape of the x histogram.

Distributions

For y, the normal probability density f(y) is:

$$\begin{array}{ccc}f(y)=\frac{1}{\sigma\sqrt{2\pi}}e^{-(y-\mu)^2/(2\sigma^2)} & \text{for} & y \in (-\infty,\infty)\end{array}$$

(1)

And the cumulative normal distribution function F(y) is:

$$\begin{array}{ccc}F(y)=\int_{-\infty}^y f(\phi)d\phi & \text{for} & y \in (-\infty,\infty)\end{array}$$

(2)

In equations 1 and 2, negative values of y are allowed.

For x, the lognormal probability density f(x) is:

$$\begin{array}{ccc}f(x)=\frac{1}{x\sigma\sqrt{2\pi}}e^{-(\ln(x)-\mu)^2/(2\sigma^2)} & \text{for} & x \in (0,\infty)\end{array}$$

(3)

And the cumulative lognormal distribution function F(x) is:

$$\begin{array}{ccc}F(x)=\int_{0}^x f(\phi)d\phi & \text{for} & x \in (0,\infty)\end{array}$$

(4)

In equations 3 and 4, only positive values of x are allowed. In equations 2 and 4, $-\phi-$ is a dummy variable of integration.
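The lognormal density and cumulative distribution can be evaluated in a few lines. This is a minimal sketch, assuming the standard form of the density with a negative exponent; the function names `lognormal_pdf` and `lognormal_cdf` are illustrative and not from the article. The cumulative probability is computed as the normal CDF of ln(x), which is equivalent to integrating the density from 0 to x.

```python
import math
from statistics import NormalDist


def lognormal_pdf(x, mu, sigma):
    """Density of x when ln(x) is Normal(mu, sigma); zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """Pr(X <= x), evaluated as the normal CDF of ln(x)."""
    if x <= 0:
        return 0.0
    return NormalDist(mu, sigma).cdf(math.log(x))


mu, sigma = 1.0, 0.5          # parameters used in the article's plots
median_x = math.exp(mu)       # x value whose logarithm equals mu
print("density at the median:", lognormal_pdf(median_x, mu, sigma))
print("cumulative probability at the median:", lognormal_cdf(median_x, mu, sigma))
```

Because ln(x) is normal with mean $-\mu-$, half of the probability lies below $-x=e^{\mu}-$, which is a convenient sanity check on any implementation.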

Percentiles

Using equation 2, the cumulative probability F(y) is the area under the probability density to the left of y. Alternatively, for any desired cumulative probability $-\alpha-$, the inverse $-y_\alpha-$ may be determined. There are two basic Excel functions that support these calculations:

$$\alpha = F(y) = norm.dist(y,\mu,\sigma,true)$$

(5)

$$y_\alpha = norm.inv(\alpha,\mu,\sigma)$$

(6)

Equation 5 provides the cumulative probability of y under the distribution curve, i.e., to the left of y. Equation 6 is the inverse operation. For more details, see Excel help.

On probability density plots, important probabilities are highlighted with a vertical line at the specified variable values. For example, if the analyst is interested in the 5th, 50th and 95th cumulative percentiles of the y data, then corresponding values of y are 0.18, 1.0 and 1.82:

Figure 5

The corresponding percentile values for the probability density plot of x are obtained using the inverse of the natural logarithm transformation, i.e., $-x_\alpha=e^{y_\alpha}-$. These x values are 1.19, 2.72, and 6.19, so vertical red bars were added to the x probability density plot.
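The percentile calculation can be reproduced outside Excel. The sketch below assumes Python's standard-library `NormalDist`, whose `cdf` and `inv_cdf` methods play the roles of the cumulative and inverse Excel functions in equations 5 and 6; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 1.0, 0.5
y_dist = NormalDist(mu, sigma)

percentiles = []
for alpha in (0.05, 0.50, 0.95):
    y_alpha = y_dist.inv_cdf(alpha)   # inverse operation, as in equation 6
    x_alpha = math.exp(y_alpha)       # back-transform to the x scale
    check = y_dist.cdf(y_alpha)       # cumulative probability, as in equation 5
    percentiles.append((alpha, y_alpha, x_alpha))
    print(f"alpha={alpha:.2f}  y={y_alpha:.2f}  x={x_alpha:.2f}  F(y)={check:.2f}")
```

Rounded to two decimals, the y values are 0.18, 1.00 and 1.82 and the x values are 1.19, 2.72 and 6.19, matching the values quoted in the text.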

 

Figure 6

This allows statistical statements to be made about significant x values.

Distribution estimates

The underlying population parameters are unknown, but can be estimated from data. If a random variable x is lognormally distributed, then $-\mu-$ and $-\sigma-$ are directly estimated using

$$\mu\approx\bar{y}=\frac{1}{n}\sum_{i=1}^{n}\ln(x_i)$$

(7)

And

$$\sigma^2\approx s_y^2=\frac{1}{n-1}\sum_{i=1}^{n}[\ln(x_i)-\bar{y}]^2$$

(8)
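The two estimates amount to taking the sample mean and the sample standard deviation (with the n - 1 divisor) of the log-transformed data. The sketch below illustrates this on synthetic data; the sample is simulated with a fixed seed from a lognormal with $-\mu=1-$ and $-\sigma=0.5-$ and is an assumption for illustration, not the article's data set.

```python
import math
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the synthetic sample is repeatable

# Synthetic stand-in for measured data (assumption, not the article's data).
x = [random.lognormvariate(1.0, 0.5) for _ in range(2000)]

# Transform to the log scale, then apply the usual normal estimators.
y = [math.log(value) for value in x]
mu_hat = mean(y)        # sample mean of ln(x)
sigma_hat = stdev(y)    # sample standard deviation of ln(x), n - 1 divisor

print(f"mu estimate:    {mu_hat:.4f}")
print(f"sigma estimate: {sigma_hat:.4f}")
```

With a sample this large, both estimates should land close to the generating values of 1 and 0.5.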

Confidence Intervals

Any sample data contains random error, which limits our ability to determine the underlying population parameters. Intervals that contain the population parameters with a stated confidence are obtained as follows:

  • Calculate the sample mean and standard deviation, using equations 7 and 8.
  • Specify the desired confidence, C. 
  • Split the uncertainty equally to both tails of the distribution into a lower significance limit,

$$\alpha_{LL}=\frac{(1-C)}{2}$$

(9)

And into an upper significance limit,

$$\alpha_{UL}=\frac{(1+C)}{2}$$

(10)

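  • Calculate an interval containing the population mean, where $-t_{n-1,\alpha}-$ is the Student t value with $-n-1-$ degrees of freedom at cumulative probability $-\alpha-$,

$$ \bar{y}+\frac{s}{\sqrt{n}}t_{n-1,\alpha_{LL}} <\mu< \bar{y}+\frac{s}{\sqrt{n}}t_{n-1,\alpha_{UL}} $$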

(11)

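  • The interval containing the population variance is given below, where $-\chi^2_{n-1,\alpha}-$ is the chi-square value with $-n-1-$ degrees of freedom at cumulative probability $-\alpha-$. Take square roots of the limits for the standard deviation,

$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha_{UL}}} <\sigma^2< \frac{(n-1)s^2}{\chi^2_{n-1,\alpha_{LL}}}$$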

(12)
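The interval steps above can be sketched in code. This example assumes SciPy is installed and reads every t and chi-square value as a quantile at a cumulative probability, so the lower t quantile is negative and is added to the sample mean. The function name and the sample size n = 30 are hypothetical; the article does not state the sample size behind its statistics.

```python
import math
from scipy import stats  # assumption: SciPy is available


def lognormal_parameter_intervals(y_bar, s, n, confidence):
    """Two-sided intervals for mu and sigma^2 of the log-transformed data."""
    alpha_ll = (1.0 - confidence) / 2.0   # lower cumulative probability
    alpha_ul = (1.0 + confidence) / 2.0   # upper cumulative probability
    df = n - 1
    std_err = s / math.sqrt(n)

    # t.ppf returns the quantile at a cumulative probability (negative for alpha_ll).
    mean_interval = (y_bar + std_err * stats.t.ppf(alpha_ll, df),
                     y_bar + std_err * stats.t.ppf(alpha_ul, df))

    # The larger chi-square quantile gives the lower variance limit.
    variance_interval = (df * s ** 2 / stats.chi2.ppf(alpha_ul, df),
                         df * s ** 2 / stats.chi2.ppf(alpha_ll, df))
    return mean_interval, variance_interval


# Sample statistics from the text; n = 30 is a hypothetical sample size.
mean_ci, var_ci = lognormal_parameter_intervals(0.9757, 0.4976, 30, 0.90)
print("mu interval:   ", mean_ci)
print("sigma interval:", tuple(math.sqrt(v) for v in var_ci))
```

Repeating the call with different values of n shows how the interval widths shrink as the sample grows, which is the basis for using these equations to select a sample size.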

These equations can be used to select sample sizes for the analysis.

Lognormal Probability Plot

The construction of the lognormal probability plot starts with a simple x-y scatter plot, so horizontal and vertical plotting coordinates are needed for each data value. The horizontal coordinate is the natural logarithm of the data value. For the vertical coordinate, the cumulative probability of each data value is estimated by its median rank, and the plotting coordinate is the inverse of the standard normal cumulative distribution at that probability. The process is:

  • Sort the x data from lowest to highest.
  • Assign an index number to each sorted data value starting from $-i=1-$ to $-i=n-$.
  • Calculate the median rank of each point, i.e., $-MR(x_i)=(i-0.3)/(n+0.4)-$. These median ranks serve as the cumulative probabilities, $-P(x_i)=MR(x_i)-$.
  • Calculate the vertical axis plotting coordinates using the standard normal inverse, $-F(i)=norm.s.inv(P(x_i))-$
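The steps above can be sketched as a short function. This is a minimal illustration that assumes the usual median rank approximation (i - 0.3)/(n + 0.4) and uses the standard-library `NormalDist` for the standard normal inverse; the function name and the five sample values are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_plot_coordinates(data):
    """Return (ln x, median rank, z) for each data value, sorted ascending."""
    ordered = sorted(data)                 # sort lowest to highest
    n = len(ordered)
    std_normal = NormalDist()              # mean 0, standard deviation 1
    coords = []
    for i, x in enumerate(ordered, start=1):   # index i = 1..n
        median_rank = (i - 0.3) / (n + 0.4)    # cumulative probability estimate
        z = std_normal.inv_cdf(median_rank)    # vertical plotting coordinate
        coords.append((math.log(x), median_rank, z))
    return coords


sample = [4.1, 1.6, 2.7, 6.3, 2.2]  # hypothetical positive measurements
for ln_x, rank, z in lognormal_plot_coordinates(sample):
    print(f"ln(x)={ln_x:6.3f}  median rank={rank:.3f}  z={z:6.3f}")
```

Plotting z against ln(x) as a scatter plot gives the raw form of the probability plot; the middle point of an odd-sized sample has a median rank of 0.5 and therefore plots at z = 0.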

If the data is lognormally distributed, the scatter plot will tend to follow a straight line, figure 7.

Figure 7

There are some deviations from linearity at the ends and random variation about the trend line. However, the bulk of the data follows a linear trend.

Figure 7, which is basically a scatter plot, is rather difficult to interpret. Using a logarithmic scale for the horizontal axis displays the original data values. The vertical axis is the inverse of the cumulative standard normal. Readability improves when tick marks and labels are calculated from the inverse of the cumulative standard normal for $-p = 0.01, 0.05, 0.1,\ldots, 0.9, 0.95, 0.99-$, and these replace the original y-scale.

Relabeling the scales does not change the plotting coordinates of the sample data. Only the appearance of the axes is changed. The plot is now a lognormal probability plot, shown in figure 8.
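The tick positions for the probability axis can be computed the same way as the plotting coordinates. In the sketch below, the intermediate probabilities between 0.1 and 0.9 are assumed to be in steps of 0.1, since the text abbreviates that part of the list; the variable names are illustrative.

```python
from statistics import NormalDist

# Probabilities to label on the vertical axis (0.1 steps in the middle are assumed).
tick_probabilities = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50,
                      0.60, 0.70, 0.80, 0.90, 0.95, 0.99]

std_normal = NormalDist()

# Position of each tick on the standard normal (z) scale.
tick_positions = [std_normal.inv_cdf(p) for p in tick_probabilities]

# Text shown at each tick, as a percentage.
tick_labels = [f"{100 * p:g}%" for p in tick_probabilities]

for label, position in zip(tick_labels, tick_positions):
    print(f"{label:>4} at z = {position:6.3f}")
```

The ticks are unevenly spaced on the z scale, closely packed near 50% and spread out in the tails, which is what gives probability paper its characteristic look.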

Figure 8

Generally, probability plots are enhanced by adding a best fit trend line and confidence limits. Sometimes basic statistics are added, including a goodness of fit measure calculated using numerical methods. There remains the question of how to treat incomplete (censored) data sets. For example, some items in a reliability test may have been terminated early for reasons other than part failure or test completion. Each of these topics may employ different methods, so each deserves a separate article. They are not discussed here.

Conclusions:

A lognormal probability plot is a scatter plot that uses a logarithmic horizontal scale and a standard normal inverse of the cumulative probability for the vertical axis. Data that is lognormally distributed and plotted on lognormal probability paper will tend to follow a straight line. The trend allows one to project the cumulative probabilities.

Dennis Craggs, Consultant

Quality, Reliability and Analytics Services


About Dennis Craggs

I am a mechanical engineer, programmer, and statistician. My career spanned the aerospace, NASA and Teledyne CAE, and automotive, Ford and Chrysler, industries. After retirement, I started consulting as a reliability engineer and am writing articles on big data analytics. My primary goal is to assist young engineers and consult on product development issues.
