Accendo Reliability

Your Reliability Engineering Professional Development Site

by Fred Schenkelberg

The “What” & “Why” of Weibull Analysis

How to conduct a Weibull analysis and the questions the analysis will generate.

Part 2 of 7

Every failure is part of a puzzle. The equipment we are maintaining is trying to communicate with the use of each and every failure.

Often the message is not obvious.

There is the immediate failure. If we’re paying attention, we can sort out the root cause of the failure while replacing or repairing the damaged parts. Sometimes, though, the damage is caused by another issue in the system.

Something was hidden.

Keep in mind that for any complex system there are thousands of possible ways it can manifest a failure. From alignment errors to lubrication mistakes to material degradation or wear, there are clues and indications in every failure.

The time element of data

As you track and record your team’s corrective maintenance activities, you are also gathering essential information to learn about your equipment. Beyond the components involved in the repair, what else did you learn:

  • The time of the failure (reported)
  • The failure mode or symptoms
  • The root cause, possibly
  • The corrective action taken to restore the system

I would say the most important element learned is the time of the failure. The time of day isn’t all that important, yet the time since installation or since the last failure involving that part is vital.

As you know, there are different ways that, say, a motor can fail. It could be defective or damaged when installed, in which case it will likely fail shortly after installation. It could be accumulating damage and slowly losing its ability to operate under peak loads. Or the bearings may be wearing out.

In each case, the remedy that really fixes the issue is different. We need to look at the repair data, including the time to failure information.

The data analysis

Given any set of data, the first step in the analysis is nearly always a plot. Time to failure data is no different.

Let’s say we have 10 identical motors (same size and from the same vendor) installed across the production line. We know when each motor was placed into service.

Five of the motors have failed and have been replaced. Five have not. A histogram, boxplot or timeline may provide some information. My choice for this type of data is a Weibull cumulative distribution function (CDF) plot.

The Weibull CDF plot is often loosely described as using a log-log set of scales. The horizontal axis is time (cycles, operating time, calendar time, etc.) on a logarithmic scale. The vertical axis is the probability of failure, from near zero to near 1, spaced according to log(-log(1 - p)); we often use 0.01 to 0.99, indicating a 1% to 99% chance of failure.

To plot the data we need to know how long each motor has been in operation: from installation to failure, or until the last time we knew the motor was still working (we call this right censoring, as we do not know when it will fail in the future, just that it is still running at this point).

Therefore, if the first of our five failed motors failed 100 days after installation, we plot the first point above the 100-day mark on the x-axis.
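As a minimal sketch of that bookkeeping, assuming each motor's record holds an installation date and, if it has failed, a failure date, the time in service and a censoring flag could be computed as follows. The function name, the dates, and the as-of date are hypothetical, chosen only so that the failed motor matches the 100-day example.

```python
from datetime import date
from typing import Optional, Tuple


def time_in_service(installed: date, failed: Optional[date], as_of: date) -> Tuple[int, bool]:
    """Return (days in service, is_failure).

    If the motor has failed, the time runs from installation to failure.
    If it is still running, the time runs to the as-of date and the
    observation is right censored (is_failure is False).
    """
    if failed is not None:
        return (failed - installed).days, True
    return (as_of - installed).days, False


# Hypothetical records: one failed motor and one motor still running.
as_of = date(2016, 12, 31)
failed_motor = time_in_service(date(2016, 1, 1), date(2016, 4, 10), as_of)
running_motor = time_in_service(date(2016, 1, 1), None, as_of)

print(failed_motor)   # failure observed at 100 days
print(running_motor)  # still running, so right censored at the as-of date
```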

One way to do this is to estimate the CDF (the cumulative percent failed). Intuitively, we could use 100 ( i / n ) for the i-th failure out of n units under test. Thus, with 10 motors, the first failure (i = 1) would be plotted at the 10% point on the vertical axis. This method generally overestimates the CDF; that is, it is biased.

The approximate median rank estimate is generally accepted as addressing the bias adequately and is relatively simple to use. For the time t_i of the i-th failure, calculate the CDF or percentile using 100 ( i – 0.3 ) / ( n + 0.4 ).

If all 10 of 10 units have failed (complete data), the first point would be plotted at 6.73% and the time of the first failure, and the 10th point at 93.27% and the time of the last failure.

If the 5 failures came from a group of 10 units, as in our motor example, then 5 of the motors are right censored. Provided the 5 surviving motors have each run longer than the last failure, the median rank formula still uses n = 10, so the first point would be at 6.73% and the time of the first failure, in this case 100 days. (Censoring times mixed in among the failure times call for adjusted ranks.)

The 5th point would be at 45.19% and the time of the fifth failure.
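The plotting positions are easy to check with a few lines of code. The sketch below compares the intuitive i / n estimate with the approximate median rank (commonly known as Benard's approximation) for the five failures among 10 motors. The function names are my own, and using the plain failure order i with n = 10 assumes every censored motor has outlasted the fifth failure.

```python
def naive_percent(i: int, n: int) -> float:
    """Intuitive cumulative percent failed: 100 * i / n."""
    return 100.0 * i / n


def median_rank_percent(i: int, n: int) -> float:
    """Approximate median rank as a percentage: 100 * (i - 0.3) / (n + 0.4)."""
    if not 1 <= i <= n:
        raise ValueError("failure order i must satisfy 1 <= i <= n")
    return 100.0 * (i - 0.3) / (n + 0.4)


n_units = 10     # motors installed
n_failures = 5   # motors that have failed so far

for i in range(1, n_failures + 1):
    print(f"failure {i}: naive {naive_percent(i, n_units):5.1f}%  "
          f"median rank {median_rank_percent(i, n_units):5.2f}%")
```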

The Weibull Analysis basics

After plotting the points, properly accounting for the motors that have not yet failed (right censored), we have a few dots on a graph. What does this tell us?

Without getting into fitting methods such as least squares regression or maximum likelihood estimation, we can simply take out a ruler and draw a best-fit line through the data: a straight line that generally describes the location of the points.

If the points follow a straight line (with no convex or concave pattern), then the data may be described by the Weibull distribution. Basically, if the data form a straight line on the Weibull plot, the Weibull distribution may adequately describe the time to failure data.
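For readers who prefer a calculation to a ruler, here is a rough sketch of the same idea using ordinary least squares. It relies on the standard two-parameter Weibull CDF, F(t) = 1 - exp(-(t/η)^β), which rearranges to ln(-ln(1 - F)) = β ln(t) - β ln(η): a straight line with slope β when plotted against ln(t). The scale parameter η (eta) is not discussed in the article, and every failure time below except the first (100 days) is a made-up value for illustration. Regressing the vertical values on ln(t) is one common choice, not the only one.

```python
import math


def fit_weibull_line(times, cdf_fractions):
    """Least squares line through (ln t, ln(-ln(1 - F))).

    Returns (beta, eta): the slope of the line and the scale parameter
    recovered from the intercept, assuming a two-parameter Weibull.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - f)) for f in cdf_fractions]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope of the fitted line
    intercept = y_mean - beta * x_mean    # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta


# Hypothetical failure times in days; only the first (100) is from the example.
failure_days = [100, 180, 250, 320, 400]
n_units = 10
# Approximate median ranks as fractions, assuming all censored motors
# have run longer than the last failure.
fractions = [(i - 0.3) / (n_units + 0.4) for i in range(1, len(failure_days) + 1)]

beta, eta = fit_weibull_line(failure_days, fractions)
print(f"slope (beta) = {beta:.2f}, scale (eta) = {eta:.0f} days")
```

With only five points the fitted values are crude, so treat them as a prompt for investigation, not a precise estimate.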

This has a few advantages since the slope of the line provides information concerning the nature of the failures. It may help us in an investigation concerning the motor failures. Plus, the plot provides a crude indication of the chance of failure over time for the remaining motors.

The slope of the fitted line estimates beta (β), called the shape parameter, which describes the shape of the distribution (think histogram). It ranges from describing data with a decreasing failure rate over time, β < 1, to data with an increasing failure rate, β > 1. When β = 1 the Weibull distribution exactly equals an exponential distribution and describes a constant failure rate (which is very rare).
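To see the three cases in numbers, the sketch below evaluates the standard two-parameter Weibull failure rate (hazard) function, h(t) = (β/η)(t/η)^(β - 1), at two ages. The scale parameter η = 500 days and the ages are arbitrary values picked for illustration, not results from the motor data.

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Instantaneous failure rate of a two-parameter Weibull at time t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


eta = 500.0  # hypothetical scale parameter, in days

for beta in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, beta, eta)
    later = weibull_hazard(200.0, beta, eta)
    if later < early:
        trend = "decreasing"
    elif later > early:
        trend = "increasing"
    else:
        trend = "constant"
    print(f"beta = {beta}: h(100) = {early:.5f}, h(200) = {later:.5f} ({trend})")
```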

If the slope is less than one, the likely causes are faulty motors out of the box, shipping or installation damage, improper installation or similar. A detailed failure analysis may reveal the issue only impacts a small percentage of all motors or impacts all motors, yet the longer the motors run, the less chance that this particular failure mechanism will occur.

I should mention that if the data does describe a straight line, it generally means there is a single failure mechanism involved. If the line is not straight, it is likely there are two or more underlying causes of the failures.

With only five failures, you should rely on root cause analysis as the plot itself may be misleading.

Now, if the slope of the fitted line is greater than one, it indicates an increased chance of failure over time. This is wear out.

The causes are as you would suspect: wear, corrosion, drift, accumulated damage, etc.

What action do we take?

We gather time to failure data, account for censored data, plot data and fit a line.

The analysis considers the slope of the line (if straight) and gleans a few clues about the source of the failures.

That’s not all we can learn from the analysis. We can also make a decision about the remaining motors still operating in the plant. Based on the slope of the fitted line, we can decide to leave the remaining motors on the line in operation or plan to conduct some preventative maintenance to replace the motors likely to fail soon.

If the slope is less than one, the remaining motors have a lower chance of failure than they did earlier in life. Leave them in operation, as replacing them will only increase the chance of failure. Of course, over time other failure mechanisms will appear, so continue to monitor the time to failure data.

If the slope is greater than one, the remaining motors are likely wearing out. This means these motors have an increased chance of failure with time.

Thus, depending on the steepness of the slope, you can project the probability of failure for the remaining units.
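One way to put a number on that projection, sketched here under the assumption that a two-parameter Weibull with shape β and scale η describes the motors, is the conditional probability that a motor which has survived to a given age fails within the next interval: 1 - R(age + interval) / R(age), where R(t) = exp(-(t/η)^β). The parameter values, the ages, and the 90-day horizon are hypothetical.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """Probability a unit survives beyond time t (two-parameter Weibull)."""
    return math.exp(-((t / eta) ** beta))


def conditional_failure_probability(age: float, horizon: float,
                                    beta: float, eta: float) -> float:
    """Chance a unit that has survived to `age` fails within the next `horizon`."""
    survive_now = weibull_reliability(age, beta, eta)
    survive_later = weibull_reliability(age + horizon, beta, eta)
    return 1.0 - survive_later / survive_now


eta = 500.0     # hypothetical scale parameter, in days
horizon = 90.0  # look ahead one quarter

for beta in (0.5, 2.0):
    for age in (100.0, 400.0):
        p = conditional_failure_probability(age, horizon, beta, eta)
        print(f"beta = {beta}, age = {age:.0f} days: "
              f"{100 * p:.1f}% chance of failure in the next {horizon:.0f} days")
```

With β below one the projected chance of failure falls as the motors age, and with β above one it rises, which is the basis for the leave-alone or replace decision.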

If the cost of unplanned downtime is high and replacing a failed motor is expensive, plan on replacing the motors before they fail (or before the probability of failure becomes unacceptable).

Summary

For data on non-repairable items, a Weibull analysis is a great way to visualize and understand the time to failure data you likely already have available.

There are various software packages and tools available to make the line-fitting process more accurate, yet the basics of simply plotting the data can be done on probability paper or via your favorite spreadsheet.

Plot the data – a great first step with any data analysis. Learn to read these plots and take appropriate action to improve your program.


 

Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article, consider subscribing to the ongoing series at Accendo Reliability.


The other articles in the series include:

Post 1 – Using the Maintenance Data You Already Have

Post 2 – The What & More Importantly, The Why of the Weibull Analysis

Post 3 – Quantify the Improvements (or Gaps) In Your Reliability

Post 4 – First Step in Analyzing Repairable Systems Data

Post 5 – The Next Step in Your Failure Data

Post 6 – The Next Step in Your Data Analysis

Post 7 – Data Q&A with Fred & James

 

References:

Fred Schenkelberg- accendoreliability.com/about/fred-schenkelberg/

FMS Reliability www.fmsreliability.com

Accendo Reliability accendoreliability.com/musings/

New Weibull Handbook http://geni.us/Weibull


About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


Comments

  1. Ciprian Coroian-Szellyes says

    July 29, 2016 at 12:15 AM

    I believe “Weibull” could be the second most abused and misused word (after MTBF) when it comes to Reliability. I see it sometimes in Job ads and if I take it in the given context with regard to the role, it hurts. Weibull analysis, if well interpreted and implemented, is a very powerful tool. This article shows the way in a compact and comprehensive mode.

    • Fred Schenkelberg says

      July 29, 2016 at 7:35 AM

      Hi Ciprian,

      Using a screwdriver as a hammer comes to mind. Yes, there are plenty of misused words and concepts in reliability engineering. I agree that using Weibull to imply data analysis, in general, is not very accurate.

      I’d vote for HALT as the second most abused after MTBF… (may have to run a poll). 😉

      Cheers,

      Fred

  2. Bill Meeker says

    October 30, 2016 at 9:57 AM

    Fred:

    Nice article. Two things I would like to note.

    You imply that a Weibull analysis is typically done on log-log paper (in the old days we would go down to the stock room to get copies of Weibull and lognormal paper—today of course it is all in our software). Actually, “Weibull paper” uses a log transformation for time and log(-log(1-p)) for the failure probabilities.

    I have seen certain organizations/individuals who think that the Weibull distribution is the only useful distribution for describing failure-time distributions. It is important to recognize that other distributions are often more appropriate and that there are sometimes mechanistic (not to mention data-driven) reasons for preferring one of the other distributions (e.g., lognormal).

    Good software that provides tools for reliability data analysis will make it easy to compare alternatives and have many advantages over using a spreadsheet or paper.

    Bill

    • Fred Schenkelberg says

      October 31, 2016 at 11:07 AM

      Thanks for the comments Bill, much appreciated.

      I agree that Weibull is not the only tool, yet it is often where I start when exploring the data. True, using the right distribution is key to gleaning meaningful information from the data.

      Cheers,

      Fred

      • Larry George says

        April 22, 2020 at 1:33 PM

        “I would say the most important element learned is the time of the failure. …yet the time since installation or since the last failure involving that part is vital.” [Fred]
        No it isn’t. Ships and returns counts are statistically sufficient to make nonparametric (and Weibull) estimates of field reliability and failure rate functions, repairable or dead-forever. Don’t believe it? Send data to pstlarry@yahoo.com.
