Accendo Reliability

Your Reliability Engineering Professional Development Site


by Larry George

MTBF Correlation vs. Causation: MIL-HDBK-217G


People claim poor correlation of predicted and observed MTBFs. That is understandable because handbook failure rates and fudge factors for quality and environment were derived from unknown populations or samples. People also claim there is no basis for applying statistics or probability to MTBF predictions. MTBF predictions use failure rate averages that lack statistical causation. Why not incorporate Paretos in MTBF predictions?

Paretos are fractions of equipment failures caused by each type of part or subsystem. They represent what really happens. Incorporating Paretos requires statistics to adjust MTBF predictions. That causes Paretos in MTBF predictions to match field Paretos. A 1992 ASQ Reliability Review article “MIL-HDBK-217G” proposed using observed Paretos to adjust handbook MTBF predictions with a “Reality” factor.

Correlation of MTBF Prediction and Observed MTBF?

The RAMS article by Jais, Werner, and Das says, “The ratio of [MTBF] predictions to demonstrated [MTBF] values ranges from 1.2:1 to 218:1. This shows that original contractor [MTBF] predictions for DoD systems greatly exceed the demonstrated results. In addition, statistical analysis of the data using Spearman’s Rank Order Correlation Coefficient show that MIL-HDBK-217 based predictions cannot support comparisons between systems.” 

Kirk Gray’s April 2024 “Accendo Weekly Update” article claimed about electronics: “…there is little if any empirical field data from the vast majority of verified failures that shows any correlation with calculated predictions of failure rates.” “…actual field failure data, and the root causes of those failures can never be shared… Reliability data is some of the most confidential and sensitive data a manufacturer has.” 

Ancient History?

In December of 2000 Kirk Gray and Wayne Tustin wrote, “It is no longer possible or reasonable to even attempt statistical estimates of reliability [MTBF] based on a summation of components’ reliability [failure rates], even if accurate data on current components was available.” [“Don’t Let the Cost of HALT Stop You,” by Kirk Gray and Wayne Tustin, 2001.]

I wrote back, “Wayne [Tustin] kindly sends me his newsletters. I had the impertinence to dispute a few statements in his article with Kirk Gray.” Wayne invited me to expound at greater length. His tolerance is commendable. Kirk and Wayne advocate HALT and HASS. Their December 2000 criticism was representative. 

I replied that reliability statistics convert available data required by GAAP, even without lifetime data, into actionable information; information that helps decide whether to do anything and what to do, to what, when, and how much. 

“It is with great sadness that we announce the passing of Wayne Tustin, founder and former president of ERI. Wayne passed away on May 10, 2018 at the age of 95. Wayne, a Fellow of the IEST, was involved in teaching of Vibration and Shock testing for 70 years.” [equipment-reliability.com/]

Kirk replied, “Please tell him [Larry George] to go read the article on the web from University of Maryland’s CALCE titled ‘Why the traditional reliability prediction models do not work – is there an alternative?’ by Michael Pecht.” Pecht proposed Physics of Failures (PoF) [“Long-Term Overstressing of Computers,” Nov. 2011 by Kirk Gray and Mike Pecht].

I replied to Kirk and Wayne… I agree with their criticism of MTBF predictions, and PoF is fine for failure modes where the physics is well known. I’ll never forget a Nevil Shute novel [No Highway, 1948] about planes with wings that would fall off after exactly 11,296 hours. However, PoF is deterministic and won’t give you reliability, because reliability is probability of failure as a function of age. [See US DoT FAA AC No: 23.1309-1E reference for life limits based on fatigue life.] Milton Ohring’s book about PoF is comprehensive but largely deterministic. Handbooks that use PoF provide fudge factors based on unknown populations or samples; the same problem as with using handbook failure rates.

MTBF Prediction with Causation?

Why not incorporate causation into MTBF prediction? Generations of products have similar components, undergo similar manufacturing processes, and have similar customers and similar environments. Use observed Paretos [Schenkelberg, July 2024] to adjust handbook failure rates and MTBF predictions. MIL-HDBK-217G [George, June 1992] incorporates causation by making MTBF predictions have the same Paretos as observed Paretos. Adjust MTBF predictions with a reality factor πR for each type of component in the MTBF prediction.

The MIL-HDBK-217F MTBF prediction is

$$ \displaystyle \mathrm{MTBF}=1\Big/\sum_{i=1}^{n}N_{i}\left(\lambda_{G}\pi_{Q}\right)_{i} $$

where Ni is the number of components of type i, λG is the generic failure rate for component type i, πQ is its quality factor, and n is the number of generic part categories.

The MIL-HDBK-217G MTBF prediction is

$$ \displaystyle \mathrm{MTBF}=1\Big/\sum_{i=1}^{n}N_{i}\left(\lambda_{G}\pi_{Q}\pi_{R}\right)_{i} $$

where πR is a part reality factor based on Paretos! The reality factor adjusts the equipment reliability predictions by adjusting some parts’ failure fractions in the direction of field failure Paretos.

The MIL-HDBK-217G MTBF prediction has a causal link to the real MTBF, because it uses the observed Paretos of the parts that the new product has in common with past products. That causation helps bring the MIL-HDBK-217G MTBF prediction closer to the real MTBF, because successive generations of products have similar parts’ Paretos.
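The two parts-count formulas can be sketched in a few lines of Python. This is a minimal illustration, not the handbook's procedure: the function name `parts_count_mtbf` is hypothetical, and the numbers are the spreadsheet values from Tables 2 and 3 of this article, with λG*πQ entered as a single rate per part type.

```python
def parts_count_mtbf(counts, rates, reality=None):
    """Return 1 / sum(N_i * (lambda_G * pi_Q)_i * pi_R_i).

    With reality=None every pi_R is 1.0, which gives the 217F form.
    """
    if reality is None:
        reality = [1.0] * len(rates)
    total_rate = sum(n * r * f for n, r, f in zip(counts, rates, reality))
    return 1.0 / total_rate


# Spreadsheet example values from this article (Tables 2 and 3).
counts = [1] * 10
rates = [1, 0.2, 0.1, 0.2, 0.15, 0.25, 0.3, 0.25, 0.15, 0.25]  # (lambda_G * pi_Q)_i
reality = [2.4, 3.0, 6.0, 3.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # pi_R from Table 3

mtbf_217f = parts_count_mtbf(counts, rates)           # 1 / 2.85
mtbf_217g = parts_count_mtbf(counts, rates, reality)  # 1 / 6
print(round(mtbf_217f, 4), round(mtbf_217g, 4))
```

The tables report these as 350.9 and 166.7, that is, the same values times 1000. That scaling presumably reflects rates quoted per 1000 time units, which is an assumption here rather than something the tables state.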

Figure 1. Predicted and observed Paretos for two switches: The MIL-HDBK-217G MTBF predictions for the 2400 and 2800 products were close except for motherboard.

Math

Compute each part’s predicted failure fraction, Ni(λG*πQ)i/λEQUIP, for the equipment observed in the field, and compare it with the part’s observed fraction of equipment failures, its Pareto. If the predicted fraction (“Percent” in Table 2) exceeds the Pareto (“P(i)” in Table 2), then πR = 1. If the predicted fraction is less than P(i), then the part needs adjustment: reorder the parts so that those whose Pareto is greater than the predicted failure fraction, on the equipment observed in the field, come first. Let k denote the number of parts for which this is true, and let n denote the total number of parts on the equipment observed in the field, k < n.

The reality factor for the i-th part type is 

$$ \displaystyle \pi_{R}=\left[A^{-1}b\right]_{i}\diagup\left(\lambda_{G}\pi_{Q}\right)_{i} $$

where [A⁻¹b]i is the i-th element of the vector A⁻¹b, A is the matrix in Table 1, and b is the vector [P(1), P(2), …, P(k)] transposed times ∑(λG*πQ)i, where the sum is from k+1 to n. The reality factor adjusts the equipment reliability prediction by adjusting some parts’ failure fractions upward, in the direction of field failure experience. Naturally this increases the predicted equipment failure rate and therefore lowers the MTBF prediction (from 350.9 to 166.7 in Table 3).
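As a check on the matrix formula, the system can be solved in closed form, assuming A has the structure shown in Table 1 (the identity matrix minus P(i) in every entry of row i). The symbols p, **1**, S and σ are notation introduced only for this sketch.

$$ \displaystyle A=I-p\,\mathbf{1}^{T},\qquad p=\left[P(1),\ldots,P(k)\right]^{T},\qquad b=S\,p,\qquad S=\sum_{i=k+1}^{n}\left(\lambda_{G}\pi_{Q}\right)_{i},\qquad \sigma=\sum_{j=1}^{k}P(j) $$

$$ \displaystyle A^{-1}=I+\frac{p\,\mathbf{1}^{T}}{1-\sigma}\quad\text{(Sherman-Morrison, valid for }\sigma<1\text{)} $$

$$ \displaystyle \left[A^{-1}b\right]_{i}=S\,P(i)\left(1+\frac{\sigma}{1-\sigma}\right)=\frac{S\,P(i)}{1-\sigma},\qquad \pi_{R}=\frac{S\,P(i)}{(1-\sigma)\left(\lambda_{G}\pi_{Q}\right)_{i}} $$

In the spreadsheet example, S = 1.2 and σ = 0.8, so [A⁻¹b]i = 6·P(i): 2.4 for the part with a 40% Pareto and 0.6 for the parts with 10% Paretos, which agrees with the A⁻¹*b column of Table 4.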

Table 1. Matrix of the ratios of part failure rates divided by equipment failure rates, P(i)=Ni(λG*πQ)i/λEQUIP.

| A-matrix | | | | b |
|---|---|---|---|---|
| 1-P(1) | -P(1) | Etc. | -P(1) | P(1)*∑(λG*πQ)i |
| -P(2) | 1-P(2) | -P(2) | -P(2) | P(2)*∑(λG*πQ)i |
| Etc. | Etc. | Etc. | Etc. | Etc. |
| -P(k) | -P(k) | Etc. | 1-P(k) | P(k)*∑(λG*πQ)i |

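Table 2. Spreadsheet implementation. This is the data before MTBF adjustment. List parts in decreasing order of their Paretos P(i). MTBF is the inverse of the sum of the parts’ Ni(λG*πQ)i. Percent is each Ni(λG*πQ)i divided by their sum.

| Part | Count | Part failure rate λG | Ni(λG*πQ)i | Percent | Pareto P(i) |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 35.09% | 40.00% |
| 2 | 1 | 0.2 | 0.2 | 7.02% | 10.00% |
| 3 | 1 | 0.1 | 0.1 | 3.51% | 10.00% |
| 4 | 1 | 0.2 | 0.2 | 7.02% | 10.00% |
| 5 | 1 | 0.15 | 0.15 | 5.26% | 10.00% |
| 6 | 1 | 0.25 | 0.25 | 8.77% | 4.00% |
| 7 | 1 | 0.3 | 0.3 | 10.53% | 4.00% |
| 8 | 1 | 0.25 | 0.25 | 8.77% | 4.00% |
| 9 | 1 | 0.15 | 0.15 | 5.26% | 4.00% |
| 10 | 1 | 0.25 | 0.25 | 8.77% | 4.00% |
| Sum | | | 2.85 | 100% | 100% |
| | | MTBF | 350.9 | | |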
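The Percent column and the choice of which parts need a reality factor can be reproduced with a short Python sketch. The variable names are illustrative, and the selection rule (adjust a part when its Pareto exceeds its predicted Percent) is read from the Math section above.

```python
# Table 2 inputs: counts N_i, rates (lambda_G * pi_Q)_i, and field Paretos P(i).
counts = [1] * 10
rates = [1, 0.2, 0.1, 0.2, 0.15, 0.25, 0.3, 0.25, 0.15, 0.25]
paretos = [0.40, 0.10, 0.10, 0.10, 0.10, 0.04, 0.04, 0.04, 0.04, 0.04]

part_rates = [n * r for n, r in zip(counts, rates)]  # N_i * (lambda_G * pi_Q)_i
equip_rate = sum(part_rates)                         # lambda_EQUIP
percent = [x / equip_rate for x in part_rates]       # predicted failure fractions

# A part needs a reality factor when its field Pareto exceeds its prediction.
needs_adjustment = [p > pct for p, pct in zip(paretos, percent)]
k = sum(needs_adjustment)

for part, (pct, p, flag) in enumerate(zip(percent, paretos, needs_adjustment), 1):
    print(f"part {part:2d}  Percent {pct:7.2%}  Pareto {p:7.2%}  adjust={flag}")
print("k =", k)
```

Because Table 2 already lists the parts in decreasing order of Pareto, the parts flagged for adjustment come out as the first k rows, which is the ordering the A matrix expects.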

Table 3. Spreadsheet implementation. Compute the reality factors and consequent adjusted MTBF prediction. The Reality Factor for part-type i is [A⁻¹*b]i/(λG*πQ)i, or 1.0 if the part’s Pareto(i) does not exceed its Percent(i). The Excel formulas for the “Reality Factors” are MMULT(A27:E27,b)/C6, MMULT(A28:E28,b)/C7, etc., where A27:E27 is the first row of A⁻¹ and C6 is the part 1 failure rate.

| Part | λG | Ni(λG*πQ)i | Reality Factor | New total |
|---|---|---|---|---|
| 1 | 1 | 1 | 2.40 | 2.4 |
| 2 | 0.2 | 0.2 | 3.00 | 0.6 |
| 3 | 0.1 | 0.1 | 6.00 | 0.6 |
| 4 | 0.2 | 0.2 | 3.00 | 0.6 |
| 5 | 0.15 | 0.15 | 4.00 | 0.6 |
| 6 | 0.25 | 0.25 | 1.00 | 0.25 |
| 7 | 0.3 | 0.3 | 1.00 | 0.3 |
| 8 | 0.25 | 0.25 | 1.00 | 0.25 |
| 9 | 0.15 | 0.15 | 1.00 | 0.15 |
| 10 | 0.25 | 0.25 | 1.00 | 0.25 |
| Fail rate | 2.85 | 2.85 | | 6 |
| MTBF | Old | 350.9 | Adjusted | 166.7 |

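Table 4. The A matrix and the b vector. The A matrix is from the parts where Paretos exceed failure rates.

| A | | | | | b |
|---|---|---|---|---|---|
| 60% | -40% | -40% | -40% | -40% | 0.48 |
| -10% | 90% | -10% | -10% | -10% | 0.12 |
| -10% | -10% | 90% | -10% | -10% | 0.12 |
| -10% | -10% | -10% | 90% | -10% | 0.12 |
| -10% | -10% | -10% | -10% | 90% | 0.12 |

| A⁻¹ | | | | | A⁻¹*b |
|---|---|---|---|---|---|
| 3 | 2 | 2 | 2 | 2 | 2.4 |
| 0.5 | 1.5 | 0.5 | 0.5 | 0.5 | 0.6 |
| 0.5 | 0.5 | 1.5 | 0.5 | 0.5 | 0.6 |
| 0.5 | 0.5 | 0.5 | 1.5 | 0.5 | 0.6 |
| 0.5 | 0.5 | 0.5 | 0.5 | 1.5 | 0.6 |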
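For readers who prefer code to a spreadsheet, the steps in Tables 3 and 4 can be sketched in plain Python. This is an illustration under stated assumptions, not the spreadsheet itself: the helper `solve` is a generic Gauss-Jordan routine standing in for Excel's matrix inverse and MMULT, all part counts are 1 as in Table 2, and k = 5 is taken from the comparison of Percent and Pareto.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        piv = m[col][col]
        m[col] = [v / piv for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Table 2 inputs; counts are all 1, so N_i is omitted (an assumption of this sketch).
rates = [1, 0.2, 0.1, 0.2, 0.15, 0.25, 0.3, 0.25, 0.15, 0.25]
paretos = [0.40, 0.10, 0.10, 0.10, 0.10, 0.04, 0.04, 0.04, 0.04, 0.04]
k = 5  # parts whose Pareto exceeds their predicted Percent

tail_rate = sum(rates[k:])  # sum of (lambda_G * pi_Q)_i for i = k+1..n, here 1.2
a = [[(1.0 if i == j else 0.0) - paretos[i] for j in range(k)] for i in range(k)]
b = [paretos[i] * tail_rate for i in range(k)]

adjusted = solve(a, b)  # the A^-1 * b column of Table 4
reality = [adjusted[i] / rates[i] for i in range(k)] + [1.0] * (len(rates) - k)
new_total = sum(r * f for r, f in zip(rates, reality))
adjusted_mtbf = 1.0 / new_total

print([round(f, 2) for f in reality])
print(round(new_total, 4), round(adjusted_mtbf, 4))
```

After the adjustment, each of the first k parts has a failure fraction equal to its field Pareto (for example 2.4/6 = 40% for part 1), which is the matching property the reality factor is meant to deliver.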

If the formula for the reality factor seems daunting, ask pstlarry@yahoo.com for the spreadsheet. Or send your field Paretos, parts count and reliability predictions for the parts on the equipment. I’ll compute the reality factors and send them back. 

Correlation vs. Causation?

Correlation is between random variables such as times to failures, not between an MTBF prediction (number) and an observed MTBF (random variable). An MTBF prediction is not a random variable, even though a sample MTBF is a random variable. (Jais, Werner, and Das gathered predicted and test sample MTBFs to compute the correlation reported in their RAMS article. They did not explain how they computed test sample MTBF.)

The MIL-HDBK-217G MTBF prediction using Paretos is a random variable that depends on the distributions of the Paretos (proportions). It is legitimate to ask for the correlation of the MIL-HDBK-217G MTBF prediction and the subsequent field MTBF. It depends on how well the old Paretos match field Paretos (new) and how much the predicted and field MTBFs depend on known Paretos. 

For a one-part product, correlation of two Binomial distributions, Corr(old Pareto, field Pareto)=0. For a two-part product, Corr(Beta(a1, a2), Beta(b1, b2)) can be computed from the BetaBinomial(n2, a, b) distribution, where old Pareto has Beta[a, b] distribution. For products with more than two parts, the joint distributions of Paretos are Dirichlet distributions.

Field MTBF = TTT/r, where the Total Time on Test is TTT = ∑(Yi) + (n-r)Yr, the sum is over the r observed failures, Yr is the end of test time, and n-r is the number of survivors [Teyim Karibo, LinkedIn ASQ RRD post July 2, 2024]. Field MTBF is limited by the life of the oldest failure and is biased low. The same is true of test MTBF.
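A minimal Python sketch of the total-time-on-test estimate follows. It reads the formula above as TTT = ∑Yi + (n−r)Yr and MTBF = TTT/r, with observation ending at the last recorded failure. The function name `ttt_mtbf` and the failure ages in the usage lines are hypothetical, chosen only to illustrate the arithmetic.

```python
def ttt_mtbf(failure_ages, n_units):
    """Estimate MTBF as total time on test divided by the number of failures.

    failure_ages: ages at failure of the r units that failed.
    n_units: total number of units observed (n); survivors are credited
             with the age of the last failure, Y_r.
    """
    ys = sorted(failure_ages)
    r = len(ys)
    if r == 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    if n_units < r:
        raise ValueError("n_units cannot be smaller than the number of failures")
    ttt = sum(ys) + (n_units - r) * ys[-1]  # failed units plus survivors
    return ttt / r


# Hypothetical data: 5 units observed, 3 failures at 100, 200 and 300 hours.
estimate = ttt_mtbf([100, 200, 300], n_units=5)
print(estimate)  # (600 + 2 * 300) / 3 = 400.0
```

The survivors contribute only Yr each, however long they would actually have lasted, which is one way to see why the estimate cannot reflect lives longer than the oldest observed failure.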

Reliability Prediction is not MTBF Prediction

Reliability is a function of age, not a number like MTBF. Credible reliability predictions are harder to make than MTBF predictions. Fortunately, credible reliability predictions that incorporate causation resemble age-specific field reliability, because many products, parts, designs, production, shipping, installations, customers, and environments are the same generation after generation.

I proposed using nonparametric, age-specific, field failure rates (not MTBFs) scaled by the ratios of MTBF(new)/MTBF(old) for reliability predictions, such as for the products in figure 1. Nonparametric estimates of field reliability and failure rate functions are available, without lifetime data, even for recurrent processes, from ships and returns counts required by GAAP!

Harold Williams, ASQ Reliability Review editor, nicknamed my “Credible Reliability Prediction” monograph “CRP”. CRP doesn’t predict MTBF; it predicts new-product reliability functions, using observed field reliabilities of parts, ratios of old and new MTBFs, and the new-product reliability block diagram or “structure function”. Fred Schenkelberg invited me to HP to explain CRP to his colleagues.

CRP predicts the age-specific failure rate functions for each component or subsystem: λ(t; i) = λ(t; i; old)*EXP[MTBF(old)/MTBF(new)], because that is all that is known at the time of reliability prediction: λ(t; i; old), observed MTBF(old), and predicted MTBF(new). (Don’t use Paretos in MTBF(old).) This is known as a proportional hazards model. The Credible MTBF prediction is

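$$ \displaystyle \mathrm{CRP\ MTBF}=1\Big/\sum_{i}N_{i}\left(\lambda(t;i;\mathrm{old})\,\exp\left[\mathrm{MTBF(old)}/\mathrm{MTBF(new)}\right]\pi_{Q}\right)_{i} $$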

Causation is not correlation, but incorporating Paretos helps make credible MTBF and reliability predictions closer to reality, by incorporating causation.

References

L. L. George, “MIL-HDBK-217G (George),” ASQ Reliability Review, Vol. 12, no. 3, June 1992

L. L. George, “Credible Reliability Prediction,” 2nd edition, https://drive.google.com/file/d/1vxzrQUQKciZ1uyB1ZF_O-m4VcK6oVZe8/view/, 2023

L. L. George, “User Manual for Credible Reliability Prediction,” https://drive.google.com/file/d/1za5KT_qsF2sCSzGO7xi2EoHONBz2PwtZ/view/, 2023

Kirk Gray, “No Evidence of Correlation: Field failures and Traditional Reliability Engineering,”  https://nomtbf.com/2012/02/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/, Feb. 2012

Kirk Gray, “No Evidence of Correlation: Field failures and Traditional Reliability Engineering,” Weekly Update, https://fred-schenkelberg-project.prev01.rmkr.net/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/, April 21, 2024

Christopher Jais, Benjamin Werner, and Diganta Das, “Reliability Predictions – Continued Reliance on a Misleading Approach,” 2013 Proceedings Annual Reliability and Maintainability Symposium (RAMS), Orlando, FL, USA, pp. 1-6, Jan. 2013

Milton Ohring, Reliability and Failure of Electronic Materials and Devices, 2nd edition, October 14, 2014

Fred Schenkelberg, “Field Data and Reliability,” Weekly Update, https://fred-schenkelberg-project.prev01.rmkr.net/field-data-reliability/

Fred Schenkelberg, “Fundamentals of Pareto Analysis,” Webinar, https://fred-schenkelberg-project.prev01.rmkr.net/accendo-courses/accendo-reliability-webinar-series/lessons/quality/topic/fundamentals-of-pareto-analysis/, July 2024

Fred Schenkelberg, “Who are you Fooling with MTBF Predictions?” Accendo Weekly Update, https://fred-schenkelberg-project.prev01.rmkr.net/reliabilty-predictions/, August 2024 

US DoT FAA, “System Safety Analysis and Assessment for Part 23 Airplanes,” AC No: 23.1309-1E, 11/17/2011


About Larry George

UCLA engineer and MBA, UC Berkeley Ph.D. in Industrial Engineering and Operations Research with minor in statistics. I taught for 11+ years, worked for Lawrence Livermore Lab for 11 years, and have worked in the real world solving problems ever since for anyone who asks. Employed by or contracted to Apple Computer, Applied Materials, Abbott Diagnostics, EPRI, Triad Systems (now http://www.epicor.com), and many others. Now working on actuarial forecasting, survival analysis, transient Markov, epidemiology, and their applications: epidemics, randomized clinical trials, availability, risk-based inspection, Statistical Reliability Control, and DoE for risk equity.


Comments

  1. Larry George says

    September 24, 2024 at 3:59 PM

    I congratulate Kirk Gray and John Paschkewitz on their success with testing and HALT plans. Testing is necessary.
    I apologize for criticizing complaints about MTBF prediction correlation with field MTBF.
    Bad MTBF predictions deserve criticism.
