Accendo Reliability

Your Reliability Engineering Professional Development Site

by Kirk Gray 6 Comments

No Evidence of Correlation: Field failures and Traditional Reliability Engineering

Historically, reliability engineering of electronics has been dominated by two beliefs: 1) that the life of complex hardware, or the percentage of failures that occur over time, can be estimated, predicted, or modeled, and 2) that the reliability of electronic systems can be calculated or estimated through statistical and probabilistic methods in order to improve hardware reliability. The amazing thing is that, during the many decades in which reliability engineers have been taught this and have believed it to be true, little if any empirical field data from the vast majority of verified failures has shown any correlation with calculated predictions of failure rates.

Probabilistic statistical predictions based on broad assumptions of the underlying physical causes began with the first electronics reliability prediction guide in November 1956, the RCA release TR-1100, “Reliability Stress Analysis for Electronic Equipment,” which presented models for computing rates of component failures. This publication was followed by the “RADC Reliability Notebook” in October 1959 and the publication of a military reliability prediction handbook format known as MIL-HDBK-217.

This approach continues today in various software applications, which are descendants of MIL-HDBK-217. Underlying these “reliability prediction assessment” methods and calculations is the assumption that the main driver of unreliability is components with intrinsic failure rates moderated by absolute temperature. It has been assumed that component failure rates follow the Arrhenius equation and approximately double for every 10 °C rise in temperature.
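To make that assumption concrete, here is a minimal Python sketch of the Arrhenius acceleration factor, AF = exp[(Ea/k)(1/T_use − 1/T_stress)], with temperatures in kelvin. The function names and the temperatures chosen are illustrative assumptions, not values taken from MIL-HDBK-217 or any other handbook. The sketch solves for the activation energy at which a 10 °C rise exactly doubles the rate at a 25 °C baseline, then applies that same activation energy at hotter baselines.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvin."""
    return celsius + 273.15


def acceleration_factor(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius acceleration factor between a use and a stress temperature."""
    t_use = to_kelvin(t_use_c)
    t_stress = to_kelvin(t_stress_c)
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_stress))


def activation_energy_for_doubling(t_use_c: float, step_c: float = 10.0) -> float:
    """Activation energy (eV) at which a rise of step_c doubles the rate at t_use_c."""
    t_use = to_kelvin(t_use_c)
    t_stress = to_kelvin(t_use_c + step_c)
    return math.log(2.0) * BOLTZMANN_EV_PER_K / (1.0 / t_use - 1.0 / t_stress)


# Illustrative baselines only (assumed, not from any handbook).
ea = activation_energy_for_doubling(25.0)
print(f"Ea that doubles the rate from 25 to 35 degC: {ea:.3f} eV")
for base_c in (25.0, 55.0, 85.0):
    af = acceleration_factor(ea, base_c, base_c + 10.0)
    print(f"{base_c:.0f} -> {base_c + 10.0:.0f} degC: AF = {af:.2f}")
```

Even inside the Arrhenius model, the "doubling per 10 °C" rule holds only for one particular pairing of activation energy and baseline temperature; with the same activation energy, the factor falls below 2 at hotter baselines. A mechanism with no thermally activated component, such as a loose connector, has no meaningful activation energy to plug in at all.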

MIL-HDBK-217 was withdrawn as a military reference document in 1996 and has not been updated since. It is still referenced unofficially by military contractors and is still believed to have some validity, even without any supporting evidence. [Note: as of the winter of 2024, I heard that MIL-HDBK-217 had been reactivated as a military standard – not updated, just pulled out of retirement.]

Much of the slow change in the industry is because electronics reliability engineering has a fundamental “knowledge distribution” problem: actual field failure data and the root causes of those failures can never be shared with the larger reliability engineering community. Reliability data is some of the most confidential and sensitive data a manufacturer has, and short of a court order, it will never be published. Without this data and information being disseminated and shared, one can expect little change in the beliefs of the vast majority of the electronics reliability engineering community.

Even though the probabilistic prediction approach to reliability has been practiced and applied for decades, any engineer who has seen the root causes of verified field failures will observe that almost all failures that occur before the electronic system is technologically obsolete are caused by 1) errors in manufacturing, 2) overlooked design margins, or 3) accidental overstress or abuse by the customer. The timing of the root causes of these failures, which are often driven by multiple events or stresses, is random and inconsistent. Therefore, there is no basis for applying statistical or probabilistic predictive methods. Most users of predictions have observed the non-correlation between estimated and actual failure rates.

It is long past time for electronics design and manufacturing organizations to abandon these invalid and misleading approaches, acknowledge that reliability cannot be estimated from assumptions and calculations, and start using “stress to limits” testing to find latent failure mechanisms before a product is released to the market. It is true that this approach does not yield a time to failure for most systems, but no test can provide an actual field “life” estimate for a complex electronic system, nor do we need one. There is more life than is needed in most electronics for most applications.

Fortunately, there is an alternative. A more pragmatic and practical approach is to put most engineering and testing resources into discovering overlooked design margins or the weakest link early in the design process (HALT), and then to use that strength and durability to quickly screen for errors during manufacturing (HASS). HALT and HASS have little to do with a specific type of chamber or chamber capabilities. They represent a fundamental change in the frame of reference for reliability development, moving from time metrics to stress/limit metrics. Many have already adopted this new frame of reference. Since they have found these methods much more efficient and cost-effective for developing robust electronic systems, they have a competitive advantage. They are not about to let the world or their competitors know how successful these methods are.

Filed Under: Articles, NoMTBF Tagged With: Field data analysis, Highly Accelerated Life Testing (HALT)

About Kirk Gray

My Passion for developing reliable products

Why did it fail?

This is the fundamental question that has driven my career, from first repairing electronics in the 1970s to today. It was from this perspective that my passion for reliability engineering grew: investigating, discovering, and understanding why products fail. Starting with how electronic systems actually fail (empirical, not theoretical) gave me a frame of reference for understanding ways to rapidly discover failure mechanisms.

Comments

  1. Arjan van Druten says

    May 6, 2013 at 7:42 PM

    Good post! Keep on improving reliability by understanding why and how components and systems fail, what the worst scenarios are and eliminate the (root) hazards and/or protect against overload and system failures. Physics based Engineering approaches and HALT / HASS testing should be encouraged, not empirical Accounting studies (this is already done far too often and may result in “The numbers Game” and promotes re-active management…). Reliability for new, complex systems can not be predicted!

    • Kirk Gray says

      May 6, 2013 at 8:07 PM

      Thanks Arjan for your comments, I think you “get it” and hope you keep spreading the best way to build reliable electronics!

  2. Mike Sparks says

    March 2, 2015 at 7:39 AM

    Hi,
    Thanks for a thought-provoking article. I can easily agree that SR-332 predictions don’t match observed reliability from field data, and I’ve seen several studies that show this to be the case. And I can also agree that wear-out isn’t important for electronics products; these days even the fans have an expected life that is longer than the product is likely to remain in service. On a related question, what is the evidence that Arrhenius’ Law is or isn’t valid for electronics? I’d guess there is some relationship between temperature and failure rate; what is that relationship? Are there any studies or tests that cover this? It is an important question if you want to calculate acceleration factors for an accelerated life test.
    Regards
    Mike

    • Kirk Gray says

      March 2, 2015 at 9:35 AM

      Hi Mike, thanks for your comments and concurrence with my assertions.
      For certain, there are physical failure mechanisms that have a chemical reaction element and therefore may have an Arrhenius law relationship. That being said, the vast majority of physical failure mechanisms in electronics at the system level have no relation to Arrhenius (i.e., loose connectors, solder defects, via cracking), and it has been widely assumed and misapplied in reliability development. In many cases it has added unnecessary costs and possibly made a system less reliable. You can get a PDF copy of a paper by Michael Pecht and me on long-term high-temperature testing of PCs here: http://www.acceleratedreliabilitysolutions.com/images/Long-Term_Overstressing_of_Computers.pdf from my website, http://www.acceleratedreliabilitysolutions.com.
      You might also be interested in another paper too! It is written by the US Government and is public domain so please reprint and distribute widely – http://www.acceleratedreliabilitysolutions.com/images/Reliability_Predictions_Continued_Reliance_on_a_Misleading_Approach.pdf

      • Larry George says

        July 11, 2024 at 11:18 AM

        Page not found. Try
        https://users.neo.registeredsite.com/8/4/9/23074948/assets/Reliability_Predictions_Continued_Reliance_on_a_Misleading_Approach.pdf
        C. Jais, B. Werner and D. Das, “Reliability predictions – continued reliance on a misleading approach,” 2013 Proceedings Annual Reliability and Maintainability Symposium (RAMS), Orlando, FL, USA, 2013, pp. 1-6, doi: 10.1109/RAMS.2013.6517751.

    • Mitch Finne says

      April 18, 2024 at 12:29 PM

      The Arrhenius equation should only be used in cases of thermodynamically driven phenomena, since that is what it describes via the ‘activation energy’ component. As Fred astutely pointed out, items like user or assembly issues (improperly seated connector) will not be properly described by an Arrhenius relationship.


© 2025 FMS Reliability