Accendo Reliability

Your Reliability Engineering Professional Development Site


by Christopher Jackson

The Root Cause of a Failure is Always a Decision


We often get sucked into drawn-out conversations (or heated debates) about the ‘true’ meaning of words. Especially when it comes to sports. Was James Harden (a basketball player) in the ‘act of shooting’ when he was fouled? It matters – because if the answer is ‘yes’ he gets up to three free throws. So what does the ‘act of shooting’ mean and who decides it? There will be endless debate over beers about what this means. Perhaps largely dependent on which team you support.

At the end of the day, it usually doesn’t matter. You can debate it as much as you want, but the referees have already decided what happened on the court. It is done. It is over. You can disagree with them. But nothing changes the score.

Words are important.

There is a difference between ‘taxonomy’ and ‘semantics.’ ‘Taxonomy’ is all about naming things and phenomena to make sure there is a common understanding about what those things are. We engineers do this a lot. Or think we do. ‘Semantics’ is a branch of linguistics that focuses on the meaning of words and phrases. If we have got our ‘taxonomy’ down … there should be no room for ‘semantics.’

In practice, this is not the case. One of my favorite bits of nonsense that routinely appears in the world of reliability engineering is …

… a ‘failure free’ period – which is a period in which the probability of failure is some small value.
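To see why the term grates, here is a minimal sketch of what a ‘failure free’ period looks like numerically. It assumes a two-parameter Weibull life distribution, and the shape, characteristic life, and period length are hypothetical values chosen only for illustration. They do not come from any real product.

```python
import math


def weibull_failure_probability(t, shape, scale):
    """Probability of failure by time t for a two-parameter Weibull distribution."""
    return 1.0 - math.exp(-((t / scale) ** shape))


# Hypothetical numbers, assumed purely for illustration.
SHAPE = 2.0                 # shape > 1 represents wear-out behavior
SCALE_HOURS = 10_000.0      # characteristic life
FAILURE_FREE_HOURS = 500.0  # the advertised 'failure free' period

p_fail = weibull_failure_probability(FAILURE_FREE_HOURS, SHAPE, SCALE_HOURS)
print(f"P(failure within {FAILURE_FREE_HOURS:.0f} h) = {p_fail:.4%}")
```

Under these assumptions the probability is small, but it is not zero. The period is ‘low probability of failure,’ not ‘failure free.’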

I have wasted many hours of my life waiting for people to finish debating the meaning of words and terms that should not be debatable. What is a ‘fault?’ What is a ‘failure mode?’ Is this different to a ‘functional failure mode?’ … or a ‘physical failure mode?’ Too often these debates and arguments aren’t about progressing the conversation or solving a problem. A good number of them degenerate into petty competitions of egos about who can be ‘right.’

But perhaps my favorite term that spawns many frivolous arguments is ‘root cause.’ If an aircraft turbine failed due to manufacturing-related ladder cracking that initiated fatigue failure later on … what is the ‘root cause?’ Is it ‘ladder cracking?’ NO. Here is why.

The only thing we can change is our BEHAVIORS. And what are ‘behaviors?’ Decisions. Everything we do, say, write, acknowledge or otherwise respond to is a decision we make. We can’t change physical phenomena. We can’t ever stop fatigue cracking from being a failure mechanism.

The only things we can change are the decisions that create the conditions for failure to occur. In the case of our aircraft turbine? The ladder cracking (at the time) was a well-known issue associated with manufacturing these turbines. Coolant is applied to drill bit tips to reduce the risk of these cracks forming. But the risk is real.

Which is why (in this case) these turbines are supposed to be subjected to routine Non-destructive Inspection (NDI). Now because the maintenance crew in this (real life) scenario did not identify the fatigue crack, the aircraft was allowed to continue flying until this failure occurred.

So again … what is the root cause? Perhaps you might be thinking ‘poorly executed inspection activity’ or something similar. Still not there yet. We need to investigate further. Was the maintenance crew properly trained? If not, the decision to not properly train them is a (potential) root cause. Was the maintenance crew properly supervised? If not, we have another (potential) root cause.

But here is the most common ‘root cause’ I have come across through my reliability engineering career:

management teams DECIDING to avoid accountability.

Most maintenance crews are trained. A lot of them are well supervised. But a significant fraction of them are over-tasked, under-paid, or time-poor. The management team’s solution is to simply delegate an ever-increasing swag of tasks to the worker bees of the organization, to the point that the workload becomes infeasible.

Have you ever been put in a position where your boss gives you (for example) ten tasks to complete in the next three months, and you only have the time or resources to do half of them? When you raise this issue with your boss and essentially try to force them to choose what your prioritized tasks need to be, do they reflect it back to you with ‘motherhood’ statements that absolve them of any responsibility?

I have. I bet you have too. And that is the root cause of many failures that cost you, your team, and broader society very deeply.

So when it comes to ‘root causes’ of failure, don’t stop until you identify an unambiguous decision (or failure to make a decision) by someone with the authority to control the ‘context.’

Depending on where you stand with semantics … this is often ‘culture.’
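The ‘don’t stop until you reach a decision’ rule can be sketched as a walk along a cause chain. This is a minimal illustration only: the `Cause` class, the `find_root_cause` function, and the chain itself are hypothetical. The chain loosely follows the turbine example, and its last two links are assumed for illustration rather than findings from the real case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cause:
    description: str
    is_decision: bool  # True if someone with authority chose (or failed to choose)
    caused_by: Optional["Cause"] = None


def find_root_cause(cause: Optional[Cause]) -> Optional[Cause]:
    """Follow 'caused_by' links until a decision is reached.

    Returns None if the chain runs out first, which means the
    investigation is not finished yet.
    """
    current = cause
    while current is not None:
        if current.is_decision:
            return current
        current = current.caused_by
    return None


# Hypothetical chain, assumed for illustration only.
management = Cause("Management declined to prioritize the crew's tasks", True)
overtasked = Cause("Maintenance crew was over-tasked", False, management)
missed = Cause("Fatigue crack not identified during NDI", False, overtasked)
cracking = Cause("Manufacturing-related ladder cracking", False, missed)
failure = Cause("Turbine fatigue failure", False, cracking)

root = find_root_cause(failure)
print(root.description if root else "Keep investigating")
```

Physical mechanisms such as ladder cracking are marked as non-decisions, so the walk never stops on them.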


Filed Under: Articles, on Product Reliability, Reliability in Emerging Technology Tagged With: Root Cause Analysis (RCA)

About Christopher Jackson

Chris is a reliability engineering teacher ... which means that after working with many organizations to make lasting cultural changes, he is now focusing on developing online, avatar-based courses that will hopefully make the 'complex' art of reliability engineering into a simple, understandable activity that you feel confident of doing (and understanding what you are doing).


© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy