Accendo Reliability

Your Reliability Engineering Professional Development Site


by Fred Schenkelberg

Deciding What Should Have Fault Tolerance


In some circumstances, it is desirable to ensure the system continues to operate even if there is an internal failure. An aircraft navigation system should be able to operate even if an internal dc-dc regulator fails, for example.

Not every element of a system benefits from being fault tolerant.

For example, the failure of a cabin reading light over a passenger seat is not critical to the safe operation of the aircraft, and thus the light is unlikely to be designed to be fault tolerant. One criterion for deciding what should be fault tolerant is the criticality of the function the system provides.

The same criterion applies to the subsystems within a system, so some elements may be designed to be fault tolerant while others in the same system are not.

How to Determine What Should Be Fault Tolerant

Besides the function itself, consider the elements (components) of the system, each of which contributes to the overall functional performance.

Evaluating each element with a few questions may help identify the specific elements that would benefit from fault tolerance.

How critical is the component to the essential function? If the component is a single point of failure for the essential function, it is a candidate.

How likely is the component to fail?

This becomes a tradeoff considering the added complexity of a fault tolerant design versus the need to employ the alternate means to provide the function if an element fails. Complexity, weight, and cost are just a few of the factors to consider. If an element is very unlikely to fail, the effort to create a fault tolerant system may be wasting resources better used to improve other elements of the system.
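The diminishing return from adding fault tolerance to an already reliable element can be illustrated with a small calculation. The sketch below is not from the article: it assumes the simplest form of fault tolerance, two identical elements in active parallel with independent failures, for which the combined reliability is 1 - (1 - R)^n. The function names and the reliability values are illustrative assumptions only.

```python
def parallel_reliability(r: float, n: int = 2) -> float:
    """Reliability of n identical, independent elements in active parallel.

    Assumes the function is provided as long as at least one element works.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - (1.0 - r) ** n


def redundancy_gain(r: float, n: int = 2) -> float:
    """Absolute reliability improvement from using n elements instead of one."""
    return parallel_reliability(r, n) - r


if __name__ == "__main__":
    # Hypothetical single-element reliabilities, chosen only for illustration.
    for r in (0.90, 0.99, 0.9999):
        print(
            f"single={r:.4f}  "
            f"duplicated={parallel_reliability(r):.8f}  "
            f"gain={redundancy_gain(r):.8f}"
        )
```

Under these assumptions, duplicating an element with a reliability of 0.90 adds about 0.09 to the reliability, while duplicating one at 0.9999 adds roughly 0.0001, yet the added weight, cost, and complexity are much the same in both cases. Real designs may also have common cause failures, which this simple model ignores.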

How expensive are the fault tolerant options? This is also part of the tradeoff: the more expensive and complex the fault tolerant design, the higher both the criticality and the likelihood of failure must be to justify it.
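The three questions above can be combined into a rough screening step. The following is a minimal sketch under stated assumptions, not a method given in the article: it treats an element as a candidate only if it is a single point of failure and if the expected loss from its failure (probability times consequence cost) exceeds the cost of the fault tolerant design. The `Element` fields, the decision rule, and every number are hypothetical, and the rule optimistically assumes that fault tolerance removes the loss entirely.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    single_point_failure: bool    # Q1: does its failure stop the essential function?
    failure_probability: float    # Q2: chance of failure over the period of interest
    failure_cost: float           # consequence of losing the function (hypothetical units)
    fault_tolerance_cost: float   # Q3: cost of the fault tolerant design (same units)


def expected_loss(element: Element) -> float:
    """Expected cost of failure without fault tolerance."""
    return element.failure_probability * element.failure_cost


def is_candidate(element: Element) -> bool:
    """Hypothetical screening rule; a real decision also weighs weight, complexity, and more."""
    if not element.single_point_failure:
        return False
    return expected_loss(element) > element.fault_tolerance_cost


# All values below are invented for illustration.
regulator = Element("navigation DC-DC regulator", True, 0.01, 5_000_000, 20_000)
reading_light = Element("cabin reading light", False, 0.05, 100, 50)
robust_part = Element("very reliable element", True, 1e-6, 5_000_000, 20_000)

if __name__ == "__main__":
    for element in (regulator, reading_light, robust_part):
        verdict = "candidate" if is_candidate(element) else "not a candidate"
        print(f"{element.name}: {verdict}")
```

With these invented numbers, only the regulator passes the screen: the reading light is not critical to the essential function, and the very reliable element fails so rarely that the expected loss does not cover the cost of the fault tolerant design.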

For some systems the need for fault tolerance is obvious, such as 911 emergency services phone systems, some banking and commerce systems, and the safety-critical systems of aircraft, public transit, or chemical plants. In other cases the decision to add some level of fault tolerance is not so clear and requires careful consideration of the costs, benefits, and criticality.


Related:

The Downside of a Fault Tolerant System (article)

Fault Tolerance Basics (article)

Benefits of Fault Tree Analysis (article)

 

Filed Under: Articles, CRE Preparation Notes, Reliability in Design and Development Tagged With: Fault tolerance

About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


© 2025 FMS Reliability