Accendo Reliability

Your Reliability Engineering Professional Development Site


by Mike Sondalini

Controlling the Human Factor 

How to conquer the last frontier in equipment reliability, maintenance, and industrial engineering asset management: Our machines and materials of construction do not cause our equipment problems and failures. The real problem for industry is the ‘human factor’: stopping people from making mistakes. You gain control over the ‘human factor’ by providing clear and comprehensive work procedures that explain exactly how to deliver the performance required and that give users the means to check and improve their performance. The Accuracy Controlled Enterprise 3T procedure layout and content lets you provide your people with the details of how to do their work with exceptional quality, and the means to correct and improve their efforts until they are that good.

Keywords: human factor, human element, human error, equipment reliability, failure prevention, defect elimination, standard operating procedures 

 The following extracts are from three sources investigating industrial plant and equipment failures. 

“1 Many managers and engineers believe most failures have a root cause in the equipment. Data from nuclear power plants (which maintain a culture of confessing failures and the roots of failures, in opposition to most industries, where the culture is to hide the roots of failures) show the following roots for failures: 

Early in the life of nuclear power plants – 

Design error: 35% [people induced problems – not calculation errors]
Random component failures: 18% [process/procedure problems]
Operator error: 12% [people/procedure problems]
Maintenance error: 12% [people/procedure problems]
Unknown: 12%
Procedure error & (procedure) unknowns: 10%
Fabrication error: 1% [people/procedure problems]
Total: 100%

Mature nuclear power plants – 

People: 38%
Procedures & Processes: 34%
Equipment: 28%
Total: 100%

“2 ASME (2002 report) shows a similar root for failures. For 10 years, from 1992-2001, 127 people died from boiler and pressure vessel accidents and 720 people were injured. In the 23,338 accident reports, 83% were a direct result of human oversight or lack of knowledge. The same reasons were listed for 69% of the injuries and 60% of recorded deaths. Data shows that if you concentrate only on the equipment, you miss the best opportunities for making improvements. Another point to seriously consider is that little or no capital expenditure is required for improving people, procedures, and processes, which can reduce failures. In case you believe that equipment is the biggest root of problems, it will be instructive to download (http://www.bpresponse.com) the Final Report of BP’s Texas City Refinery explosion and tick off the reasons behind the explosion, which took the lives of 15 people and maimed more than 200 additional people. You will see objective evidence for people, procedures, and processes as the major roots for failures. The #1 problem was not equipment!” 

“3… the major challenge to reliability theory was recognized when the theoretical probabilities of failure were compared with actual rates of failure [and the] actual rates exceed the theoretical values by a factor of 10 or 100 or even more. They identified the main reason for the discrepancy to be that the theory of reliability employed did not consider the effect of human error…. Human error in anticipating failure continues to be the single most important factor in keeping the reliability of engineering designs from achieving the theoretically high levels made possible by modern methods of analysis and materials…. nine out of ten recent failures [in dams] occurred not because of inadequacies in the state of the art, but because of oversights that could and should have been avoided…. the problems are essentially non-quantitative, and the solutions are essentially non-numerical.” 


The above quotes are evidence that the problems we have with our plant and equipment are not machine problems. Our machines are fine. The problems of poor equipment reliability, poor maintenance and poor production performance are in the minds and hearts of the people that control our companies, design and manage our business processes, and run and maintain our machines. 

The reason you have so many equipment and production failures is that you, your people and your business processes cause them. That is what the evidence in the three extracts above proves. Every equipment failure that is not an ‘Act of God’ is one that human beings let happen. 
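The arithmetic behind that claim can be checked in a few lines of Python. This is a minimal sketch and not part of the original analysis: the percentages are those quoted in the first extract, the grouping follows the bracketed annotations given there, and the variable names, as well as the decision to count ‘Procedure error & (procedure) unknowns’ as procedure related, are assumptions made for illustration.

```python
# Early-life nuclear plant failure roots, as quoted in the first extract.
# The second field is True when the extract's bracketed note ties the cause
# to people, procedures or processes. Marking "Procedure error & (procedure)
# unknowns" as True is an assumption of this sketch; the extract gives that
# row no bracketed note.
EARLY_LIFE = {
    "Design error": (35, True),
    "Random component failures": (18, True),
    "Operator error": (12, True),
    "Maintenance error": (12, True),
    "Unknown": (12, False),
    "Procedure error & (procedure) unknowns": (10, True),
    "Fabrication error": (1, True),
}

# Mature-plant failure roots, as quoted in the first extract.
MATURE = {"People": 38, "Procedures & Processes": 34, "Equipment": 28}


def human_related_share(causes):
    """Sum the percentages of causes flagged as people/procedure/process related."""
    return sum(pct for pct, human_related in causes.values() if human_related)


early_total = sum(pct for pct, _ in EARLY_LIFE.values())
early_human = human_related_share(EARLY_LIFE)
mature_human = MATURE["People"] + MATURE["Procedures & Processes"]

print(f"Early life: {early_human}% of {early_total}% flagged as people/procedure/process")
print(f"Mature plants: {mature_human}% people, procedures and processes")
```

Under these groupings, equipment is the smaller share in both periods, which is the point the extracts make.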

If you want to make serious improvements to your plant and equipment reliability you need to first focus all your efforts and resources on changing attitudes and beliefs. You need to change the way you and your people think about, and value, quality and reliability. 

Moving from a repair-focused organization, where failure is seen as inevitable, maintenance is a servant giving fast response to failures, and reliability is the responsibility of an ‘elite’, to a reliability-focused organization with a culture of failure elimination that permeates staff at all levels requires a mindset change. It is driven by passionate management over a long time [4]. 

You start by installing the right processes and systems into your business that the people can follow. Read this quote about causing change in organisations.

“5 Changing collective values of adult people in an intended direction is extremely difficult, if not impossible. Values do change, but not according to someone’s master plan. Collective practices, however, depend on organisational characteristics like structures and systems, and can be influenced in more or less predictable ways by changing these.” 

You cannot change people’s internal values directly, but you can change the practices they must follow so that cognitive dissonance brings about a change in their values. Cognitive dissonance is the uncertainty and unhappiness that arises in your mind when you believe one thing but are forced to do something else. 

For example, if you want people to do high quality work, provide a high-quality procedure they must follow and a report sheet to complete and hand in at the end of every job so you can encourage and train them to do masterly work. If users who follow the procedures exactly produce better results than they ever achieved without them, people start to change their beliefs. Their old internal values change because the external evidence does not support them. This is cognitive dissonance in action. In this way the quality requirements built into the procedures bring about the necessary change in the value people put on careful observation, quality workmanship and accurate recording. You use your standard operating procedures to describe and create the ‘role model’ you want your people to follow. 

Creating Mind Changing Standard Operating Procedures 

Take for example this 12-monthly procedure used by an organisation to look after a dust collector fan and its drive. It is at a typical level of quality and content used in many organisations. It is a disaster waiting to happen. On a scale of 1 to 10, with 10 being exceptional, this procedure would rate a zero.

[Figure: example of details included in a task description]

On the surface the procedure looks passable. There are task instructions provided and it seems to cover the equipment that needs to be maintained. Unfortunately, this document will cause more equipment failures than it fixes. 

The problem is that there is no indication of the quality of work required or of the condition that the equipment must be in after the maintenance is done. With this procedure the people doing the work are totally ignorant of the quality of workmanship they must deliver. They are left to their own devices to decide how to do the job and what work quality they must achieve. If your machines are maintained using such ‘zero-grade’ procedures, you have a very serious problem to address very quickly, because your maintenance documents are making your machines break down. You have not controlled the ‘human factors’ that we now know are the real destroyers of equipment reliability. 

Below I’ve partially re-written the procedure with more effort made to specify the quality of the work to be performed and the condition in which the machine parts are to be left when the work is done. It took four hours of work to make the changes. 

On the scale of 1 to 10, I would rate this procedure at a 3. It is a lot better than what it started as, but it will not deliver world-class equipment performance because it doesn’t help people improve themselves. Achieving world-class work quality requires a different sort of procedure.

You could argue that this re-written procedure can be greatly improved. I would totally agree with that. I have left it as you see it because its purpose is to make clear that you need to make your procedures ‘human factor’ proof. And you do that by making sure every procedure contains every scrap of detail, and all the standards needed to do the work to the quality required to get the reliability you want. 

If you do not want reliable equipment, simply do not tell your operators and maintainers how to deliver reliability. The ‘human factor’ will make sure you get a matching level of equipment performance. 

Always remember what W. Edwards Deming said: “Your system is perfectly designed to give you the results that you get!” His quote truthfully explains why you get the results that you do; you designed them into your business systems, because you neglected to design them out! 

The Journey to World-Class needs Work Standardisation and Control of Work Quality 

Now I do not want to leave you wondering what you need to do to get a 10 on my scale of job procedure quality. If you want world-class reliability you need to specify exactly what that is, and exactly how to get it. You cannot leave it to the guesswork of engineers, operators, and technicians. You must exactly state what world-class quality looks like and make clear to people exactly what they need to do to get it. You need to change the layout and content of your procedures to the Accuracy Controlled Enterprise 3T style with tolerance banding. 

I’ve partially completed an ACE 3T (3T stands for Target, Tolerance, Test) procedure below for mating pipe flanges. I’d give it 5 out of 10. It is still not good enough because it has not yet been reviewed by two other experienced people and it has not yet been tested in the workplace to prove it produces the world-class performance it is meant to deliver. But it is already miles ahead of the quality and content value of flange installation procedures used by many companies. 

A 10-score level ACE 3T procedure that controls work quality and delivers high work quality assurance takes a lot of research and time to write. It needs the research to bring all the facts together. It can only be written by people that understand the engineering of the equipment and how its parts work and can fail. They must make decisions on the standards your people will always meet and on the quality of work you will always demand of them. 

You must change the mindsets and values of the people in your operation so that anything less than top-class is unacceptable. To make the journey to world-class you must seek, identify, and set world-class standards. You must make sure all your people have the intellectual capability, the quality of documentation, the depth of knowledge, the workmanship skills and precision tools to deliver those standards. In this way you control the ‘human factors’ that affect your plant and equipment, so they produce the right results.

Partially Complete Example of an ACE 3T Flange Bolting Procedure with Tolerance Banding 

This is an example of an Accuracy Controlled Enterprise (ACE) 3T procedure with tolerance bands to bolt together 80 NB, ANSI B16.5, forged steel, Class 150 flanges. Each task has a target with the allowed limits banded into ‘good, better, best’. It also provides instruction if the tolerance is not achieved. 

NOTE: The example covers the method to use to create a 3T procedure and is not the actual procedure to use when bolting-up flanges. Each organisation must research, develop and approve their safe practices and procedures for bolting flanges. The use of turn-of-nut on pressure flanges may not comply with the applicable pressure piping design codes. 

Flange Connection Procedure 

Importance of correctly mating flanges: This procedure explains how to correctly bolt up a pipe flange on 80 mm (3”) diameter pipe. Leaks of dangerous chemicals from pipe flanges create a safety and environmental hazard that can lead to the death of workmates and the destruction of production plant and equipment. Even a water leak from a flange causes slip hazards and makes an unsightly mess. Pipe flanges must be bolted up so they never leak. 

This procedure is our current best practice and you should follow it exactly. It is the result of many people’s efforts over many years. It is the quickest, best way yet found to do the job. You are encouraged to learn the job exactly as in this document. If after you master this procedure exactly, you believe that you know of improvements, please bring them forward for discussion. You can test your ideas and compare them to the procedure. If your suggestion proves to be better, it will become the new way of doing this job. 

Necessary Equipment and Tools: Approved gasket, ring spanners, sockets (do not use adjustable shifters and pipe wrenches as they damage corners of bolt heads and nuts making their removal dangerous and unsafe), suitably load-rated studs and nuts, pencil, feeler gauges. 

Task Summary 

A summary of the process of installing gaskets and making flanges is below. A fully detailed procedure is beneath the list. If you have a problem that you cannot solve please see your supervisor. 

1. Get work pack, tools, NEW fasteners and NEW gasket 

2. Get safe handover isolated and pipe drained 

3. Place personal danger tags, test if drained 

4. Break and spread flange safely 

5. Clean-up flange faces 

6. Check unrestrained pipe alignment 

7. Mount gasket and insert fasteners 

8. Pull-up fasteners snug tight in sequence 

9. Mark nut position and turn angle past snug 

10. Turn nuts to position in sequence 

11. Test flange for leakage at operating pressure 

12. Safely clean-up, hand-back, complete job record and sign-off Work Order 

Notice how the procedure specifies the standard and quality that must be achieved on the job. Nothing affecting quality is left to the discretion of the person doing the job. The 3T procedure clearly states the minimum acceptable outcome, called ‘good’, and it clearly states what top-class performance is in the ‘best’ column. Now everyone knows what ‘good enough’ is and what ‘the best’ looks like. 
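One way to picture a single 3T task line is as a small record holding the target, the three tolerance bands, the test, and the action to take when the tolerance is missed. The Python sketch below is illustrative only: the class name `ThreeTTask`, its fields, and every number in the example are hypothetical and are not values from any real flange procedure or design code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThreeTTask:
    """One task line of a 3T procedure: Target, Tolerance bands, Test."""
    description: str
    test: str                  # how the result is measured
    unit: str
    target: float              # the value the task aims for
    best: float                # largest deviation from target still rated 'best'
    better: float              # largest deviation still rated 'better'
    good: float                # largest deviation still rated 'good' (minimum acceptable)
    if_out_of_tolerance: str   # instruction when even 'good' is not achieved

    def grade(self, measured: float) -> Optional[str]:
        """Return the band the measurement falls in, or None if out of tolerance."""
        deviation = abs(measured - self.target)
        if deviation <= self.best:
            return "best"
        if deviation <= self.better:
            return "better"
        if deviation <= self.good:
            return "good"
        return None


# HYPOTHETICAL example values, chosen only to show the banding idea.
flange_alignment = ThreeTTask(
    description="Check unrestrained pipe alignment",
    test="Measure the difference in flange face gap with feeler gauges",
    unit="mm",
    target=0.0,
    best=0.25,
    better=0.5,
    good=1.0,
    if_out_of_tolerance="Stop work and see your supervisor",
)

for reading in (0.2, 0.4, 0.8, 1.5):
    band = flange_alignment.grade(reading)
    action = band if band else flange_alignment.if_out_of_tolerance
    print(f"{reading} {flange_alignment.unit}: {action}")
```

Keeping the corrective instruction in the same record as the bands mirrors the layout described above, where each task states what to do if the tolerance is not achieved.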

Figure 1 – Controlling Variability in Producing High Work Quality 

The ACE 3T approach provides you with a practical and certain way to control work quality regardless of who does the job. Everyone now knows what quality work is and is encouraged to strive for it. A 3T procedure acts to remove variability. It creates statistical control over work processes involving human activity – it prevents human error. You know what a worker will try to do and what they will try to produce. 3T procedures standardize performance and deliver repeatable outcomes. This standardizing and repeatability effect on work quality is shown in Figure 1. Instead of having a wide range of possible results, the 3Ts limit the results to those you specify. 
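The narrowing effect described above can be illustrated with a toy simulation. It is a sketch under stated assumptions, not a model of real work: job outcomes are assumed to be normally distributed around a target, the target, spread and tolerance values are hypothetical, and the ‘test, correct, retest’ loop is a simplification of what a 3T task asks of the person doing the job.

```python
import random
import statistics

TARGET = 100.0      # hypothetical target value, arbitrary units
SPREAD = 10.0       # hypothetical standard deviation of unaided work
GOOD_LIMIT = 5.0    # hypothetical widest allowed deviation (the 'good' band)


def uncontrolled_job(rng):
    """Work done with no stated standard: the first result is accepted as is."""
    return rng.gauss(TARGET, SPREAD)


def controlled_job(rng, max_attempts=50):
    """Work done to a 3T-style task: test the result, then correct and retest
    until it falls inside the tolerance band."""
    for _ in range(max_attempts):
        result = rng.gauss(TARGET, SPREAD)
        if abs(result - TARGET) <= GOOD_LIMIT:
            return result
    # The procedure's own instruction would apply here, e.g. see the supervisor.
    raise RuntimeError("tolerance not achieved; escalate")


rng = random.Random(1)  # fixed seed so the run is repeatable
uncontrolled = [uncontrolled_job(rng) for _ in range(2000)]
controlled = [controlled_job(rng) for _ in range(2000)]

print(f"Spread without 3T test: {statistics.stdev(uncontrolled):.2f}")
print(f"Spread with 3T test:    {statistics.stdev(controlled):.2f}")
```

The accepted results in the controlled case can never fall outside the stated band, so their spread is bounded by the tolerance rather than by the skill or habits of whoever did the job.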

You can get more information on how to write ACE 3T procedures that prevent human error at our website. 

My best regards to you, 

Mike Sondalini 

  1. Barringer, H. Paul, P.E., ‘Use Crow-AMSAA Reliability Growth Plots To Forecast Future System Failures’, www.barringer1.com
  2. Barringer, H. Paul, P.E., ‘Use Crow-AMSAA Reliability Growth Plots To Forecast Future System Failures’, www.barringer1.com
  3. Petroski, Henry, ‘Design Paradigms: Case Histories of Error and Judgment in Engineering’, Cambridge University Press, New York, 1994. Remarks on pages 7 and 8 about the role of humans in failures.
  4. Wardhaugh, Jim, extract from 2004 Singapore IQPC Reliability and Maintenance Congress presentation ‘Maintenance – the best practices’.
  5. Hofstede, G. J., ‘Cultures and Organisations – Software of the Mind’, Second Edition, McGraw-Hill.

About Mike Sondalini

In engineering and maintenance since 1974, Mike’s career extends across original equipment manufacturing, beverage processing and packaging, steel fabrication, chemical processing and manufacturing, quality management, project management, enterprise asset management, plant and equipment maintenance, and maintenance training. His specialty is helping companies build highly effective operational risk management processes, develop enterprise asset management systems for ultra-high reliable assets, and instil the precision maintenance skills needed for world class equipment reliability.
