Accendo Reliability

Your Reliability Engineering Professional Development Site


by Fred Schenkelberg

Are You Tracking and Reporting Field Failures Well?


Fielded products fail every day. Customers report these failures, generally seeking a remedy. Gathering the reported failures, returned products, or confirmed failures is common practice.

Depending on the product, a simple replacement or exchange may suffice. For other products, repair or a refund may be appropriate.

In general, though not always, when a product fails in the hands of a customer, the organization designing, manufacturing, and distributing the product learns of the failure.

A common practice is to count the number of returns per week or month, tallying the items as they arrive. This monthly tally is then easy to plot as a simple bar chart showing the count of returns per month over time.

The issue is that the number of units shipped changes month to month, so the number of items that could possibly fail changes as well. When we ship twice as many units, the number of field failures could double even though the actual failure rate for the product has not changed.

A Very Simple Example

Let’s look at a very simple example.

Suppose a new product has a 10% failure rate in the first month and no failures after the first month, and we ship 100 units. The first month we would receive 10 failed units back. If we ship 100 units per month for the first three months of the year, we would receive 10 units back each month.

Now let’s say in April another customer orders an additional 100 units, so we ship 200 units. Given the same failure rate, we would receive 20 units back. That doubles the number of returns month over month: a 100% increase in field returns per month.

In this very simple example, it is obvious that the number of units shipped doubled. Tracking the failure rate, rather than the count of returns, would be the appropriate measure, as we are interested in noticing a change in failure rate. Being able to identify such a change permits identification and resolution of the factors causing an increase in failure rate, or the continuation of whatever is causing a lower failure rate.
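The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration under the example's own assumption (10% of each month's shipments come back in the month they ship, and nothing fails later); the variable names are made up for the sketch.

```python
# Assumption carried over from the example: 10% of each month's shipments
# fail and come back within that same month; nothing fails later.
FIRST_MONTH_FAILURE_RATE = 0.10

shipments = {"Jan": 100, "Feb": 100, "Mar": 100, "Apr": 200}

# Expected returns per month and the failure rate relative to units shipped.
returns = {month: units * FIRST_MONTH_FAILURE_RATE for month, units in shipments.items()}
failure_rate = {month: returns[month] / shipments[month] for month in shipments}

for month in shipments:
    print(f"{month}: shipped {shipments[month]:3d}, "
          f"returns {returns[month]:4.0f}, failure rate {failure_rate[month]:.0%}")
```

The count of returns doubles in April while the failure rate stays flat, which is why the raw count alone can mislead.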

Two things complicate this approach: the number of units produced and shipped varies, and the chance of a specific unit failing changes over time.

Shipments Vary

First, we often change the actual number of shipments per unit time.

While the forecast for shipments or sales may include nice round numbers per month, in reality shipments are often quite variable. If shipments are planned to average 5,000 units per month, the long-term average may work out to 5,000 units per month, yet the actual monthly shipments may vary.

The first month may see only 100 units, as production started just days before the end of the month. The next month, while the production line is still ramping up, the team can produce, and thus ship, only 2,523 units. The third month, in order to meet early demand, the team works overtime and creates 6,467 units. And so on.

Variation in production capability, availability of necessary components and materials, holidays (production shutdowns), changes in customer demand, and many other elements change how many units are actually produced and shipped per month.

Failure Rates Vary

Even simple products have dozens, if not hundreds or thousands, of ways they can fail.

Each failure mechanism has a finite probability of occurring on any specific day. It’s a race to see which failure mechanism succeeds in causing a failure.
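The "race" between failure mechanisms can be illustrated with a small calculation. This is a rough sketch only: the mechanism names and daily probabilities below are invented for illustration, and it assumes the mechanisms act independently with probabilities that stay constant from day to day, which real mechanisms generally do not.

```python
import math

# Hypothetical daily failure probabilities for three mechanisms
# (invented values, for illustration only).
daily_probability = {
    "solder joint fatigue": 0.0002,
    "connector corrosion": 0.0001,
    "capacitor wear out": 0.00005,
}

def daily_failure_probability(probabilities):
    """Chance that at least one independent mechanism causes a failure on a given day."""
    survive_all = math.prod(1.0 - p for p in probabilities)
    return 1.0 - survive_all

def survival_probability(probabilities, days):
    """Chance a unit survives `days` days, assuming constant daily probabilities."""
    return (1.0 - daily_failure_probability(probabilities)) ** days

p_day = daily_failure_probability(daily_probability.values())
p_year = survival_probability(daily_probability.values(), 365)
print(f"Chance of failure on any given day: {p_day:.6f}")
print(f"Chance of surviving one year: {p_year:.3f}")
```

The unit fails as soon as any one mechanism "wins", so the chance of surviving a day is the product of the chances of surviving each mechanism.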

Consider a specific product that experienced an error during assembly, say a missing component for a specific function, and somehow shipped to a customer. It may fail immediately on first use, or the defect may lie dormant for months before that specific function is called into action and then exhibits the failure. Or, the missing part could lead to slow degradation of a function, resulting in a reported failure only many years after first use.

The same basic variability applies to each specific failure mechanism. A wear-out mechanism may occur early with aggressive overuse, or only after an exceptionally long period of light, infrequent use. Corrosion-related failure mechanisms may occur quickly or not at all, depending on the local humidity conditions.

In general, there is some pattern to specific failure mechanisms, yet they do exhibit variability in when failures occur.

Still a Simple Example

Let’s complicate the simple example described above. Instead of a fixed first-month failure rate of 10%, let’s say the product has the following number of returns, given 100 units initially shipped.

Month   Returns
Jan     1
Feb     5
Mar     4

At the end of three months, the total failure rate is 10%, yet in the first month it was only 1%, and the cumulative rate jumped to 6% by the end of the second month.

Now let’s imagine this organization ships 100 units in February and then again in March and each month’s production follows the same failure pattern. What would that look like over the first three months of production tracking cumulative shipments and returns per month?

Month   Cumulative returns   Cumulative shipments
Jan     1                    100
Feb     7                    200
Mar     17                   300
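The cumulative figures in this table can be reproduced by overlaying the same age-based return pattern on each month's shipment. The sketch below is one way to do that bookkeeping; the variable names are illustrative, and it assumes every monthly cohort of 100 units follows exactly the 1%, 5%, 4% pattern from the earlier table.

```python
from itertools import accumulate

months = ["Jan", "Feb", "Mar"]
shipments = [100, 100, 100]                 # units shipped each month
failure_fraction_by_age = [0.01, 0.05, 0.04]  # fraction returned in month 1, 2, 3 of life

monthly_returns = []
for m in range(len(months)):
    total = 0.0
    # Every cohort shipped in month c <= m contributes according to its age.
    for c in range(m + 1):
        age_index = m - c
        if age_index < len(failure_fraction_by_age):
            total += shipments[c] * failure_fraction_by_age[age_index]
    monthly_returns.append(total)

cumulative_returns = list(accumulate(monthly_returns))
cumulative_shipments = list(accumulate(shipments))
cumulative_rate = [r / s for r, s in zip(cumulative_returns, cumulative_shipments)]

for name, r, s, rate in zip(months, cumulative_returns, cumulative_shipments, cumulative_rate):
    print(f"{name}: cumulative returns {r:3.0f}, cumulative shipments {s:3d}, rate {rate:.1%}")
```

Each month's returns mix units of different ages, which is why neither the monthly count nor the cumulative rate lines up with the 10% that a three-month-old unit actually experiences.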

Plotting the number of failures per month alone is not informative. Plotting the cumulative failure rate accounts for the number of units shipped, yet it is still not very informative. The cumulative failure rates for the three months are 1%, 3.5%, and about 5.7%.

The problem is that customers who have had the product for three months have experienced a 10% chance of product failure, not 5.7%. Tracking cumulative failure rates using the cumulative number of returns and shipments underreports the failure rate, in this case for customers that have the initial month’s units, as those units are now three months old. It may take many more months to recognize the underlying pattern of failures based on the age of the individual units.

Tracking and reporting based on the age of the unit is a better approach. Time-to-failure analysis of the data allows us to consider the probability of failure over time, just as the customer experiences the product.
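As a rough illustration of age-based tracking, the sketch below applies a simple life-table style calculation to the three monthly cohorts of the example. This is an assumption about one reasonable way to do it, not necessarily the method described in the follow-up article; the data layout and function name are invented for the sketch.

```python
# Returns observed through March for each monthly cohort of 100 units,
# listed by unit age in months (first entry = first month in the field).
cohorts = {
    "Jan": {"shipped": 100, "returns_by_age": [1, 5, 4]},
    "Feb": {"shipped": 100, "returns_by_age": [1, 5]},
    "Mar": {"shipped": 100, "returns_by_age": [1]},
}

def cumulative_failure_by_age(cohorts):
    """Cumulative fraction failed by each age, pooling cohorts old enough to be observed."""
    max_age = max(len(c["returns_by_age"]) for c in cohorts.values())
    surviving_fraction = 1.0
    result = []
    for age_index in range(max_age):
        at_risk = 0
        failures = 0
        for c in cohorts.values():
            observed = c["returns_by_age"]
            if age_index < len(observed):
                # Units still working when they enter this month of life.
                at_risk += c["shipped"] - sum(observed[:age_index])
                failures += observed[age_index]
        surviving_fraction *= 1.0 - failures / at_risk
        result.append(1.0 - surviving_fraction)
    return result

failed_by_age = cumulative_failure_by_age(cohorts)
for age, fraction in enumerate(failed_by_age, start=1):
    print(f"Age {age} month(s): cumulative fraction failed {fraction:.1%}")
```

Organized by age, the same returns data shows roughly 1%, 6%, and 10% failed by the first, second, and third month of life, matching what the customer with a three-month-old unit experiences.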

The next article will describe a convenient way to track shipments and returns which allows the preservation of the time to failure information. How do you gather and report your field data?



About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


Comments

  1. Hanh Nguyen says

    December 1, 2019 at 7:52 PM

    Thanks for your article. In this article, you stated the following:

    “Tracking and reporting based on the age of the unit is a better approach. Time to failure analysis of the data allows us to consider the probability of failure over time, just as the customer experiences the product.”

    Do you have an article about this approach?

    • Fred Schenkelberg says

      December 2, 2019 at 9:22 AM

      Hi Hanh,

      Thanks for reading the article and your question.

      Yes, we do have many articles and podcast episodes on the topic. The overall concept of data analysis is broadly covered in these articles:

      https://fred-schenkelberg-project.prev01.rmkr.net/?s=data+analysis

      and one specific approach, Weibull analysis, is covered in these articles:

      https://fred-schenkelberg-project.prev01.rmkr.net/?s=weibull+analysis

      The first result in the Weibull list is a plotting tool (you will need to be logged into the site and signed up to review the tool in order to view or use it).

      Cheers,

      Fred



© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy