Accendo Reliability


by Fred Schenkelberg

Introduction to Ongoing Reliability Testing


This type of reliability testing goes by different names. A quick search of a few references in my library did not turn up ongoing reliability testing (ORT) in any of them.

It does exist and you may have heard of it before or even use some form of ORT. Or not.

Ongoing reliability testing, or ORT, is the continued evaluation of your product, typically using samples drawn from production. The testing evaluates the reliability performance of recent production units.

The focus is on finding anomalies or changes in the design, supply chain, or production process that significantly change field reliability performance.

What ORT Is and Is Not

ORT is some form of life test. It may be an accelerated life test, ALT, or it may rely on operating the test samples in as close to use conditions as possible, i.e. real-time testing.

ORT is not the same as burn-in or HASS, because ORT is not a screen to expose early life failures. Nor is ORT a verification of engineering changes, although it may be used in part for that function.

ORT is not lot sampling, although the sampling aspect of ORT may resemble lot sampling.

ORT draws samples on a regular basis from production.

The testing evaluates the samples for adverse changes in reliability.

In most cases it provides an early warning of unwanted changes that affect the durability or longevity of products recently introduced to the field.

An accelerated life test ORT example

A relatively high-volume hand-held game controller began production with a large amount of variability in the attachment of the pop filter.

We knew from testing during development that too little or too much adhesive along with variations in the component diameter would lead to premature failure.

We also knew the largest risk of damage during use was due to drops and the resulting shock, vibration, and associated accumulated damage.

Given the uncertainty about how many drops occur and under what conditions, along with the range of damage that may occur per drop, we did not have a clear sense that the design had sufficient margin to meet reliability targets.

We decided to conduct ORT, in this case, drop testing, in order to monitor production stability until upstream adhesive and dimension control processes improved.

We also wanted to monitor the durability of the device in order to detect adverse changes to the robustness to drop damage.

The testing sampled two units from each production line at random each week. The test replicated the drop testing done during development: each unit was dropped from 2 meters with random orientation until failure.

For each unit we counted the drops until screen separation and the drops until functional failure.

We used a CUSUM control chart to detect significant changes in the number of drops to these two failures.
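As a rough illustration of how such a chart can flag a drop in durability, here is a minimal sketch of a one-sided (lower) tabular CUSUM. The article does not give the chart parameters that were used, so the weekly data, the target of 50 drops, the allowance k, and the decision interval h below are all hypothetical values chosen for the example.

```python
def lower_cusum(values, target, k, h):
    """One-sided tabular CUSUM that accumulates shortfalls below target.

    values: observed drops-to-failure, in time order
    target: in-control mean (hypothetical, e.g. from development testing)
    k:      allowance; deviations smaller than k are ignored
    h:      decision interval; a signal is raised when the sum exceeds h
    Returns the CUSUM path and the index of the first signal (or None).
    """
    cusum = 0.0
    path = []
    first_signal = None
    for i, x in enumerate(values):
        # Add the shortfall below (target - k); never let the sum go negative.
        cusum = max(0.0, cusum + (target - k) - x)
        path.append(cusum)
        if first_signal is None and cusum > h:
            first_signal = i
    return path, first_signal


# Hypothetical weekly average drops until screen separation.
drops_to_separation = [52, 49, 51, 50, 48, 53, 44, 43, 42, 41]

path, signal = lower_cusum(drops_to_separation, target=50, k=2, h=10)
print(path)    # accumulated shortfall per week
print(signal)  # index of the first week that exceeds h
```

With these made-up numbers the sum stays at zero while the process is on target and crosses the decision interval in the third week after the shift. In practice k is often set to about half the shift worth detecting and h to a few standard deviations, and a second chart of the same form would track drops until functional failure.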

Upon a signal of an adverse change, we conducted a detailed failure analysis to determine the root cause of the change. When possible we then made adjustments or changes within the design, supply chain, or production process to re-establish prior robustness or to make improvements.

The sampling and testing provided information about changes that may have occurred up to a week (or more) earlier, which placed all production during that time frame under suspicion of higher-than-expected failure rates.

Sampling more often would have reduced the number of units at risk.
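The trade-off can be put in rough numbers. The sketch below assumes a constant production rate and treats every unit built between the last clean sample and the moment of detection as suspect. The production rate and the two-day test-and-analysis delay are hypothetical figures, not values from this example.

```python
def units_at_risk(units_per_day, sampling_interval_days, detection_delay_days=0):
    """Worst-case count of suspect units when a change is detected.

    Assumes a constant production rate. A change could have entered right
    after the previous sample, so the whole sampling interval is suspect,
    plus whatever is built while the test and analysis are running.
    """
    return units_per_day * (sampling_interval_days + detection_delay_days)


# Hypothetical line building 2,000 units per day, 2 days to test and analyze.
weekly = units_at_risk(2000, sampling_interval_days=7, detection_delay_days=2)
daily = units_at_risk(2000, sampling_interval_days=1, detection_delay_days=2)
print(weekly)  # suspect units with weekly sampling
print(daily)   # suspect units with daily sampling
```

Under these assumptions, moving from weekly to daily sampling cuts the worst-case exposure from 18,000 units to 6,000, at the cost of seven times as many test samples.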

Within a few weeks, this ORT detected two changes we had not been told about: the introduction of a manual adhesive application process, and a change in circuit board stiffness after the board manufacturer moved to a new site and slightly changed its lamination process.

Without ORT both changes would have only been detected through reported field problems.

Given the time it takes for such failures to appear in the field, many more units would have shipped with the undesired changes, resulting in many more field failures.

A real-time ORT example

Inkjet printers have different use patterns by different customers.

Some print rarely, maybe a couple of pages a week. Others print full-color pages all day (think realtor flyers for a home for sale).

Different failure mechanisms occur depending, in part, on the use of the printer. The number and complexity of the potential failure mechanisms make evaluating each one with a focused ALT approach impractical.

While some failure mechanisms were expected to be more prominent than others for the various use cases, the desire was to evaluate current production for as many failure mechanisms as possible.

Thus the team decided on a real-time operation of a sample of production units.

The primary concern was failures that occur within the first 2 months of installation. Therefore test scripts were created to replicate use patterns for the 5 different customer profiles defined in the engineering and marketing requirements documents.

Some printers printed two black and white pages per week on average, while others printed hundreds of full-color flyers per day.

A possible sample plan may draw five units per week from production.

The test facility would require sufficient capacity to operate 5 units per week for 8 weeks, and thus would need berths for 40 units. In practice, given the volume and complexity of production, the testing had the capacity to test many more units at a time.

As units finished the 8 weeks they were replaced by units from that week’s production.
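The capacity arithmetic can be checked with a short sketch. It uses the 5 units per week and 8 weeks on test from the possible sample plan above, and assumes every unit occupies its test station for the full 8 weeks (a failed unit is not replaced early). The function names are made up for this illustration.

```python
def berths_needed(units_per_week, weeks_on_test):
    """Steady-state number of test stations when each unit runs the full test."""
    return units_per_week * weeks_on_test


def occupancy_by_week(units_per_week, weeks_on_test, total_weeks):
    """Stations in use at the end of each week as the test ramps up."""
    occupancy = []
    for week in range(1, total_weeks + 1):
        # Units started within the last `weeks_on_test` weeks are still running.
        occupancy.append(units_per_week * min(week, weeks_on_test))
    return occupancy


print(berths_needed(5, 8))          # 40
print(occupancy_by_week(5, 8, 10))  # [5, 10, 15, 20, 25, 30, 35, 40, 40, 40]
```

The facility fills over the first 8 weeks and then holds steady at 40 units, with each finishing cohort replaced by that week's production.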

Failed units were examined for the root cause of the failure and remedial action taken to minimize field failures.

The testing caught a change to the pick roller: production had gone from assembling the tacky rubber roller onto a shaft to molding the part directly onto the shaft.

The change included a mold release agent which eventually would bloom to the surface of the roller and prevent the pick roller from picking up one sheet of paper to start the paper movement from the paper tray into the printer.

This failure mechanism would impact a large proportion of all printers.

Heavy users might not experience the issue, as the bloomed material would rub off with regular printing, while lighter use would allow the release agent to accumulate and cause the failure.

In this case, the ORT finding provided just a few weeks of lead time to respond to the field failures. The team found a solution for fielded units and alerted call centers and customers to a maintenance plan of action to remedy the issue.

The team also changed the design to eliminate the mold release agent behind the issue.

Test selection

ORT is not a specific type of reliability test.

It has to focus the testing on the risk of failure. This may be a specific failure mechanism, or a type of use stress, or an exploratory test looking for anomalies.

Nearly any type of reliability test could be used for ORT.

What makes ORT unique is that it focuses on recent production units and attempts to detect changes in those units that affect field reliability.

Sampling consideration and risk management

The testing is destructive, or at least ages the units enough that they are unfit for sale.

This typically reduces the number of samples allocated for ORT.

The balance of sample size and risk of undetected failures is further complicated by the ongoing nature of the testing. Pulling samples more often lessens the risk of undetected changes impacting a significant amount of production.

Sampling less often means more units may carry the adverse changes that lead to premature failures.

The risk of increased failures versus the cost of sampling and testing requires careful consideration and planning.

Any sampling plan has an inherent limit on its ability to detect changes in a population: the lower the failure rate that should signal a change, the more samples are required.

The detection threshold could be set at the point where the team would take action to resolve the problem, or at the size of problem that would warrant a recall or a production stop.
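To get a feel for how quickly the sample count grows, here is a minimal sketch based on a simple binomial assumption: each sampled unit independently either fails or survives the ORT, and a change is "detected" if at least one sampled unit fails. This is one common way to size a pass/fail sample, not necessarily the method used in the examples above, and the failure fractions and confidence level are hypothetical.

```python
import math


def samples_to_detect(failure_fraction, confidence):
    """Smallest n such that P(at least one failure in n units) >= confidence.

    Assumes independent units, each failing the test with probability
    `failure_fraction` once the adverse change is present:
        1 - (1 - p)**n >= C   =>   n >= ln(1 - C) / ln(1 - p)
    """
    if not (0 < failure_fraction < 1 and 0 < confidence < 1):
        raise ValueError("failure_fraction and confidence must be in (0, 1)")
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_fraction))


# Hypothetical thresholds: a severe, recall-sized problem versus a subtler shift.
print(samples_to_detect(0.10, 0.90))  # 22
print(samples_to_detect(0.01, 0.90))  # 230
```

Under these assumptions, catching a change that makes 10% of units fail the test takes about 22 samples for 90% confidence, while catching a 1% change takes about 230. Setting the threshold at the recall or stop-production level therefore needs far fewer destroyed units than setting it at the level of any actionable change.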

Again, this is a discussion to have with your team to balance risk and cost in order to craft the right ORT plan.

Summary

ORT is a useful means to reduce the risk that unwanted changes lead to unacceptable field failure rates.

Completing the design and starting production does not mean that variability stops changing.

Changes in your supplied parts, in your production process, and even due to engineering modifications to the design, all have a risk of causing significant field failures.

ORT is one way to detect the changes before your customer alerts you to the issue.

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: Ongoing Reliability Testing (ORT)

About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


© 2025 FMS Reliability