Accendo Reliability

Your Reliability Engineering Professional Development Site

by Dennis Craggs

Is Warranty Big Data?

 Vehicle Warranty Big Data

Automobile companies pay dealers to perform vehicle prep and make warranty repairs on customer vehicles. So much data is collected in the process that warranty is considered big data.

Generally, vehicle warranty covers 3 years or 36,000 miles. When a vehicle is serviced, the customer, vehicle, repair, and text data are collected. Claims are entered into a transaction database, may be rejected for a number of reasons, and are then resubmitted until resolved. The transaction database covers many model years, millions of vehicles, and a number of warranty claims for each vehicle. As a rough estimate, 10 years of data at 2 million vehicles/year and 2 claims per vehicle yields 40 million records. The actual number is higher because of the submission, rejection, and resubmission cycle.

Primary Warranty Data Usage

Routine warranty studies are used to identify high-cost and high-failure-rate components, systems, and software needing improvement. These investigations are repeated for the total production and many subsets including individual vehicle lines, assembly plants, … Many departments are devoted to maintaining the database, identifying problems, extrapolating trends, and looking for improvements.

Customer Usage Metrics

While the focus is on identifying problems, warranty data may also be used to refine high-level vehicle usage metrics. A top-level metric is the rate of mileage accumulation of the fleet or of particular types of vehicles. This metric is used to define validation targets that verify components, systems, and vehicles meet the business objectives of low cost and low failure rates.

For the rate calculations, the mileage, the service date, and the sold date are required. The mileages and service dates come from the claims database. The sold dates come from a sales database. From the date information, the number of days in service is calculated as the difference between the service date and the sold date. For this example, the rate is expressed in miles per month and each month is assumed to be 30 days. For each claim, the rate per month is calculated using formula 1.

$$ \large\displaystyle R=\frac{30*M}{D} $$

(1)

Where R is the calculated mileage accumulation rate, M is the vehicle mileage at warranty service, and D is the number of calendar days the vehicle has been in service. A unique value is calculated for each vehicle.
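Formula 1 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `mileage_rate` and its parameter names are hypothetical, and the example vehicle is made up to give round numbers.

```python
from datetime import date

DAYS_PER_MONTH = 30  # the article assumes a 30-day month


def mileage_rate(odometer_miles, sold_date, service_date):
    """Return R = 30 * M / D in miles per month (formula 1)."""
    days_in_service = (service_date - sold_date).days
    if days_in_service <= 0:
        raise ValueError("service date must be after the sold date")
    return DAYS_PER_MONTH * odometer_miles / days_in_service


# Hypothetical claim: 12,000 miles after 360 days in service.
example_rate = mileage_rate(12_000, date(2020, 1, 1), date(2020, 12, 26))
print(example_rate)  # 1000.0 miles/month
```

Raising an error for a non-positive number of days is one design choice; in practice such records might instead be routed to the data-cleaning step.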

Errors and Filtering

Most of the time, the rate follows a steady trend, but some anomalies occur. For instance, at one service visit the odometer was expected to read about 30,000 miles. Instead, it was recorded as 230,000 miles. At the next warranty service, the odometer returned to the trend line. This was obviously a data entry error. Other types of errors are possible but difficult to detect, and they contribute to variation around the odometer vs. date trend line.

Abnormal vehicle usage is another source of error. For example, a new vehicle may be used more frequently than an old vehicle. Why? Perhaps the owners are proud of their new vehicle. Alternatively, the operators may prefer to use the inherently more reliable new vehicle. For whatever reason, the miles/month of a new vehicle is higher than the desired lifetime value.

Filtering the raw data can remove some of these errors. Limiting the claims to the warranty coverage removes the obvious data entry errors. For immature vehicles, it was reasonable to ignore claims that had fewer than 30 days in service and less than 300 miles of total usage.
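The filtering rules can be sketched as a simple predicate. The thresholds come from the text (3 years or 36,000 miles of coverage; fewer than 30 days and less than 300 miles for immature vehicles). The `Claim` record, its field names, and the conversion of 3 years to 1,095 days are assumptions made for illustration, and the sample claims are invented.

```python
from typing import NamedTuple


class Claim(NamedTuple):
    """Hypothetical claim record; field names are illustrative only."""
    vin: str
    days_in_service: int
    miles: float


WARRANTY_DAYS = 3 * 365    # assumption: 3 years expressed as 1,095 days
WARRANTY_MILES = 36_000
MIN_DAYS = 30
MIN_MILES = 300


def keep_claim(claim):
    """Return True if the claim is inside coverage and not immature."""
    in_coverage = (0 < claim.days_in_service <= WARRANTY_DAYS
                   and 0 < claim.miles <= WARRANTY_MILES)
    # The text joins the two immaturity conditions with "and".
    immature = claim.days_in_service < MIN_DAYS and claim.miles < MIN_MILES
    return in_coverage and not immature


claims = [
    Claim("A", 400, 12_000),    # kept
    Claim("B", 10, 150),        # dropped: immature vehicle
    Claim("C", 500, 230_000),   # dropped: odometer outside coverage
    Claim("D", 1_200, 30_000),  # dropped: beyond 3 years
]
kept = [c.vin for c in claims if keep_claim(c)]
print(kept)  # ['A']
```

A coverage filter like this catches the 230,000-mile entry from the earlier example, but it cannot catch a mistyped odometer reading that still falls below 36,000 miles.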

Analysis Alternatives

After filtering, the highest mileage and the latest service date for each vehicle were used. The latest service date gives the maximum days in service, and dividing the highest mileage by that value provides an average mileage accumulation rate for each vehicle.

Some software logic would be required to handle cases where there were only 1, 2, … claims. A single warranty claim for a vehicle does not provide much information. Two warranty claims will provide the slope of the regression line. Three or more provide the opportunity to look for deviations from the trend.
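The article does not spell out this logic, so the following is only one possible sketch. Each vehicle is represented by a list of hypothetical `(days_in_service, miles)` pairs; `average_rate` uses the highest mileage and the maximum days in service, while `trend_rate` returns the least-squares slope when at least two claims are available.

```python
DAYS_PER_MONTH = 30


def average_rate(points):
    """Miles/month from the highest mileage and the maximum days in service."""
    max_days = max(days for days, _ in points)
    max_miles = max(miles for _, miles in points)
    return DAYS_PER_MONTH * max_miles / max_days


def trend_rate(points):
    """Least-squares slope in miles/month, or None with fewer than 2 claims."""
    n = len(points)
    if n < 2:
        return None
    mean_days = sum(d for d, _ in points) / n
    mean_miles = sum(m for _, m in points) / n
    sxx = sum((d - mean_days) ** 2 for d, _ in points)
    if sxx == 0:  # all claims on the same day: slope is undefined
        return None
    sxy = sum((d - mean_days) * (m - mean_miles) for d, m in points)
    return DAYS_PER_MONTH * sxy / sxx


def residuals(points):
    """Deviations from the fitted trend line; informative with 3+ claims."""
    slope_per_month = trend_rate(points)
    if slope_per_month is None:
        return []
    slope = slope_per_month / DAYS_PER_MONTH
    n = len(points)
    mean_days = sum(d for d, _ in points) / n
    mean_miles = sum(m for _, m in points) / n
    return [m - (mean_miles + slope * (d - mean_days)) for d, m in points]


# Invented vehicle with three claims on a steady 30 miles/day trend.
vehicle = [(100, 3_000), (200, 6_000), (300, 9_000)]
print(average_rate(vehicle), trend_rate(vehicle))  # 900.0 900.0
```

With three or more claims, a large residual at one claim would flag the kind of odometer entry error described earlier.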

Results

If one calculates the average and the standard deviation and then applies the normal distribution, erroneous projections and extrapolations result. Consider a vehicle type that had an average of 1,297 miles/month and a standard deviation of 617 miles/month.

Some studies show the mileage accumulation rate follows a lognormal distribution. My studies showed that the lognormal fit the vehicle data with about a 98% correlation, while the normal fit the data with about an 89% correlation. The Weibull distribution was considered but showed a slightly lower correlation of about 96%.

Assuming the lognormal is the best fit, the analysis results are very different from those of a normal distribution. A comparison of the normal and the lognormal probability density is shown in figure 1.
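The article does not say how the correlation values were obtained. One common approach, shown here purely as an assumption, is the correlation coefficient of a normal probability plot: sorted data (or its logarithm, for the lognormal) is compared against standard normal quantiles. The sample below is synthetic, generated with an illustrative median of 1,171 miles/month and an assumed log-scale standard deviation of 0.45; it is not the author's warranty data.

```python
import math
import random
from statistics import NormalDist


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def probability_plot_correlation(values, transform=lambda v: v):
    """Correlate sorted (transformed) data with standard normal quantiles."""
    ordered = sorted(transform(v) for v in values)
    n = len(ordered)
    std_normal = NormalDist()
    # Median-style plotting positions (i + 0.5) / n, an assumed convention.
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return pearson(quantiles, ordered)


# Synthetic rates only; parameters are illustrative assumptions.
rng = random.Random(0)
sample = [rng.lognormvariate(math.log(1171), 0.45) for _ in range(2000)]

r_normal = probability_plot_correlation(sample)
r_lognormal = probability_plot_correlation(sample, math.log)
print(f"normal: {r_normal:.3f}, lognormal: {r_lognormal:.3f}")
```

On right-skewed data such as this, the lognormal plot is expected to be straighter, and therefore to show a higher correlation, than the normal plot.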

Figure 1: Probability density functions (PDF) of the normal and lognormal distributions

Consider the normal distribution plot. It predicts vehicles with negative mileage accumulation rates. Below 500 miles/month, the normal predicts too many vehicles. Above 2,600 miles/month, the normal predicts too few vehicles.

The lognormal distribution provides more realistic results. It is skewed to the right to include vehicles with very high usage and avoids negative predictions.

The cumulative probability distribution shows a similar story.

Figure 2: Cumulative distribution functions (CDF) of the normal and lognormal distributions

Using the normal distribution, 1.8% of the vehicle population was projected to have negative mileage accumulation rates. This is physically impossible.
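The 1.8% figure can be checked from the mean and standard deviation quoted earlier (1,297 and 617 miles/month). A minimal check with Python's standard library, where the variable names are illustrative:

```python
from statistics import NormalDist

# Normal model with the mean and standard deviation quoted in the article.
rate_model = NormalDist(mu=1297, sigma=617)

# Fraction of the population predicted to have a rate below zero.
negative_fraction = rate_model.cdf(0)
print(f"{negative_fraction:.1%}")  # 1.8%
```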

Let’s compare the population percentile predictions using the lognormal and the normal distributions.

| Population Cumulative Percentile | Normal (miles/month) | Lognormal (miles/month) |
|---|---|---|
| 1% | -138 | 411 |
| 5% | 282 | 559 |
| 50% | 1,297 | 1,171 |
| 95% | 2,312 | 2,456 |
| 99% | 2,732 | 3,337 |

Table 1
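The normal column of Table 1 follows directly from the quoted mean and standard deviation. The article does not state the lognormal parameters, so the sketch below back-calculates them from the table's own 50th and 99th percentiles. That is an assumption made for illustration, and it reproduces the remaining lognormal entries only to within a few miles/month.

```python
from math import exp, log
from statistics import NormalDist

MEAN, SD = 1297.0, 617.0           # quoted in the article
LOGNORMAL_MEDIAN = 1171.0          # 50th percentile from Table 1
LOGNORMAL_P99 = 3337.0             # 99th percentile from Table 1

std_normal = NormalDist()
normal_model = NormalDist(MEAN, SD)

# Assumed lognormal parameters, inferred from the two table entries above.
mu = log(LOGNORMAL_MEDIAN)
sigma = log(LOGNORMAL_P99 / LOGNORMAL_MEDIAN) / std_normal.inv_cdf(0.99)


def normal_percentile(p):
    """Rate (miles/month) at cumulative probability p under the normal."""
    return normal_model.inv_cdf(p)


def lognormal_percentile(p):
    """Rate (miles/month) at cumulative probability p under the lognormal."""
    return exp(mu + sigma * std_normal.inv_cdf(p))


for p in (0.01, 0.05, 0.50, 0.95, 0.99):
    print(f"{p:>4.0%}  normal {normal_percentile(p):7.0f}"
          f"  lognormal {lognormal_percentile(p):7.0f}")
```

Because the lognormal percentile is an exponential, it can never be negative, which is the property that makes it the more realistic model for usage rates.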

Because the data is not normal, the best metric of centrality is the 50th percentile or 1,171 miles/month, not 1,297 miles/month. If the company targets a 99th percentile usage, then the target should be 3,337 miles/month, not 2,732 miles/month.

Conclusions

Warranty is Big Data that can be processed to provide detailed information on component, system, and vehicle failure rates and costs. Statistical views of top-level metrics focused on market segments can also be developed.

Frequently, an analyst will default to the normal distribution. A valid statistical analysis requires using the distribution that best fits the data. In the case of vehicle mileage accumulation rates, the lognormal should be used.

About Dennis Craggs

I am a mechanical engineer, programmer, and statistician. My career spanned the aerospace (NASA and Teledyne CAE) and automotive (Ford and Chrysler) industries. After retirement, I started consulting as a reliability engineer and am writing articles on big data analytics. My primary goal is to assist young engineers and consult on product development issues.

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy