Accendo Reliability

Your Reliability Engineering Professional Development Site


by Fred Schenkelberg 6 Comments

The Wald Wolfowitz Run Test for Two Small Samples

This nonparametric test evaluates if two continuous cumulative distributions are significantly different or not.

For example, suppose we assume that two production lines making the same product yield the same resulting dimensions. Comparing a set of samples from each line may reveal whether the data support that hypothesis.

The concept looks at an ascending ordered list of the data from the two distributions (in our case, lines), keeping with each data point an identifier for the line it came from.

Then we count the runs. A run is a stretch of consecutive values from the same line, so the number of runs is one more than the number of times the ordered data changes from one line to the other.

If there are too few runs, the two samples most likely come from different distributions.

Small samples here means the number of samples from each source is 10 or less. With more samples, we can use a normal distribution approximation of the expected count of runs and will discuss that in a separate article.

 

Example: Determine the number of runs

Let’s say we have two sets of 8 samples from the two production lines and we measure output voltage at a key test point of the individual products (of course you should measure something of importance to the final product quality or reliability).

We have line A and line B with the following data:

A: 17.65 12.95 20.20 25.00 15.50 12.75 27.05 25.20
B: 27.45 25.10 17.95 15.70 27.25 25.30 10.30 10.90

Let’s order the values and tag each value with the line letter so we can keep track of which value is from which line.

10.30 B 10.90 B 12.75 A 12.95 A 15.50 A 15.70 B 17.65 A 17.95 B
20.20 A 25.00 A 25.10 B 25.20 A 25.30 B 27.05 A 27.25 B 27.45 B

The runs are identified by which line the data came from. In this example the two lowest values are both from line B. Those two values form one run.

Then there are three values from line A: 12.75, 12.95, and 15.50. This is another run. And so on, for a total of 11 runs.
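The ordering and counting steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original procedure: the function name `count_runs` is made up for this example, and it assumes no value appears in both samples (no ties across lines).

```python
# Voltage readings from the example above
line_a = [17.65, 12.95, 20.20, 25.00, 15.50, 12.75, 27.05, 25.20]
line_b = [27.45, 25.10, 17.95, 15.70, 27.25, 25.30, 10.30, 10.90]

def count_runs(sample_a, sample_b):
    """Count runs of same-source values in the pooled, sorted data.

    Hypothetical helper; assumes no value is shared between the samples.
    """
    # Tag every value with its source line, then sort by value.
    tagged = [(x, "A") for x in sample_a] + [(x, "B") for x in sample_b]
    tagged.sort(key=lambda pair: pair[0])
    labels = [label for _, label in tagged]
    # The number of runs is one more than the number of label changes.
    changes = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return changes + 1, "".join(labels)

runs, sequence = count_runs(line_a, line_b)
print(sequence)  # BBAAABABAABABABB
print(runs)      # 11
```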

How many runs is too few?

Now, let’s say the two lines really were very different and produced results that were dramatically different.

We may have all of the line A values centered tightly around 20 and all of the line B values centered around 50.

Ordering the values would create the 8 values of line A followed by the line B values creating two runs.

If the two lines were a little closer such that they just overlapped with two values we may have 4 runs A, B, A, B.

Yet, if that was 4 A’s, followed by 4 B’s, then 4 A’s and the remaining 4 B’s, that is pretty close to being an overlap of the two distributions.

Or is it?

The Wald Wolfowitz approach is to calculate the probability of each possible number of runs using a counting argument built on binomial coefficients. We then tally those probabilities, starting from the fewest runs, until we reach a reasonable critical value that defines the threshold for making a decision.

The Wald Wolfowitz 2 (small) Sample Run Test

The null hypothesis is the two samples are from the same distribution.

$$ \displaystyle\large {{H}_{o}}:F\left( x \right)=G\left( x \right)$$

The alternative hypothesis is the two samples are not from the same distribution.

$$ \displaystyle\large {{H}_{a}}:F\left( x \right)\ne G\left( x \right)$$

The test statistic takes some work to determine. We need to estimate the probability of 2 runs, then 3, or 4, or 5, etc. number of runs. We can do this till we have the number of observed runs, or reach the critical value of interest.

First the test statistic is calculated by summing the probabilities of observing the count of possible runs. For an even number of runs use:

$$ \displaystyle\large P\left( R=2k \right)=\frac{2\left( \begin{array}{l}{{n}_{1}}-1\\k-1\end{array} \right)\left( \begin{array}{l}{{n}_{2}}-1\\k-1\end{array} \right)}{\left( \begin{array}{c}{{n}_{1}}+{{n}_{2}}\\{{n}_{1}}\end{array} \right)}$$

Here R is the number of runs, which in this case is even and equal to 2k, where k is a positive integer, and n1 and n2 are the number of samples from the two sources.

For an odd number of runs use:

$$ \displaystyle\large P\left( R=2k+1 \right)=\frac{\left( \begin{array}{c}{{n}_{1}}-1\\k\end{array} \right)\left( \begin{array}{c}{{n}_{2}}-1\\k-1\end{array} \right)+\left( \begin{array}{c}{{n}_{2}}-1\\k\end{array} \right)\left( \begin{array}{c}{{n}_{1}}-1\\k-1\end{array} \right)}{\left( \begin{array}{c}{{n}_{1}}+{{n}_{2}}\\{{n}_{1}}\end{array} \right)}$$

That is a lot of calculating when there is a large number of samples, thus we’ll use a normal approximation for samples larger than 10 from each source. Yet, here we have only 8 samples from each source, thus we need to calculate the probabilities.
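For readers who would rather not do this by hand, the even and odd formulas above can be sketched in Python. The function name `runs_pmf` is an assumption made for this illustration, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def runs_pmf(r, n1, n2):
    """P(R = r) under the null hypothesis for sample sizes n1 and n2.

    Hypothetical helper implementing the even/odd formulas above.
    """
    if r < 2 or r > n1 + n2:
        return 0.0
    k = r // 2  # r = 2k for even r, r = 2k + 1 for odd r
    if r % 2 == 0:
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n2 - 1, k) * comb(n1 - 1, k - 1))
    return ways / comb(n1 + n2, n1)

print(f"{runs_pmf(2, 8, 8):.5f}")  # 0.00016
print(f"{runs_pmf(3, 8, 8):.5f}")  # 0.00109
```

As a sanity check, the probabilities over every possible run count (2 through n1 + n2) should add up to 1.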

In this case with n1 and  n2 equal to 8 the calculation for the probability of just two runs, R = 2 and therefore k = 1, is:

$$ \displaystyle\large P\left( R=2 \right)=\frac{2\left( \begin{array}{c}8-1\\1-1\end{array} \right)\left( \begin{array}{c}8-1\\1-1\end{array} \right)}{\left( \begin{array}{c}8+8\\8\end{array} \right)}=\frac{2}{12,870}=.00016$$

And the calculation for R = 3 and therefore k=1 again is

$$ \displaystyle\large P\left( R=3 \right)=\frac{\left( \begin{array}{c}8-1\\1\end{array} \right)\left( \begin{array}{c}8-1\\1-1\end{array} \right)+\left( \begin{array}{c}8-1\\1\end{array} \right)\left( \begin{array}{c}8-1\\1-1\end{array} \right)}{\left( \begin{array}{c}8+8\\8\end{array} \right)}=\frac{14}{12,870}=.00109$$

For R = 4, k = 2, P(R = 4) = 0.00761

For R = 5, k = 2, P(R = 5) = 0.02284

And, for R = 6, k = 3, P(R = 6) = 0.06853.
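
Written out the same way as the first two cases, and using the values 7 choose 1 = 7 and 7 choose 2 = 21, the arithmetic behind these three results is:

$$ \displaystyle\large P\left( R=4 \right)=\frac{2\left( \begin{array}{c}7\\1\end{array} \right)\left( \begin{array}{c}7\\1\end{array} \right)}{\left( \begin{array}{c}16\\8\end{array} \right)}=\frac{2\times 7\times 7}{12,870}=\frac{98}{12,870}=.00761$$

$$ \displaystyle\large P\left( R=5 \right)=\frac{\left( \begin{array}{c}7\\2\end{array} \right)\left( \begin{array}{c}7\\1\end{array} \right)+\left( \begin{array}{c}7\\2\end{array} \right)\left( \begin{array}{c}7\\1\end{array} \right)}{\left( \begin{array}{c}16\\8\end{array} \right)}=\frac{147+147}{12,870}=\frac{294}{12,870}=.02284$$

$$ \displaystyle\large P\left( R=6 \right)=\frac{2\left( \begin{array}{c}7\\2\end{array} \right)\left( \begin{array}{c}7\\2\end{array} \right)}{\left( \begin{array}{c}16\\8\end{array} \right)}=\frac{2\times 21\times 21}{12,870}=\frac{882}{12,870}=.06853$$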

Let’s tally these up and see where we are for cumulative probabilities of being equal to or less than a number of runs.

$$ \displaystyle\large P\left( R\le 3 \right)=0.00016+0.00109=.00125$$

And,

$$ \displaystyle\large P\left( R\le 4 \right)=0.00016+0.00109+0.00761=.00886$$

And,

$$ \displaystyle\large P\left( R\le 5 \right)=0.00016+0.00109+0.00761+0.02284=.03170$$

And,

$$ \displaystyle\large P\left( R\le 6 \right)=0.00016+0.00109+0.00761+0.02284+0.06853=.10023$$
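
The running tally can also be checked with a short Python sketch. The helper names `runs_pmf` and `runs_cdf` are assumptions made for this illustration; the probabilities come from the even and odd formulas given earlier.

```python
from math import comb

def runs_pmf(r, n1, n2):
    """P(R = r) under the null hypothesis (even and odd cases)."""
    k = r // 2
    if r % 2 == 0:
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n2 - 1, k) * comb(n1 - 1, k - 1))
    return ways / comb(n1 + n2, n1)

def runs_cdf(r, n1, n2):
    """P(R <= r): sum the probabilities from 2 runs up to r runs."""
    return sum(runs_pmf(i, n1, n2) for i in range(2, r + 1))

for r in range(3, 7):
    print(f"P(R <= {r}) = {runs_cdf(r, 8, 8):.5f}")
# P(R <= 3) = 0.00124
# P(R <= 4) = 0.00886
# P(R <= 5) = 0.03170
# P(R <= 6) = 0.10023
```

The first value differs from the hand tally in the last digit only because the hand tally adds terms that were already rounded.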

We can now select our critical value, the threshold for the probability, under the null hypothesis, of seeing the observed run count or fewer.

For example, if we are willing to take a relatively small risk, say 5% (95% confidence), of concluding the two distributions are different when they are actually the same, we select 0.05 as the critical value.

If the count of runs is 5 or less the test statistic is 0.0317 given 8 samples from each source, and the statistic is 0.1002 for R = 6 or less.

Thus if we actually have 5 or fewer runs we have 95% confidence that the two sources, in this case the two production lines, are different.

In this case, we have 11 runs, which is greater than the 5 or fewer associated with the critical value, thus we cannot conclude there is sufficient evidence the two lines are different.
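
The whole decision can be sketched as a small Python function. The names `runs_cdf` and `runs_test` are hypothetical, and the sketch assumes the one-sided "too few runs" rule described above, with the run count already determined.

```python
from math import comb

def runs_cdf(r, n1, n2):
    """P(R <= r) under the null hypothesis, summing both formula cases."""
    total = comb(n1 + n2, n1)
    ways = 0
    for i in range(2, min(r, n1 + n2) + 1):
        k = i // 2
        if i % 2 == 0:
            ways += 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:
            ways += (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                     + comb(n2 - 1, k) * comb(n1 - 1, k - 1))
    return ways / total

def runs_test(observed_runs, n1, n2, alpha=0.05):
    """Return (p_value, reject) for the one-sided 'too few runs' test."""
    p_value = runs_cdf(observed_runs, n1, n2)
    return p_value, p_value <= alpha

p_value, reject = runs_test(11, 8, 8)
print(f"p = {p_value:.4f}, reject H0: {reject}")
# p = 0.8998, reject H0: False
```

With 11 observed runs, the probability of seeing that many or fewer is far above 0.05, which matches the conclusion that there is not sufficient evidence the lines differ.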

Tables to make this quicker

This approach requires quite a bit of calculation to determine the test statistic.

Yet the values are independent of the actual values measured, as we use the count of runs.

Thus, we can calculate a table for various numbers of samples and specific confidence levels or risk thresholds.
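
As a rough sketch of how such a table entry could be computed, the Python below returns the largest run count whose cumulative probability stays at or below the chosen risk. The function name `critical_runs` and the strict "at or below alpha" rule are assumptions made for this illustration. Published tables may follow slightly different conventions, so an entry can differ by one run from this strict rule.

```python
from math import comb

def critical_runs(n1, n2, alpha=0.05):
    """Largest r with P(R <= r) <= alpha, or None if no such r exists."""
    total = comb(n1 + n2, n1)
    ways = 0          # arrangements with r runs or fewer
    critical = None
    for r in range(2, n1 + n2 + 1):
        k = r // 2
        if r % 2 == 0:
            ways += 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:
            ways += (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                     + comb(n2 - 1, k) * comb(n1 - 1, k - 1))
        if ways / total > alpha:
            break     # cumulative probability now exceeds the risk level
        critical = r
    return critical

print(critical_runs(8, 8))   # 5
print(critical_runs(10, 8))  # 6
print(critical_runs(4, 3))   # None
```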

Of course, this has been done already and here is one example with a critical value of 0.05 (a 95% one-sided confidence)

n1   n2   Critical R
10   10   7
10    9   6
10    8   6
10    7   6
10    6   5
10    5   4
10    4   4
10    3   3
10    2   none

Thus if we have 10 samples from one source and 8 from another, and the number of runs is 6 or less, then we reject the null hypothesis that the two sources create the same results (in other words, we conclude they are not the same).

Note there is no critical R value listed for 10 and 2 samples, as the table treats that case as one where no count of runs allows a conclusion about whether the two sources differ with 95% one-sided confidence.

n1   n2   Critical R
 9    9   6
 9    8   6
 9    7   5
 9    6   5
 9    5   4
 9    4   4
 9    3   3
 9    2   none

n1   n2   Critical R
 8    8   5
 8    7   5
 8    6   4
 8    5   4
 8    4   4
 8    3   3
 8    2   none

n1   n2   Critical R
 7    7   4
 7    6   4
 7    5   4
 7    4   3
 7    3   3
 7    2   none

n1   n2   Critical R
 6    6   4
 6    5   4
 6    4   3
 6    3   3
 6    2   none

n1   n2   Critical R
 5    5   3
 5    4   3
 5    3   none
 5    2   none

With fewer than 8 total samples we are not able to make a determination using this method.



About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


Comments

  1. Vivek Namboodiripad says

    July 20, 2015 at 8:32 PM

    Hi Fred, Thanks for sharing. I am totally new to this test. Is this related to the Run test for randomness which is also mentioned in a later post? I find the approach same. I am yet to go through it fully.

    • Fred Schenkelberg says

      July 20, 2015 at 8:59 PM

      Yes Vivek, it is very similar to the Run test described in another post. The way the data is arranged to find the runs is a little different, yet the same concept is at the heart of it. cheers, Fred

  2. pradnya says

    April 6, 2017 at 3:35 AM

    this eg u gave contains different observations in both samples.
    please explain if there are same observations in both the samples repeated then how to calculate total no of runs?

    • Fred Schenkelberg says

      April 6, 2017 at 2:56 PM

      Hi Pradnya, the two values would be listed next to each other in the ordering. So one from group A and the next with the same value from group B; of course you could list B then A. The run is based on which group the reading is from, so the count may change if the listing is reversed. The process is not exact, yet it will get you within one run count.

      For example, let’s say the lowest two readings are both 10.1, one from each group, and the next value in the sort is 10.5 from group B. So one way to order these is A, B, B, resulting in two runs. If we reverse the listing of the first two, like this: B, A, B, then the run count is three. Like I said, not exact.

      I would count the runs with both orderings of the groups, and if it doesn’t change the result, it is not a big deal. If the change of one run in the count changes the result, then select the more conservative result for your given situation.

      Cheers,

      Fred

  3. Ajeet kumar says

    April 20, 2017 at 6:50 PM

    why we take critical value in this test one sided.

    • Fred Schenkelberg says

      April 21, 2017 at 6:55 AM

      Hi Ajeet, doing a quick review of the test procedure I’m not sure it could be set up for a one-sided test. The non-parametric method is really only looking at differences and able to determine if two groups are different enough to reveal the difference in the count of reversals. If anyone has better information that would be great. cheers, Fred


© 2025 FMS Reliability