Accendo Reliability

by Fred Schenkelberg

Siegel-Tukey Test for Differences in Scale


There are a few different reasons we explore differences in scale.

Keep in mind that the scale of a dataset is basically the spread of the data. For most datasets, we’re examining the variance.

Hypothesis tests comparing means vary depending on the assumption of equal variances. Thus testing that assumption requires methods to adequately test the homogeneity of variances. The F-test should come to mind as it is a common approach.

Some datasets do not lend themselves to the F-test, which applies to real-valued data. When the data are ordinal or interval, we need another approach to test for differences in scale.

A Non-Parametric Difference in Scale Test

The Siegel-Tukey test is a non-parametric test for comparing data that is at least ordinal. It also provides a non-parametric method for interval, integer, and real data.

There are 4 steps:

  1. Define the null and alternate hypotheses
  2. Combine the data and arrange from smallest to largest
  3. Assign rank order alternating smallest to largest then largest to smallest
  4. Apply the Wilcoxon rank sum test

The key here is how ranks are assigned to the data. If the spread of the two groups is the same, then both groups will have values at the extremes of the combined set in equal proportions.

If not, then one group will end up with a smaller set of rank values than the other group (the group with the wider spread collects the low ranks assigned to the extremes), thus allowing the Wilcoxon rank sum test to detect the difference.

Step 1 Define the hypothesis test

Let’s examine some temperature data gathered over a week of operation.

There are two lines in our factory, line A and line B. The chamber temperature at one stage of the process is critical to the reliability of the final product.

We suspect line A has less variation in temperature than line B.

The null hypothesis is that the variation in temperature is the same for the two lines.

$$ \displaystyle\large {{H}_{0}}:\sigma _{A}^{2}=\sigma _{B}^{2}$$

The alternative hypothesis is that line A has a smaller spread of temperatures than line B.

$$ \displaystyle\large {{H}_{1}}:\sigma _{A}^{2}<\sigma _{B}^{2}$$

Underlying the test, we are also assuming the two groups have equivalent medians.

Step 2 Combine and Sort the Data

The n temperature readings in Fahrenheit for line A are:
44, 71, 95, 69, 67, 82, 86

The m temperature readings for line B are:
24, 27, 79, 62, 75, 87

The median of line A is 71 and for line B it is 68.5.
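The equal-medians assumption can be checked quickly. This is a minimal sketch using Python's standard library; the variable names `line_a` and `line_b` are illustrative and not from the original analysis.

```python
from statistics import median

# Temperature readings (Fahrenheit) as listed above
line_a = [44, 71, 95, 69, 67, 82, 86]  # n = 7 readings, line A
line_b = [24, 27, 79, 62, 75, 87]      # m = 6 readings, line B

median_a = median(line_a)
median_b = median(line_b)
print(median_a, median_b)  # 71 68.5
```

The two medians are close relative to the spread of the readings, which is the condition the test relies on.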


The combined 13 readings in smallest-to-largest order are:
24, 27, 44, 62, 67, 69, 71, 75, 79, 82, 86, 87, 95

Maintain which reading is from which line. The ordered readings are from lines A or B as follows (here a table may be helpful):

Line B B A B A A A B B A A B A
Temp 24 27 44 62 67 69 71 75 79 82 86 87 95
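One way to combine the readings while keeping track of the source line is to tag each value before sorting. The sketch below assumes there are no tied readings (true for this dataset); the names `combined`, `temps`, and `labels` are illustrative.

```python
line_a = [44, 71, 95, 69, 67, 82, 86]
line_b = [24, 27, 79, 62, 75, 87]

# Tag each reading with its line, then sort on temperature.
combined = sorted([(t, "A") for t in line_a] + [(t, "B") for t in line_b])

temps = [t for t, _ in combined]
labels = [label for _, label in combined]

print("Line", *labels)
print("Temp", *temps)
```

The printed rows should match the Line and Temp rows of the table above.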

Step 3 Assign Rank Using Alternating Method

Next assign the rank of 1 to the smallest value, in this case, 24. Then assign the ranks of 2 and 3 to the two largest values, in this case, 95 and 87, respectively.

Line  B   B   A   B   A   A   A   B   B   A   A   B   A
Temp  24  27  44  62  67  69  71  75  79  82  86  87  95
Rank  1   -   -   -   -   -   -   -   -   -   -   3   2

Continue assigning ranks alternating the next two smallest, then next two largest till all values have a rank assigned.

Line B B A B A A A B B A A B A
Temp 24 27 44 62 67 69 71 75 79 82 86 87 95
Rank 1 4 5 8 9 12 13 11 10 7 6 3 2
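The alternating assignment can be sketched as a short function. This is an illustrative helper (the name `siegel_tukey_ranks` is hypothetical) that assumes the values are already sorted and contain no ties; tied readings would need averaged ranks, which this sketch does not handle.

```python
def siegel_tukey_ranks(n):
    """Return Siegel-Tukey ranks for n sorted, untied values.

    Rank 1 goes to the smallest value, ranks 2 and 3 to the two
    largest, ranks 4 and 5 to the next two smallest, and so on.
    """
    ranks = [0] * n
    lo, hi = 0, n - 1
    rank = 1
    take_low = True  # start at the low end
    count = 1        # the first step takes one value, later steps take two
    while lo <= hi:
        for _ in range(count):
            if lo > hi:
                break
            if take_low:
                ranks[lo] = rank
                lo += 1
            else:
                ranks[hi] = rank
                hi -= 1
            rank += 1
        take_low = not take_low
        count = 2
    return ranks


print(siegel_tukey_ranks(13))
# [1, 4, 5, 8, 9, 12, 13, 11, 10, 7, 6, 3, 2]
```

For the 13 sorted readings this reproduces the Rank row of the table above.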

Step 4 Apply the Wilcoxon rank sum test

The Wilcoxon rank-sum test uses the sum of the ranks for each of the two groups. The rank sums for the line A and line B readings are:

$$ \displaystyle\large \begin{array}{l}{{W}_{A}}=5+9+12+13+7+6+2=54\\{{W}_{B}}=1+4+8+11+10+3=37\end{array}$$
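The rank sums can be verified with a few lines of Python. The labels and ranks below are copied from the table in Step 3; the variable names are illustrative. A useful sanity check is that the two sums together must equal N(N+1)/2 for N combined readings.

```python
labels = list("BBABAAABBAABA")
ranks = [1, 4, 5, 8, 9, 12, 13, 11, 10, 7, 6, 3, 2]

w_a = sum(r for r, lab in zip(ranks, labels) if lab == "A")
w_b = sum(r for r, lab in zip(ranks, labels) if lab == "B")

# The ranks 1..13 are each used once, so the sums must total 13 * 14 / 2 = 91.
total = len(ranks) * (len(ranks) + 1) // 2
assert w_a + w_b == total

print(w_a, w_b)  # 54 37
```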

If the null hypothesis is true the two sums should be about the same given the sizes of the two groups of data.

Part of the Wilcoxon rank sum test is subtracting the minimum possible rank sum from each total. For a group with n readings, the minimum value is found using

$$ \displaystyle\large {n\left( n+1 \right)}/{2}\;$$

The U statistics for this set of data are:

$$ \displaystyle\large \begin{array}{l}{{U}_{A}}=54-{7\left( 7+1 \right)}/{2}\;=26\\{{U}_{B}}=37-{6\left( 6+1 \right)}/{2}\;=16\end{array}$$
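The same arithmetic as a small sketch, with a hypothetical helper `u_statistic`. As a check, the two U values must add up to the product of the group sizes (7 × 6 = 42).

```python
def u_statistic(rank_sum, group_size):
    """Subtract the minimum possible rank sum, n(n+1)/2, from a group's rank sum."""
    return rank_sum - group_size * (group_size + 1) // 2


u_a = u_statistic(54, 7)  # 54 - 28
u_b = u_statistic(37, 6)  # 37 - 21

# U_A + U_B always equals the product of the two group sizes.
assert u_a + u_b == 7 * 6

print(u_a, u_b)  # 26 16
```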

The critical value is found in a table indexed by the two group sizes, n1 and n2.

n1 = 6

n2 0.1% 0.5% 1% 2.5% 5% 10%
6 — 23 24 26 28 30
7 21 24 25 27 29 32
8 22 25 27 29 31 34
9 23 26 28 31 33 36
10 24 27 29 32 35 38
11 25 28 30 34 37 40
12 25 30 32 35 38 42
13 26 31 33 37 40 44
14 26 32 34 38 42 46
15 28 33 36 40 44 48
16 29 34 37 42 46 50
17 30 36 39 43 47 52
18 31 37 40 45 49 55
19 32 38 41 46 51 57
20 33 39 43 48 53 59

The critical value with a one-sided 5% significance is 29.

Note: the Siegel and Tukey (1960) paper has other tables for n1 from 1 through 20. With more than 10 samples per group, a normal approximation is available.
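As a cross-check on any table lookup, the exact lower-tail probability of a rank sum can be enumerated directly for samples this small. The sketch below is not part of the original procedure. It assumes no ties, so that under the null hypothesis every set of 6 ranks out of 13 is equally likely for the smaller group; the function name `lower_tail_probability` is hypothetical.

```python
from itertools import combinations
from math import comb


def lower_tail_probability(w_observed, n_small, n_total):
    """P(rank sum of the n_small group <= w_observed) under the null hypothesis."""
    all_ranks = range(1, n_total + 1)
    hits = sum(
        1 for subset in combinations(all_ranks, n_small)
        if sum(subset) <= w_observed
    )
    return hits / comb(n_total, n_small)


# Line B is the smaller group (6 of 13 readings) with rank sum W_B = 37.
p_value = lower_tail_probability(37, 6, 13)
print(round(p_value, 3))
```

There are only 1,716 possible subsets here, so the enumeration is immediate. For larger samples the normal approximation mentioned in the note is the practical route.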

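The critical values in this table are on the rank-sum scale for the smaller group (n1 = 6), so the statistic to compare is the rank sum for line B, 37, rather than the U value of 16. (Equivalently, subtracting the minimum rank sum of 21 puts the critical value at 8 on the U scale.) Since 37 is greater than the critical value, 29, we do not have convincing evidence at the 5% level to conclude that line B has a larger spread than line A.

A visual check with a boxplot suggests line B has the wider spread, yet with only 13 readings the test does not find the difference statistically significant.

[Figure: Siegel-Tukey boxplot of line A and line B temperature readings]
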
Have you checked if your variances are the same or not? What is your go-to test?


About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


Comments

  1. fabio says

    September 18, 2022 at 1:15 AM

    Dear Fred, I tried to compute your example by my own and I got to the opposite conclusion. The U critical value for n1 = 6 and n2 = 7 is 8, thus by the Mann Whitney U test, the difference in spread is not statistically significant. Are you sure that your table of critical values concerns the Mann Whitney U test statistics? Thank you in advance for your kind reply

    • Fred Schenkelberg says

      September 19, 2022 at 11:38 AM

      Hi Fabio,

      I’ll have to do some checking on this and it’s very possible I did make a mistake. Pretty sure the table is appropriate as it’s taken directly from the 1960 paper A Nonparametric Sum of Ranks Procedure for Relative Spread in Unpaired Samples
      Author(s): Sidney Siegel and John W. Tukey
      Source: Journal of the American Statistical Association, Vol. 55, No. 291 (Sep., 1960), pp.429-445 the table is on page 437

      where do our calculations differ?

      cheers,

      Fred

© 2025 FMS Reliability