Accendo Reliability

Your Reliability Engineering Professional Development Site

by Steven Wachs

Why Simple Experimentation Typically Fails

(and Why Design of Experiments is so Superior)

In my 30-year career as an Industrial Statistics consultant, I have frequently been told by clients that they have performed Design of Experiments (DOEs) to try to resolve design or manufacturing issues.  What has become clear is that many engineers and scientists apply a rather liberal definition to DOE, counting nearly any type of experimentation as “DOE”.

The reality is, simplistic or haphazard “experiments” are rarely effective in solving problems, especially complex ones.  Statistically based DOE provides several advantages over more simplistic approaches such as “trial and error” or “one-factor-at-a-time” experimentation.  These advantages include:

  • The use of statistical methodology (hypothesis testing) to determine which factors have a statistically significant effect on the response(s)
  • Balanced experimental designs to allow stronger conclusions with respect to cause-and-effect relationships (as opposed to just finding correlations)
  • The ability to understand and estimate interactions between factors
  • The development of predictive models that are used to find optimal solutions for one or more responses

Each of these advantages is discussed in a bit more detail below.

Testing for Statistical Significance

All experiments involve manipulating a variable (we’ll call it a factor) and observing the change in the outcome (we’ll call this a response).  Anyone involved in manufacturing knows that variation occurs in key process/product characteristics even when we aren’t intentionally changing anything!  Yet, when conducting “experiments”, many engineers and scientists forget this and assume that any observed change in the response must be due to the deliberate change in the factor.  While statisticians are trained to weigh any apparent change in the response (after manipulating a factor) against the experimental error (or noise) in the experiment, most engineers and scientists do not think this way.  Thus, it’s easy to assume cause-and-effect relationships are present when none exist.  With simplistic approaches such as trial and error, it is extremely easy to believe that whatever was tried must have caused any observed change in the outcome.  Instead, statistical methodology must be used to obtain valid results.  DOE uses hypothesis testing to determine whether the effects of factors (and interactions) on the response are statistically significant.  Thus, the models we develop only include factors that are predictive with high confidence.
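To make this concrete, here is a minimal sketch (with made-up measurements) of the kind of significance test DOE software applies to a factor effect: Welch’s t-statistic comparing the response at a factor’s low and high settings.  It is the size of the difference relative to the noise, not the raw difference in means, that justifies a claim of cause and effect.

```python
import math
import statistics

def two_sample_t(a, b):
    """Welch's t-statistic: mean difference scaled by the pooled standard error."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / std_err

# Hypothetical response measurements at a factor's low and high settings
low = [10.2, 9.8, 10.1, 10.0, 9.9]
high = [10.4, 10.1, 10.3, 10.2, 10.5]

t = two_sample_t(high, low)
print(t)  # t is about 3: a shift large relative to the experimental noise
```

A statistics package would convert t into a p-value; the point is that the noise term in the denominator is what simplistic experimentation omits.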

Uncovering Cause and Effect Relationships – Not just Correlations

Another advantage of formal DOE is that the designs naturally are balanced to avoid confusing and invalid conclusions.  Very often people will look at data and conclude cause and effect relationships are present simply based on correlations in the data.  Correlation just means that a relationship exists, not necessarily that one event causes the other.

Consider an analysis that was done to relate the expenditures on medical care to the expenditures on milk based on surveys of families.  The scatter plot below summarizes the relationship that was observed.

An x-y scatter plot of milk purchases versus medical bills, showing a positive linear relationship.

It would seem from this graph that the more money is spent on milk, the higher the medical costs!  Does milk actually cause health issues?  Do we all need to switch to almond milk?

The problem with simply taking existing data and trying to find relationships in it is that we do not know whether all other factors that might explain the relationship were controlled for.  Upon further analysis of the survey data, it became clear that family size explained much of this relationship.  That is, people who live alone tend to spend relatively little on both medical care and milk, very large families tend to spend a lot on both, and moderately large families are in the middle on both types of expenses.  So, if we don’t control for family size, we may wrongly conclude that milk consumption leads to higher medical costs.
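The family-size confounder can be demonstrated numerically.  The sketch below uses invented survey figures in which milk and medical spending both rise with family size, so the overall correlation is strongly positive, yet within each family size the relationship vanishes (here it even reverses).

```python
import math

# Invented (milk, medical) expenditures, grouped by family size.
survey = {
    1: [(10, 52), (12, 50)],
    2: [(20, 102), (22, 100)],
    3: [(30, 152), (32, 150)],
}

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

all_pairs = [p for group in survey.values() for p in group]
print(pearson(all_pairs))  # strongly positive when family size is ignored
print(pearson(survey[2]))  # negative within a single family size
```

Controlling for (stratifying on) the lurking variable destroys the apparent cause-and-effect story, which is exactly what a balanced DOE is designed to prevent from the outset.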

DOE methodology takes great care to avoid these issues.  For factors that are included in the experiment, we ensure that we mix up the levels of each factor so that we don’t only collect data when two different factors are at the same level.  That would result in confounding (confusion) as to which factor caused the effect.  Said another way, we balance the design so that we can isolate the true impact of each factor and interaction (more on interactions later).  For factors that we feel may affect the response but are not included in the study, we try to hold them as fixed as possible during the study, so they do not influence the results.
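As a sketch of what “balanced” means, a full-factorial design enumerates every combination of coded factor levels, so any pair of factors hits each of its four level combinations equally often and their effects can be separated.  The factor names here are illustrative:

```python
from itertools import product
from collections import Counter

# Full-factorial 2^3 design over three illustrative factors at coded levels -1/+1.
design = list(product([-1, +1], repeat=3))

# Balance check for the first two factors: each of the four (level, level)
# combinations appears the same number of times, so neither factor's
# effect is confounded with the other's.
pair_counts = Counter((speed, pressure) for speed, pressure, temp in design)
print(len(design), pair_counts)  # 8 runs; every pair combination appears twice
```

Fractional-factorial designs trade away some of this balance deliberately and document which effects become confounded; haphazard experimentation confounds effects without anyone knowing.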

Understanding and Modeling Interactions

To solve complex problems, an understanding of factor interactions is critical.  If interactions did not exist, most problems would be relatively easy to solve.  So, what are interactions?  We say that factors interact with each other when the effect that one factor has on the response depends on where another factor is set.  Some examples include:

  • The effect of curing time on bond strength depends on the temperature level
  • The effect of a driver with the best technology on overall golf score depends on the length of the course
  • The effect of medication dosage on blood pressure depends on the size and weight of the person taking it

Suppose we perform a simple study of the impact that machine speed has on a critical characteristic (a radius).  The engineer performs the study and summarizes the results in the graph below:

An x-y plot of machine speed versus radius, showing an apparent relationship: the higher the machine speed, the smaller the radius.

Thus, it appeared that increasing machine speed causes a decrease in the radius.  However, a different process engineer repeated the study (using the same range of machine speeds), but this time the results looked as follows:

An x-y plot of machine speed versus radius, this time showing that higher machine speed results in a larger radius.

So, we observed a completely different result!  Now the radius increases with machine speed.  What is going on?  After some investigation of the two different trials, it was discovered that they were conducted under different conditions.  Specifically, the first engineer ran the machine at a pressure of 120 psi and the second engineer ran the machine at a pressure of 80 psi.  Putting the results together we have:

An x-y plot combining both sets of measurements from the two previous plots, with the increasing-radius data labeled 80 psi and the decreasing-radius data labeled 120 psi.

Now we can see that the effect of Machine Speed on the Radius depends on the pressure!  Machine Speed and Pressure are involved in a 2-factor interaction.  This plot is called an “Interaction Plot” and is used to describe significant interaction effects.  

Interactions are present everywhere, and both a qualitative and quantitative understanding of their impacts on important outcomes (responses) must be developed.  Our predictive models must include terms that account for interactions between factors to accurately model responses as a function of significant predictors.  Many times, a factor will have an overall (main) effect on the response but also be involved in an interaction with another factor.  When applied correctly, DOE is extremely effective at uncovering both the main effects and interaction effects that impact the responses.  Note that with simple approaches (like one-factor-at-a-time experimentation), it is impossible to understand and model factor interactions.
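The machine-speed story can be reduced to numbers.  Using a coded 2×2 design with illustrative radius values that mimic the plots above (radius rises with speed at 80 psi and falls at 120 psi), the main effect of speed averages out to zero while the speed × pressure interaction is large, which is exactly why a one-factor-at-a-time study at a single pressure is misleading:

```python
# Coded 2x2 design: (speed, pressure) -> radius (illustrative values).
# pressure -1 stands in for 80 psi, +1 for 120 psi.
runs = {
    (-1, -1): 4.0,  # low speed,  80 psi
    (+1, -1): 6.0,  # high speed, 80 psi  (radius grows with speed)
    (-1, +1): 6.0,  # low speed,  120 psi
    (+1, +1): 4.0,  # high speed, 120 psi (radius shrinks with speed)
}

def main_effect(i):
    """Average response at level +1 minus average response at level -1 for factor i."""
    return sum(y * levels[i] for levels, y in runs.items()) / 2

def interaction_effect():
    """Effect estimated from the product (cross) of the two coded columns."""
    return sum(y * levels[0] * levels[1] for levels, y in runs.items()) / 2

print(main_effect(0), interaction_effect())  # speed averages to 0.0; interaction is -2.0
```

The balanced design makes both estimates simple contrasts of the same four runs; no extra experimentation is needed to recover the interaction.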

Developing Predictive Models

An important advantage of structured DOE is that we can develop predictive (mathematical) models that relate the factors and interactions to one or more responses.  This is very powerful, especially when we are trying to jointly optimize multiple (often conflicting) responses.  Note that not all types of “DOEs” allow the development of predictive models.  For example, Taguchi experiments may be useful for finding setups that minimize the impact of uncontrollable noise factors, but their outputs are not conducive to developing a predictive model that can be optimized.

An example of a predictive model that predicts distortion of a glass window during a molding process is:

 Estimated Distortion = 0.3725 + 0.09 GlassTemp – 0.04771 PackTime + 0.13271 MoldTemp + 0.09896 (GlassTemp)(MoldTemp) – 0.07083 (PackTime)(MoldTemp) – 0.04021 (GlueThick)(PackTime) 

The model is an equation that predicts a response as a function of one or more factor effects and interaction effects.  In the model above, GlassTemp has a significant main effect but also interacts with MoldTemp.  The coefficients in front of the main factors behave like slopes in a linear model: they indicate the degree of the effect that the factor has on the response.  The coefficients in front of the interaction terms (cross products) are a bit more difficult to interpret, so it’s easier to use an interaction plot.  Below is an interaction plot for the GlassTemp*MoldTemp interaction.

An interaction plot showing that the effect of mold temperature on distortion depends on glass temperature.

Here we can see how the effect of Mold Temperature on Distortion is highly dependent on the Glass Temperature.  For colder glass (25 degrees), the effect of Mold Temperature increasing is slight, but for hotter glass (105 degrees), increasing the Mold temperature has a relatively big impact on Distortion. 
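The distortion model can be checked directly in code.  Assuming the coefficients apply to coded (-1 to +1) factor levels, with 25-degree glass coded as -1 and 105-degree glass as +1 (a common DOE convention, not stated explicitly in the article), the partial effect of MoldTemp is its main-effect coefficient plus the interaction contributions:

```python
# The article's fitted distortion model, assuming coded (-1..+1) factor levels.
def distortion(glass_temp, pack_time, mold_temp, glue_thick):
    return (0.3725
            + 0.09 * glass_temp
            - 0.04771 * pack_time
            + 0.13271 * mold_temp
            + 0.09896 * glass_temp * mold_temp
            - 0.07083 * pack_time * mold_temp
            - 0.04021 * glue_thick * pack_time)

def mold_temp_slope(glass_temp, pack_time=0.0):
    """d(distortion)/d(MoldTemp): the main coefficient plus interaction terms."""
    return 0.13271 + 0.09896 * glass_temp - 0.07083 * pack_time

print(mold_temp_slope(-1.0))  # cold glass (coded -1): slight MoldTemp effect
print(mold_temp_slope(+1.0))  # hot glass (coded +1): much larger MoldTemp effect
```

The two slopes (roughly 0.034 versus 0.232) are the two lines of the interaction plot: the interaction coefficient literally changes the slope of one factor depending on where the other is set.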

Using models to find good solutions to our problem and optimize the response(s) will be discussed in an upcoming article.  My message here is that conducting an experiment without ending up with a predictive model is typically not a good use of resources.

Summary

In this article, I hope it’s clear just how different structured, statistically based experiments are from more simplistic approaches.  Unfortunately, some engineers and scientists assume that “proper” DOE is too complicated or expensive.  Yet, with a bit of training and readily available software, effective DOE is very efficient and accessible.


About Steven Wachs

Steven Wachs has 25 years of wide-ranging industry experience in both technical and management positions. Steve has worked as a statistician at Ford Motor Company where he has extensive experience in the development of statistical models, reliability analysis, designed experimentation, and statistical process control.
