What type of ANOVA is used when there are two independent variables each with more than two levels and with different participants taking part in each condition?

Published on March 20, 2020 by Rebecca Bevans. Revised on October 3, 2022.

ANOVA (Analysis of Variance) is a statistical test used to analyze the difference between the means of more than two groups.

A two-way ANOVA is used to estimate how the mean of a quantitative variable changes according to the levels of two categorical variables. Use a two-way ANOVA when you want to know how two independent variables, in combination, affect a dependent variable.

Example: You are researching which type of fertilizer and planting density produces the greatest crop yield in a field experiment. You assign different plots in a field to a combination of fertilizer type (1, 2, or 3) and planting density (1 = low density, 2 = high density), and measure the final crop yield in bushels per acre at harvest time.

You can use a two-way ANOVA to find out if fertilizer type and planting density have an effect on average crop yield.

When to use a two-way ANOVA

You can use a two-way ANOVA when you have collected data on a quantitative dependent variable at multiple levels of two categorical independent variables.

A quantitative variable represents amounts or counts of things. It can be divided to find a group mean.

Bushels per acre is a quantitative variable because it represents the amount of crop produced. It can be divided to find the average bushels per acre.

A categorical variable represents types or categories of things. A level is an individual category within the categorical variable.

Fertilizer types 1, 2, and 3 are levels within the categorical variable fertilizer type. Planting densities 1 and 2 are levels within the categorical variable planting density.

You should have enough observations in your data set to be able to find the mean of the quantitative dependent variable at each combination of levels of the independent variables.
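One quick way to check this is to count the observations and compute the mean in every cell of the design. The sketch below uses simulated data, not the real crop dataset; the data frame `crop` and its column names are assumptions made for illustration.

```r
# Hypothetical balanced design: 3 fertilizer types x 2 densities x 4 blocks.
crop <- expand.grid(
  fertilizer = factor(1:3),
  density    = factor(1:2),
  block      = factor(1:4)
)
set.seed(1)
crop$yield <- rnorm(nrow(crop), mean = 177, sd = 1)  # simulated values, not real data

# Number of observations in each fertilizer-by-density cell.
cell_counts <- table(crop$fertilizer, crop$density)
print(cell_counts)

# Mean yield in each cell; a cell with no observations would show up as NA.
cell_means <- tapply(crop$yield, list(crop$fertilizer, crop$density), mean)
print(cell_means)
```

If any cell count is zero, the mean for that combination cannot be estimated and the interaction cannot be fully tested.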

Both of your independent variables should be categorical. If one of your independent variables is categorical and one is quantitative, use an ANCOVA instead.

How does the ANOVA test work?

ANOVA uses the F-test to assess statistical significance. The F-test is a groupwise comparison test, which means it compares the variance among the group means to the overall variance in the dependent variable.

If the variance within groups is small relative to the variance between groups, the F-test will produce a larger F-value, and therefore stronger evidence that the observed difference is real and not due to chance.
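To make the ratio concrete, here is a minimal sketch that computes an F-value by hand for a single grouping variable and compares it with the value R reports. The data are simulated and the variable names are hypothetical; a single factor is used only to keep the arithmetic short.

```r
# Simulated data: three groups of eight observations each (not real data).
set.seed(42)
group <- factor(rep(c("a", "b", "c"), each = 8))
y <- rnorm(24, mean = c(10, 11, 13)[as.integer(group)], sd = 1)

grand_mean  <- mean(y)
group_means <- tapply(y, group, mean)
n_per_group <- tapply(y, group, length)

# Between-group variation: how far each group mean sits from the grand mean.
ss_between <- sum(n_per_group * (group_means - grand_mean)^2)
# Within-group variation: how far each observation sits from its own group mean.
ss_within <- sum((y - group_means[as.character(group)])^2)

df_between <- nlevels(group) - 1
df_within  <- length(y) - nlevels(group)

# F is the ratio of the two mean squares.
f_manual <- (ss_between / df_between) / (ss_within / df_within)

# The same statistic as reported by R's ANOVA table.
f_builtin <- anova(lm(y ~ group))[["F value"]][1]

print(c(manual = f_manual, builtin = f_builtin))
```

A two-way ANOVA applies the same idea separately to each main effect and to the interaction, each with its own between-group mean square in the numerator.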

A two-way ANOVA with interaction tests three null hypotheses at the same time:

  • There is no difference in group means at any level of the first independent variable.
  • There is no difference in group means at any level of the second independent variable.
  • The effect of one independent variable does not depend on the level of the other independent variable (a.k.a. no interaction effect).

A two-way ANOVA without interaction (a.k.a. an additive two-way ANOVA) only tests the first two of these hypotheses.

Two-way ANOVA hypotheses

In our crop yield experiment, we can test three hypotheses using two-way ANOVA:

  • Fertilizer type. Null hypothesis (H0): There is no difference in average yield for any fertilizer type. Alternate hypothesis (Ha): There is a difference in average yield by fertilizer type.
  • Planting density. Null hypothesis (H0): There is no difference in average yield at either planting density. Alternate hypothesis (Ha): There is a difference in average yield by planting density.
  • Interaction. Null hypothesis (H0): The effect of one independent variable on average yield does not depend on the level of the other independent variable (a.k.a. no interaction effect). Alternate hypothesis (Ha): There is an interaction effect between planting density and fertilizer type on average yield.
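In R, each of these three hypotheses corresponds to one row of the ANOVA table. The following sketch fits a model with interaction to simulated data and lists the tested terms; the data frame, column names, and effect sizes are all made up for illustration.

```r
# Simulated stand-in for the crop data; all values and column names are hypothetical.
set.seed(10)
crop <- expand.grid(
  fertilizer = factor(1:3),
  density    = factor(1:2),
  plot       = 1:8
)
crop$yield <- 176 + 0.5 * as.integer(crop$fertilizer) +
  0.4 * as.integer(crop$density) + rnorm(nrow(crop), sd = 0.6)

# 'fertilizer * density' expands to both main effects plus their interaction.
fit <- aov(yield ~ fertilizer * density, data = crop)
anova_table <- summary(fit)[[1]]
print(anova_table)

# One row per hypothesis, followed by the residual row.
tested_terms <- trimws(rownames(anova_table))
print(tested_terms)
```

Replacing `*` with `+` in the formula drops the `fertilizer:density` row, which gives the additive model that tests only the first two hypotheses.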

Assumptions of the two-way ANOVA

To use a two-way ANOVA, your data should meet certain assumptions. Two-way ANOVA makes all of the normal assumptions of a parametric test of difference:

  1. Homogeneity of variance (a.k.a. homoscedasticity)

The variation around the mean for each group being compared should be similar among all groups. If your data don’t meet this assumption, you may be able to use a non-parametric alternative, like the Kruskal-Wallis test.

  2. Independence of observations

Your independent variables should not be dependent on one another (i.e. one should not cause the other). This is impossible to test with categorical variables – it can only be ensured by good experimental design.

In addition, your dependent variable should represent unique observations – that is, your observations should not be grouped within locations or individuals.

If your data don’t meet this assumption (i.e. if you set up experimental treatments within blocks), you can include a blocking variable and/or use a repeated-measures ANOVA.

  3. Normally distributed dependent variable

The values of the dependent variable should follow a bell curve. If your data don’t meet this assumption, you can try a data transformation.

In the crop-yield example, the response variable is normally distributed, and we can check for homoscedasticity after running the model. The experimental treatments were set up within blocks in the field, with four blocks each containing every possible combination of fertilizer type and planting density, so we should include this as a blocking variable in the model.
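As a rough sketch of how such checks might look in R, the code below fits a model with a blocking variable to simulated data and then runs two base R tests: `shapiro.test()` on the residuals and `bartlett.test()` across the treatment cells. The choice of these two tests is an assumption of this sketch, not something the example prescribes, and all data and column names are hypothetical.

```r
# Simulated stand-in: 3 fertilizers x 2 densities x 4 blocks x 4 plots (hypothetical).
set.seed(7)
crop <- expand.grid(
  fertilizer = factor(1:3),
  density    = factor(1:2),
  block      = factor(1:4),
  rep        = 1:4
)
block_shift <- c(0, 0.2, -0.1, 0.3)[as.integer(crop$block)]  # made-up block effects
crop$yield <- 176 + 0.5 * as.integer(crop$fertilizer) +
  0.4 * as.integer(crop$density) + block_shift + rnorm(nrow(crop), sd = 0.6)

# Two-way ANOVA with interaction and the block as an extra additive term.
fit <- aov(yield ~ fertilizer * density + block, data = crop)

# Normality: test the model residuals rather than the raw yields.
normality <- shapiro.test(residuals(fit))
print(normality)

# Homogeneity of variance across the six fertilizer-by-density cells.
crop$cell <- interaction(crop$fertilizer, crop$density)
homogeneity <- bartlett.test(yield ~ cell, data = crop)
print(homogeneity)

# Diagnostic plots (residuals vs. fitted, Q-Q) are a common visual complement:
# plot(fit)
```

Small p-values from either test would suggest the corresponding assumption is doubtful; with real data it is worth looking at the diagnostic plots as well rather than relying on the tests alone.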

How to perform a two-way ANOVA

The dataset from our imaginary crop yield experiment includes observations of:

  • Final crop yield (bushels per acre)
  • Type of fertilizer used (fertilizer type 1, 2, or 3)
  • Planting density (1 = low density, 2 = high density)
  • Block in the field (1, 2, 3, 4).

The two-way ANOVA will test whether the independent variables (fertilizer type and planting density) have an effect on the dependent variable (average crop yield). But there are some other possible sources of variation in the data that we want to take into account.

We applied our experimental treatment in blocks, so we want to know if planting block makes a difference to average crop yield. We also want to check if there is an interaction effect between the two independent variables – for example, it’s possible that planting density affects the plants’ ability to take up fertilizer.

Because we have a few different possible relationships between our variables, we will compare three models:

  1. A two-way ANOVA without any interaction or blocking variable (a.k.a. an additive two-way ANOVA).
  2. A two-way ANOVA with interaction but with no blocking variable.
  3. A two-way ANOVA with interaction and with the blocking variable.

Model 1 assumes there is no interaction between the two independent variables. Model 2 assumes that there is an interaction between the two independent variables. Model 3 assumes there is an interaction between the variables, and that the blocking variable is an important source of variation in the data.

By running all three versions of the two-way ANOVA with our data and then comparing the models, we can efficiently test which variables, and in which combinations, are important for describing the data, and see whether the planting block matters for average crop yield.

This is not the only way to do your analysis, but it is a good method for efficiently comparing models based on what you think are reasonable combinations of variables.
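The comparison described above could be sketched as follows. The data are simulated and the object and column names are hypothetical. To stay within base R, this sketch ranks the models with `AIC()`; the `aictab()` function from the AICcmodavg package produces a fuller comparison table but requires installing that package.

```r
# Simulated stand-in for the crop data (hypothetical names and effect sizes).
set.seed(123)
crop <- expand.grid(
  fertilizer = factor(1:3),
  density    = factor(1:2),
  block      = factor(1:4),
  rep        = 1:4
)
crop$yield <- 176 + 0.5 * as.integer(crop$fertilizer) +
  0.4 * as.integer(crop$density) + rnorm(nrow(crop), sd = 0.6)

# Model 1: additive, no interaction and no blocking variable.
additive <- aov(yield ~ fertilizer + density, data = crop)
# Model 2: adds the fertilizer-by-density interaction.
with_interaction <- aov(yield ~ fertilizer * density, data = crop)
# Model 3: interaction plus the blocking variable.
with_block <- aov(yield ~ fertilizer * density + block, data = crop)

# Lower AIC indicates a better trade-off between fit and number of parameters.
aic_values <- AIC(additive, with_interaction, with_block)
print(aic_values)

best_model <- rownames(aic_values)[which.min(aic_values$AIC)]
print(best_model)
```

Because the simulated yields here contain no interaction or block effect, the ranking says nothing about the real experiment; the point is only the shape of the three formulas and the comparison step.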

Running a two-way ANOVA in R

We will run our analysis in R. To try it yourself, download the sample dataset.

After loading the data into the R environment, we will create each of the three models using the aov() function, and then compare them using the aictab() function. For a full walkthrough, see our guide to ANOVA in R.

This first model does not predict any interaction between the independent variables, so we put them together with a ‘+’.

Two-way ANOVA R code

two.way