Chapter 16. Analysing experiments with multiple factors

This chapter focuses on designing and analysing multi-factor experiments.

In an earlier chapter, we discussed the design and analysis of 1-Factor experiments. We described a ‘factor’ as a series of groups (‘treatments’) that differ in a systematic way. We considered an example 1-Factor experiment with three levels: ‘drug applied’, ‘placebo applied’ and ‘nothing applied’. However, it is very often useful to design experiments with more than one factor, which we address in this Chapter.

For example, imagine we wanted to test the effects of a drug (one Factor) in both females and males. One approach would be to conduct a 1-Factor experiment with 6 levels that comprised all combinations of Drug and Sex; i.e., ‘drug applied to females’, ‘drug applied to males’, ‘placebo applied to females’, ‘placebo applied to males’, ‘nothing applied to females’, and ‘nothing applied to males’. Alternatively, one could conduct one experiment that included two factors: Drug (with three levels) and Sex (with two levels). This chapter focuses on the latter approach, designing and analysing multi-factor experiments.

We focus on experiments with two factors, but it is possible and sometimes desirable to conduct experiments with more factors. For example, I conducted a three-factor experiment during my PhD. The principles presented in this Chapter for two-factor experiments also apply to experiments with more factors.

A word of warning, however: adding more factors to an experiment can make results more difficult to interpret – imagine your frustration if you design and complete an experiment with many factors, but cannot make sense of the results!

Multi-factor experiments allow simultaneous tests of multiple hypotheses. For example, if designed appropriately, an experiment with the two factors, Drug and Sex, would allow simultaneous tests of three hypotheses:

i) Does the first factor (say, arbitrarily, Drug) affect the dependent- (i.e., y-) variable after accounting for effects of the second factor (Sex)?
ii) Does the second factor (Sex) affect the dependent- (i.e., y-) variable after accounting for effects of the first factor (Drug)?
iii) Does the effect of one factor (say, Drug) depend on the level of the other factor (say, Sex)?

These biological questions differ qualitatively from questions that a 1-Factor experiment would typically address. Specifically, a 1-Factor experiment allows comparison among levels of a single factor, without accounting for additional information. By contrast, hypotheses that address the ‘main effects’ of a multi-Factor analysis (hypotheses i & ii) allow a researcher to compare levels of one main effect (say, Drug) while averaging over the effects of the other main effect(s) (say, Sex). This allows a researcher to make comparisons among levels of one factor while controlling for effects of the other factor(s) in the model. Further, the third hypothesis (iii) introduces another very different and useful perspective, which we discuss more in a moment.
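To make ‘averaging over the other factor’ concrete, here is a minimal sketch in Python. The cell means are hypothetical numbers invented for illustration (they do not come from any dataset in this Chapter), and the helper name `marginal_mean` is our own. The sketch computes the mean for each level of Drug averaged over Sex, and the mean for each level of Sex averaged over Drug.

```python
# Hypothetical cell means for a 3 x 2 (Drug x Sex) design.
# These numbers are invented purely for illustration.
cell_means = {
    ("drug", "female"): 12.0,
    ("drug", "male"): 10.0,
    ("placebo", "female"): 8.0,
    ("placebo", "male"): 6.0,
    ("nothing", "female"): 7.0,
    ("nothing", "male"): 5.0,
}

drugs = ["drug", "placebo", "nothing"]
sexes = ["female", "male"]


def marginal_mean(level, position):
    """Average the cell means that share `level`.

    position 0 selects a level of Drug, position 1 a level of Sex.
    """
    values = [m for key, m in cell_means.items() if key[position] == level]
    return sum(values) / len(values)


# Mean for each Drug level, averaged over Sex (relevant to hypothesis i).
drug_means = {d: marginal_mean(d, 0) for d in drugs}
# Mean for each Sex, averaged over Drug (relevant to hypothesis ii).
sex_means = {s: marginal_mean(s, 1) for s in sexes}

print(drug_means)
print(sex_means)
```

Comparing the entries of `drug_means` corresponds to asking about the main effect of Drug; comparing the entries of `sex_means` corresponds to asking about the main effect of Sex.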

The preceding paragraph highlights that different experimental designs allow researchers to address qualitatively different biological questions. Hence, understanding experimental design and data analysis, generally, increases the diversity of biological questions a researcher may ask. (On a personal note, this insight was what turned me on to understanding experimental design and data analysis. As I learned more about experimental design I became aware of more types of biological questions I could address, which was exciting and empowering.)

The third hypothesis (iii), above, is called an ‘interaction’. Generally speaking, ‘interactions’ are not more biologically important than the first two hypotheses (i & ii) regarding ‘main effects’. However, we will spend more time thinking about interactions than main effects in this Chapter because interpreting interactions can be subtler. In the context of the 2-Factor, Drug and Sex experiment, above, evidence for an interaction between Drug and Sex would imply that the effect of Drug treatments differs between females and males.

Similarly, the reverse would also be true: evidence for an interaction would imply that the differences between Sexes would vary among the levels of the factor, Drug. Clearly, evidence for an interaction between Sex and Drug in this experiment would be biologically interesting: it would shed light on differences between females and males relevant to drug development. Indeed, evidence indicates that drugs do affect females and males differently, with important consequences for medicine and society.
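One way to see why both readings of an interaction are equivalent is to write the interaction as a ‘difference of differences’. The sketch below uses hypothetical cell means for a 2 x 2 subset of the design (drug vs. placebo, females vs. males); all numbers and function names are our own illustrative assumptions.

```python
def drug_effect(cell_means, sex):
    """Drug minus placebo, within one Sex."""
    return cell_means[("drug", sex)] - cell_means[("placebo", sex)]


def sex_difference(cell_means, drug):
    """Female minus male, within one level of Drug."""
    return cell_means[(drug, "female")] - cell_means[(drug, "male")]


def interaction_by_drug_effect(cell_means):
    """How much the Drug effect differs between females and males."""
    return drug_effect(cell_means, "female") - drug_effect(cell_means, "male")


def interaction_by_sex_difference(cell_means):
    """How much the Sex difference changes between drug and placebo."""
    return sex_difference(cell_means, "drug") - sex_difference(cell_means, "placebo")


# Hypothetical means with NO interaction: the Drug effect is 4 in both Sexes.
additive = {
    ("drug", "female"): 12.0, ("placebo", "female"): 8.0,
    ("drug", "male"): 10.0, ("placebo", "male"): 6.0,
}

# Hypothetical means WITH an interaction: the Drug effect is 4 in females
# but only 1 in males.
interacting = {
    ("drug", "female"): 12.0, ("placebo", "female"): 8.0,
    ("drug", "male"): 7.0, ("placebo", "male"): 6.0,
}

for name, means in [("additive", additive), ("interacting", interacting)]:
    print(name,
          interaction_by_drug_effect(means),
          interaction_by_sex_difference(means))
```

For any set of four cell means, the two functions return the same number, which is why ‘the effect of Drug depends on Sex’ and ‘the difference between Sexes depends on Drug’ describe the same interaction.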

As an example, this Nature article discusses several areas of basic neuroscience in which careful consideration of sex as a biological variable (SABV) has led to critical discoveries of the ways in which fundamental neurobiological processes differ in males and females:

experimental data chapter 15-1 (1.08 MB / PDF)

Studies of ecology and evolution often think in terms of ‘interactions’. For example, these fields recognize that the effect of a gene (or allele) will often depend on the environment. This phenomenon can lead to ‘local adaptation’, where members of a population perform better in their ‘home’ environment than in an ‘away’ environment because natural selection favours different alleles in different environments. Taken to an extreme, this process of local adaptation can help generate new species.

Alternatively, we may view ‘females’ and ‘males’ as different ‘environments’. For example, an allele might increase fitness in one environment (e.g., when the allele occurs in females) but decrease fitness when the allele occurs in the other environment (males). This interaction between Genotype and Sex is thought to underlie the evolution of sexual dimorphism. Consider also an ecological example: the amount of pollen removed from or delivered to a flower will likely depend on the ‘fit’ between a pollinator’s morphology and a flower’s shape. Hence, we expect that the quantity of pollen delivered to and removed from a flower will involve an interaction between floral morphology and pollinator morphology.

Please note: Here, and throughout this website, we define ‘females’ and ‘males’ by an individual’s complement of sex chromosomes. For example, in humans, we define males and females as individuals that do or do not possess a Y chromosome, respectively (the Y chromosome contains the SRY gene, which triggers different developmental events). By contrast, female and male butterflies typically have ZW and ZZ sex chromosomes, respectively.

We define females and males in this way because, usually, the biological hypotheses that address differences between females and males (defined here as ‘Sex’) aim to understand the consequences of genetic differences at their ‘sex determining regions’ (often sex chromosomes) for a biological phenomenon of interest. In other words, this definition of Sex matches the biological hypotheses under investigation. We note that this definition of females and males (Sex) need not correspond to concepts of ‘Gender’. If we use the term ‘Gender’ in this website, we have done so in error and will correct the mistake.

These examples help explain why statistical analyses of ecological and evolutionary data commonly consider ‘interactions’ between variables. On the other hand, my experience suggests that the field of Biomedical Sciences less frequently addresses hypotheses involving interactions. If true, it would be interesting to understand why this is, because, arguably, understanding interactions would be equally important in Biomedical Sciences (e.g., a mutation might have different effects among mouse strains or between Sexes).

I hope that this chapter, in some small way, stimulates more ‘interaction-based’ hypotheses being asked in Biomedical Sciences, where appropriate.

Multi-factor models can also allow an analysis to account for features of experimental design. For example, an experiment that includes ‘blocking’ (see the Chapter, ‘Experimental Design’) might include ‘Block’ as a factor in the analysis to account for unwanted (nuisance) variation. Alternatively, an experiment might intentionally ‘heterogenize’ environments to determine whether a study’s results generalize across researchers and environments. Again, ‘heterogenization’ would involve an analysis of ‘Blocks’.

Here are a couple of examples of these experiments:

View article in Ecology publication by Newman et al 1997 (81.46 KB / PDF)

View article in PLOS Biology by von Kortzfleisch et al 2022 (1.33 MB / PDF)

The videos in this Chapter begin by comparing 1- vs. 2-Factor GLMs. We then discuss assumptions of 2- (i.e., multi-) factor GLMs, and highlight the issue of non-independence in the experimental design. Note that the concepts of experimental design covered in previous chapters (including principles covered in the Chapter addressing 1-factor GLMs) apply to multi-factor GLMs. With respect to experimental design, we add that a 2-factor GLM can only include an ‘interaction term’ if the experiment has appropriate replication, as discussed in the video, ‘Introductory Example Analysis’.
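One common way to see why an interaction term requires replication is to count degrees of freedom. The sketch below is our own illustration, not a summary of the video; it assumes a balanced design with `n` independent observations for every combination of the two factors, and the function name `residual_df` is hypothetical.

```python
def residual_df(a, b, n, interaction=True):
    """Residual degrees of freedom for a balanced 2-Factor GLM.

    a, b : number of levels of the first and second factor
    n    : independent observations per combination of levels (assumed equal)
    """
    total_observations = a * b * n
    # Intercept plus the two main effects.
    parameters = 1 + (a - 1) + (b - 1)
    if interaction:
        # The interaction adds one parameter per remaining cell.
        parameters += (a - 1) * (b - 1)
    return total_observations - parameters


# Drug (3 levels) x Sex (2 levels), one observation per combination:
print(residual_df(3, 2, 1, interaction=True))   # 0: nothing left to estimate error
print(residual_df(3, 2, 1, interaction=False))  # 2: main effects only can be fit

# The same design with 5 observations per combination:
print(residual_df(3, 2, 5, interaction=True))   # 24
```

With a single observation per combination, a model that includes the interaction uses every observation to estimate a parameter, which leaves no residual degrees of freedom for estimating error.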

Following this introductory example, we explain how to interpret coefficients for a 2-factor GLM, get practice interpreting interactions, and consider how unbalanced data can affect how we calculate p-values. We then consider two example datasets that illustrate analysis and interpretation when we do (Pyruvate kinase example) and do not (Sexual conflict example) identify good evidence for an interaction between main effects.

Multi-Factor GLM: Introduction

A general introduction to General Linear Models: a comparison between 1- and 2-Factor GLMs

Experimental data - two or more factors introduction (370.94 KB / PPTX)

Transcript - two or more factors introduction (12.25 KB / TXT)

Multi-Factor GLM: Assumptions

A brief description of assumptions for 2-Factor GLMs; an emphasis on experimental design and independence

Experimental data - assumptions (73.6 KB / PPTX)

Transcript - assumptions (10.14 KB / TXT)

Multi-Factor GLM: Introductory Example Analysis

An introductory analysis of a simulated dataset. Discussion of i) how data are organized; ii) plotting data with 2 Factors; iii) checking assumptions and model fit; iv) calculating p-values; v) interaction plots; vi) calculating effect sizes with an interaction; vii) when interactions can (based on experimental design) and should be included in a model.

Please note that this video needs to be updated to make use of the third residual plot (where the square root of standardized residuals appears on the y-axis) and to interpret p-values in terms of strength of evidence for an effect (rather than by comparing a p-value vs. 0.05).

Experimental data - multi factor intro example (59.68 KB / PPTX)

Transcript - multi factor intro example (36.4 KB / TXT)

Multi-Factor GLM: Interpret coefficients

A guide to interpreting the coefficients from a 2-Factor GLM

Transcript - interpret coefficients (10.23 KB / TXT)
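As a rough companion to this video, the sketch below shows how coefficients relate to cell means in a 2 x 2 design. It assumes ‘treatment’ (dummy) coding with ‘placebo’ and ‘female’ as the reference levels; this coding, the variable names, and the cell means are our own assumptions for illustration, and the video may present the coefficients differently.

```python
# Hypothetical cell means (invented for illustration).
cell_means = {
    ("placebo", "female"): 8.0,   # reference cell under the assumed coding
    ("drug", "female"): 12.0,
    ("placebo", "male"): 6.0,
    ("drug", "male"): 7.0,
}

# Intercept: the mean of the reference cell.
intercept = cell_means[("placebo", "female")]
# Drug coefficient: effect of drug WITHIN the reference Sex (females).
b_drug = cell_means[("drug", "female")] - intercept
# Sex coefficient: male minus female WITHIN the reference Drug level (placebo).
b_male = cell_means[("placebo", "male")] - intercept
# Interaction coefficient: how far the drug-male cell departs from the
# value expected if the two effects simply added together.
b_drug_male = cell_means[("drug", "male")] - (intercept + b_drug + b_male)


def predict(drug, sex):
    """Rebuild a cell mean from the four coefficients."""
    is_drug = 1 if drug == "drug" else 0
    is_male = 1 if sex == "male" else 0
    return (intercept
            + b_drug * is_drug
            + b_male * is_male
            + b_drug_male * is_drug * is_male)


print(intercept, b_drug, b_male, b_drug_male)
print(predict("drug", "male"))
```

Note that, under this coding and in the presence of an interaction term, `b_drug` is the effect of the drug in females only, not the effect averaged over both Sexes.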

Multi-Factor GLM: Visualizing Interactions

Some practice thinking about Interactions.

Experimental data - visual interactions (352.44 KB / PPTX)

Transcript - visual interactions (19.22 KB / TXT)

Multi-Factor GLM: Unbalanced data

Datasets with unbalanced data require extra care; this video outlines how to calculate p-values for unbalanced data.

Transcript - unbalanced data (13.24 KB / TXT)
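The sketch below does not reproduce the method from the video; it only illustrates one reason why unbalanced data need care. With unequal sample sizes, the factors become entangled, so a comparison of Drug levels that ignores Sex can differ from a comparison that accounts for Sex. All observations and function names are hypothetical.

```python
# Hypothetical, deliberately unbalanced observations.
# Within each Sex, drug exceeds placebo by exactly 2 units.
observations = {
    ("drug", "female"): [12.0, 12.0, 12.0, 12.0],
    ("drug", "male"): [8.0],
    ("placebo", "female"): [10.0],
    ("placebo", "male"): [6.0, 6.0, 6.0, 6.0],
}


def mean(values):
    return sum(values) / len(values)


def pooled_mean(drug):
    """Mean of all observations for a Drug level, ignoring Sex."""
    values = [y for (d, _sex), ys in observations.items() if d == drug for y in ys]
    return mean(values)


def mean_of_cell_means(drug):
    """Mean of the per-Sex cell means, giving each Sex equal weight."""
    return mean([mean(ys) for (d, _sex), ys in observations.items() if d == drug])


pooled_effect = pooled_mean("drug") - pooled_mean("placebo")
adjusted_effect = mean_of_cell_means("drug") - mean_of_cell_means("placebo")

print(round(pooled_effect, 6))
print(round(adjusted_effect, 6))
```

Here the pooled comparison overstates the Drug effect because most ‘drug’ observations come from females and most ‘placebo’ observations come from males. In a balanced dataset the two calculations would agree.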

Multi-Factor GLM: Example pyruvate kinase

This video walks through Dawson et al.’s (2020) analysis of pyruvate kinase in 7 species of birds, which aims to understand how populations adapt to life at high elevation. The data are unbalanced; we find evidence for an interaction and interpret the data in this light.

Please note that this video makes use of the notion of “statistical significance”. The p-value for the interaction equals 2.287*10^-6. The video describes this result as statistically significant, but instead we should interpret this as ‘strong evidence’ for an effect (i.e. for an interaction).

Please note a ‘typo’ in my speaking (a ‘speako’?). When describing the interaction, I said that the result meant that ‘the particular value of pyruvate for a given species will depend on the altitude we’re considering’; I meant to say that the difference between species will depend on the altitude we consider.

This video needs to be updated to discuss the third residual plot (with the square root of standardized residuals on the y-axis). The video skips past this plot very quickly, but if you hit ‘pause’ at the right moment you’ll see that, for the untransformed data, the red line steadily increases from left to right, reflecting the trend in the residual points: this provides even better evidence than the first residual plot that the data violate the assumption of equal variance. Please note that the third residual plot (not discussed in the video, although it should be) looks good for the log-transformed data, indicating that log-transformation has resolved the issue of unequal variance.
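To illustrate why a log-transformation can resolve unequal variance, the sketch below uses two hypothetical groups in which the spread of the data grows in proportion to the mean. These numbers are invented and are not the pyruvate kinase data.

```python
import math
import statistics

# Hypothetical groups: the second is the first multiplied by 10,
# so its mean AND its spread are ten times larger.
low_group = [1.0, 2.0, 4.0]
high_group = [10.0, 20.0, 40.0]

# On the raw scale, the standard deviations differ tenfold.
sd_raw_low = statistics.stdev(low_group)
sd_raw_high = statistics.stdev(high_group)

# On the log scale, multiplying by 10 only shifts every value by log(10),
# so the spread within each group becomes the same.
sd_log_low = statistics.stdev([math.log(y) for y in low_group])
sd_log_high = statistics.stdev([math.log(y) for y in high_group])

print(sd_raw_low, sd_raw_high)
print(sd_log_low, sd_log_high)
```

When variation scales with the mean in this way, we would expect the upward trend in the third residual plot to flatten after log-transformation. Real data will rarely behave this neatly, so the residual plots should still be checked after transforming.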

Transcript - example pyruvate kinase 7 (38.04 KB / TXT)

Article in eLife by Dawson et al (2020) (3.47 MB / PDF)

Multi-Factor GLM: Example sexual conflict

In this video we analyse data from a 2-Factor experiment of sexual conflict. This example illustrates a case where an interaction between Factors does not occur. It also emphasises the interpretation of effect sizes.

Note that this video needs to be updated with respect to interpreting the residual plots. Specifically, the video fails to consider the third residual plot, with the square root of standardized residuals on the y-axis. For this dataset, this third plot suggests the raw (untransformed) data may sufficiently meet the assumption of equal variance, although you will note that the red line could be a bit flatter, and the points are not as centred around the red line as we’d like. However, recall that unequal variance tends to decrease p-values. Because we obtain a large p-value for the interaction, and because the focal hypothesis for this analysis concerns the interaction directly, we can be confident that unequal variance (if it occurs here) will not change our most crucial conclusion concerning the interaction.

This video also needs to be updated with respect to comparing p-values against the arbitrary threshold of p=0.05. Note that the p-value for the interaction equals 0.55, which we should interpret as (at most) weak evidence for an interaction. The p-values for ‘treatment’ and ‘fertility’ equal <2.2*10^-16 and 0.00013, respectively. The former constitutes strong evidence for an effect, and the latter substantial to strong evidence for an effect.

Finally, please note that the original publication from which these data came (doi:10.1098/rspb.2008.0139) analyses survival using ‘proportional hazards’ (a different technique from the one we use here). The authors do not clarify whether their survival analyses accounted for pseudo-replication that could arise from housing three females per vial (although they did account for pseudo-replication in other analyses). We could not account for possible pseudo-replication in our analyses because the available dataset did not indicate which female was housed in which vial. Therefore, we simply pretend that pseudo-replication was not possible in this experiment (as a learning exercise).

Experimental data - sexual conflict (255.76 KB / PPTX)

Transcript - sexual conflict (34.86 KB / TXT)

Article in Proceedings of the Royal Society. Barnes et al (2008) (172.87 KB / BARNES ET AL (2008))