Chapter 14. Comparing averages between more than two groups: 1-factor models


As biologists, we often design experiments to examine differences among a series of groups (‘treatments’) that differ in one systematic way.  For example, consider an experiment to test the effectiveness of a drug that includes three treatments:  ‘drug applied’, ‘placebo applied’ and ‘nothing applied’.  We refer to the overall combination of drug-related treatments as a Factor, and each specific treatment as a level of the factor.  This chapter deals with the design and analysis of such 1-Factor experiments; the analyses are termed 1-Factor General Linear Models (glm).

This chapter begins by asking “Why is it useful to analyse data using a 1-Factor glm?”.  For example, for the experiment described above, what disadvantages arise if we instead analyse the data using a series of t-tests?  We next discuss essential elements of experimental design for standard 1-factor experiments, including power analysis; this content extends the material on experimental design presented in previous chapters.

Note that the central concepts of experimental design discussed in this chapter (which deals with a single factor) also apply to experiments with additional factors (e.g., 2- or 3-factor designs), and therefore apply to later chapters as well.

We provide advice on power analysis for 1-Factor experiments from two perspectives.  First, we illustrate power analysis using the standard software, G*Power (see the chapter ‘Power Analysis’ for G*Power tutorial materials).  While this approach is useful (and is the dominant approach to power analysis), it only indicates an experiment’s power to detect an overall effect, not the power to detect differences between specific treatment levels (i.e., at the level of ‘post-hoc tests’; see below).  Hence, a power analysis using standard software may lead a researcher to design an experiment with high power to detect a factor’s overall effect, but only moderate or low power to detect differences between levels of the factor.
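For readers who prefer to stay within R, the base-R function power.anova.test() performs the same kind of omnibus power calculation for a balanced 1-factor design.  The sketch below is our own illustration (it is not part of the chapter’s G*Power tutorial), and the group means and within-group SD are hypothetical values chosen only to show the mechanics:

# Per-group sample size needed for 80% power to detect an overall effect
# in a balanced 1-factor design with three levels (hypothetical values).
group_means <- c(10, 12, 15)   # anticipated means for the three levels
within_sd   <- 4               # anticipated within-group standard deviation

power.anova.test(groups      = length(group_means),
                 between.var = var(group_means),  # variance among the group means
                 within.var  = within_sd^2,       # within-group (error) variance
                 power       = 0.80,              # desired power for the overall F-test
                 sig.level   = 0.05)              # 'n' is left unspecified, so it is solved for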

Many researchers will find an overall-only power analysis unsatisfying because they specifically aim to understand differences between levels.  Therefore, we illustrate how to use simulations as a more flexible, alternative approach to power analysis for 1-Factor experiments.  For example, we use simulations for power analysis at the level of post-hoc tests, and also to determine an experiment’s power to estimate effect sizes with desired precision.
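As a preview, here is a minimal simulation sketch of this idea, using hypothetical means and a hypothetical within-group SD (our own numbers, chosen only for illustration).  It estimates both the power of the overall F-test and the power of one Tukey-adjusted post-hoc comparison (‘A vs. C’):

# Simulation-based power for a 1-factor design with three levels (A, B, C).
set.seed(1)
n_per_group <- 15                          # candidate sample size per level
means       <- c(A = 10, B = 12, C = 15)   # hypothetical treatment means
sd_within   <- 4                           # hypothetical within-group SD
n_sims      <- 1000                        # number of simulated experiments

group <- factor(rep(names(means), each = n_per_group))

one_sim <- function() {
  y   <- rnorm(length(group), mean = means[as.character(group)], sd = sd_within)
  fit <- aov(y ~ group)
  overall_p <- anova(fit)[["Pr(>F)"]][1]             # p-value of the overall F-test
  a_vs_c_p  <- TukeyHSD(fit)$group["C-A", "p adj"]   # Tukey-adjusted 'A vs. C' p-value
  c(overall = overall_p, a_vs_c = a_vs_c_p)
}

p_vals <- replicate(n_sims, one_sim())
rowMeans(p_vals < 0.05)   # proportion of 'significant' simulations = estimated power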

You will encounter two terms for the analyses in this chapter: 1-factor glm and 1-factor ANOVA (i.e., 1-way ANalysis Of VAriance).  Do these methods differ?  GLMs provide a general framework to analyse many types of experimental design.  It turns out, however, that 1-factor ANOVA represents a ‘special case’ of 1-factor glm.  In other words, 1-factor ANOVA can be thought of as a subset of the general approach of general linear models.  This is because, for normally distributed data, a 1-factor glm works out to be the same as a 1-way ANOVA.
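A quick way to see this equivalence in R (using the built-in PlantGrowth data, which is not an example from this chapter) is to fit the same model with lm() and with aov() and compare the resulting ANOVA tables:

fit_glm   <- lm(weight ~ group, data = PlantGrowth)    # 1-factor general linear model
fit_anova <- aov(weight ~ group, data = PlantGrowth)   # classical 1-way ANOVA

anova(fit_glm)       # ANOVA table from the glm: Sum Sq, Mean Sq, F value, Pr(>F)
summary(fit_anova)   # the identical table from the 1-way ANOVA

The two tables contain exactly the same sums of squares, mean squares, F value and p-value.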

This chapter explains in detail how 1-way ANOVA works.  Given that 1-factor ANOVA is a subset of 1-factor glm, you might wonder why I explain how ANOVA works but not how glm works (not yet, in any case).  The reason is personal and historical:  I learned how ANOVA works before I learned how glms work.  Therefore, in my mind, it seems intuitive to understand ANOVA before glm.  But please remember that other teachers likely have alternative perspectives; one day I will add an explanation of how glms work to this website.

I explain how 1-way ANOVA works for two reasons.  The first is to make students more comfortable with data analysis.  My sense is that many (certainly not all) students experience some apprehension with respect to data analysis.  Therefore, I teach how ANOVA works to make statistics feel more ‘accessible’ to everyone.  Specifically, you will see that we can understand ANOVA in terms of simple mathematics (addition, multiplication, etc.) and simple ideas.

I hope that this explanation will remove the ‘mystery’ behind the test and make students feel at ease with the analysis.  Second, we teach how ANOVA works to help students understand the output from an analysis.  Many statistical software packages present the results of a 1-factor ANOVA as an ‘ANOVA table’ (R can present results from a glm as an ANOVA table, too), which includes obscure terms like ‘sum of squares’ and ‘mean square’.  You will learn what these terms mean when you learn how ANOVA works.

Please note that you do not need to understand how ANOVA works to use the test responsibly:  responsible use of 1-factor ANOVA (glm) comes from understanding aspects of experimental design and the assumptions that underlie the test, not from understanding the mathematics of the approach.  That said, understanding the mathematics (which we address only lightly) can only improve your understanding of data analysis, so we encourage you to pursue this level of understanding.

Analysis of a 1-factor glm often involves two stages.  The first provides a p-value to assess the evidence for an overall effect of the factor on the data (i.e., on the y-variable, or ‘dependent’ variable).  This first stage, however, provides no information regarding differences among levels of the factor, such as effect sizes and p-values for specific comparisons; the ‘post-hoc’ tests of the second stage provide these insights.  By default, our analyses at this second stage will examine all possible comparisons among levels.

For example, if a factor has three levels (A, B and C), our post-hoc test would provide ‘contrasts’ of ‘A vs. B’, ‘A vs. C’ and ‘B vs. C’.  We perform such contrasts (i.e., comparisons) using R’s emmeans library.  Please note that we provide a pdf file that walks through our analysis of the ChickWeight data; it covers the techniques in the accompanying video, and also demonstrates how to perform post-hoc tests without correcting for multiple comparisons.
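As a rough sketch of both stages in R (we assume here that the ChickWeight data are subset to the final weighing, Time == 21, which may differ from how the pdf handles the data):

library(emmeans)

chick21 <- subset(ChickWeight, Time == 21)      # one measurement per chick
fit     <- lm(weight ~ Diet, data = chick21)    # 1-factor glm

anova(fit)                    # stage 1: evidence for an overall effect of Diet

emm <- emmeans(fit, ~ Diet)   # estimated (marginal) mean for each level
pairs(emm)                    # stage 2: all pairwise contrasts (Tukey-adjusted by default)
pairs(emm, adjust = "none")   # the same contrasts without a multiple-comparison correction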

We will illustrate additional approaches for post-hoc tests (‘contrasts’) in the future; the options are as diverse as the experiments you design and the hypotheses you test.  For example, R’s emmeans library includes the option "trt.vs.ctrl", which allows comparisons of several treatments vs. a control group.
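Continuing the sketch above, such treatment-vs-control contrasts look like this (by default the first level of the factor is treated as the control):

contrast(emm, method = "trt.vs.ctrl")   # each treatment level vs. the reference (control) level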

View more details and options at this link to the emmeans library.

Custom contrasts among treatment levels can also test focal hypotheses.  For example, imagine a 1-factor experiment with three levels (say, A, B and C).  A researcher’s hypotheses might require two contrasts:  B vs. C, and A vs. the average of B and C.  We will illustrate such custom contrasts in the future.
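As a preview, the sketch below shows how such custom contrasts can be specified in emmeans, using simulated data with levels A, B and C (the numbers and variable names are hypothetical, chosen only for illustration); each contrast is a vector of weights over the levels, in the order A, B, C:

library(emmeans)

set.seed(2)
dat <- data.frame(treatment = factor(rep(c("A", "B", "C"), each = 10)),
                  y = rnorm(30, mean = rep(c(10, 12, 15), each = 10), sd = 4))

fit_abc <- lm(y ~ treatment, data = dat)
emm_abc <- emmeans(fit_abc, ~ treatment)

contrast(emm_abc,
         list("B vs C"          = c(0,  1,   -1),      # compares B with C
              "A vs mean(B, C)" = c(1, -0.5, -0.5)))   # compares A with the average of B and C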

This chapter does something unusual:  we briefly discuss the entwined history of eugenics and the field of statistics.  We do this because I am aware that students elsewhere have objected to learning about some statistical topics, or some people in the history of statistics, because of their historical connection to eugenics.  Hence, I present a video to clarify my perspective on this history with respect to teaching experimental design and analysis.

The following document provides i) a ‘cheat sheet’ of commands often used in analyses of 1-Factor glms, and ii) instructions on how to plot data from such experiments by simultaneously displaying means and SEs (or 95% CIs) and the individual measurements.

Document
experimental data doc 1 (245.51 KB / PDF)
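As a quick preview of point (ii), here is one common ggplot2 approach to that kind of figure (this is our own sketch, not necessarily the approach used in the document above), showing individual measurements as jittered points with the group mean ± 1 SE overlaid:

library(ggplot2)

# Built-in PlantGrowth data used only as a stand-in for your own 1-factor data.
ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_jitter(width = 0.1, alpha = 0.4) +                # individual measurements
  stat_summary(fun = mean, geom = "point", size = 3) +   # group means
  stat_summary(fun.data = mean_se, geom = "errorbar",    # mean ± 1 SE
               width = 0.15) +
  labs(x = "Treatment level", y = "Response")

To display 95% CIs instead of SEs, mean_cl_normal can be substituted for mean_se (it requires the Hmisc package).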

 

The attached PowerPoint presentation provides questions that review basic concepts from this chapter.  Note that the questions sometimes have more than one correct answer, and sometimes all the options are incorrect!  The point of these questions is to get you to think and to reinforce basic concepts from the videos.  You can find the answers to the questions in the ‘notes’ section beneath each slide.

Download practice problems and answers here.

Here are more practice problems, with answers, that I often provide in a workshop on 1-Factor glm.


Document