Okay, in this video, we're going to talk about how an analysis of variance, or one-factor general linear model, can calculate a p-value. At the end of our previous video, we got to the point where we could go through a series of calculations to determine our test statistic, which we learned is called an F value. What we're going to talk about in this video is how we can determine a p-value for that particular F statistic.
Before we do that, though, we're going to work our way through this slide, because it provides a really nice summary of everything we've talked about in the previous videos on analysis of variance. I'd like to do this mainly to emphasize how much you've learned, and how what you've learned will help you understand the output that you get when you perform a one-factor general linear model.
So when we perform an analysis of variance, what the computer is doing for us is splitting the variance, or the variation, in our data set into two types. It's determining the amount of variation we have both among and within rock types, and those things are summed together for one estimate of variation. And then we're getting an estimate of variation within rock types alone. Okay? What we do then is look at a ratio of those two measures of variation, or variance, and we ask whether or not this ratio is greater than one. We focus on whether it's greater than one because, if there were no among-rock-type variation, we'd be left with a ratio of within-rock-type variation divided by within-rock-type variation, which is one thing divided by itself, which is equal to one. So if this ratio is greater than one, that gives us an indication that there is some variation among our rock types, among our various groups. So that's what we've talked about so far: how to get to this ratio.
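To make that ratio concrete, here is a minimal sketch of the calculation in Python. The growth-rate numbers are invented purely for illustration (they are not the data from these videos), and the function name `f_ratio` is just a label chosen for this example. It uses the sums of squares, degrees of freedom, and mean squares from the previous videos to arrive at the F value.

```python
from statistics import mean

# Hypothetical growth-rate measurements, invented for illustration only.
growth = {
    "schist": [1.0, 2.0, 3.0],
    "granite": [2.0, 3.0, 4.0],
    "sandstone": [4.0, 5.0, 6.0],
}


def f_ratio(groups):
    """Return the pieces of a one-factor ANOVA for a dict of group -> values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    group_means = {name: mean(values) for name, values in groups.items()}

    # Among-group sum of squares: squared distance of each group mean
    # from the grand mean, weighted by the size of that group.
    ss_among = sum(
        len(values) * (group_means[name] - grand_mean) ** 2
        for name, values in groups.items()
    )
    # Within-group sum of squares: squared distance of each observation
    # from the mean of its own group.
    ss_within = sum(
        (x - group_means[name]) ** 2
        for name, values in groups.items()
        for x in values
    )

    df_among = len(groups) - 1                 # number of treatments minus one
    df_within = len(all_values) - len(groups)  # n minus number of treatments

    ms_among = ss_among / df_among
    ms_within = ss_within / df_within

    return {
        "ss_among": ss_among,
        "ss_within": ss_within,
        "df_among": df_among,
        "df_within": df_within,
        "ms_among": ms_among,
        "ms_within": ms_within,
        "f": ms_among / ms_within,  # the ratio we ask about: is it above one?
    }


result = f_ratio(growth)
for label, value in result.items():
    print(f"{label:>10}: {value:.3f}")
```

With these made-up numbers the group means are well separated relative to the spread inside each group, so the ratio comes out well above one.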
What I'd like to do now is connect everything we've talked about to the output that you will get in many different software packages that do one-factor general linear models for you. For most statistical packages, the output for an analysis like this looks like this. This is called an ANOVA table. What you can see is that the first row is just the labels of what's in our various columns. The next two rows are just dividing our information up into whether we're looking at among-rock-type effects or within-rock-type effects, or among-group versus within-group. And what you can see here is that we have a number of bits of information, all tabulated. We have the sum of squares, the degrees of freedom, the mean square, and we have our F value. Okay?
Two videos ago, we talked about how you could calculate sums of squares separately for the among-rock-type treatment and for the within-rock-type treatment. Those values that we calculated would go in these two spots.
One video ago, we talked about how we can calculate degrees of freedom for our among-rock-type variation and our within-rock-type variation. That's what we have here. Our degrees of freedom for among rock types is just equal to our number of treatments minus one. In this case we had three treatments, and so this degrees of freedom is equal to two. We also talked about how the within-rock-type variation will have n minus three degrees of freedom, where n is our total sample size. And three just happens to be three because we had three different treatments. If we instead had four different types of rock in our experiment, so four treatments, we would say this is equal to n minus four, and up here this would be four minus one.
Also in the previous video, we talked about how to calculate the mean square, which is where we just take the sum of squares that we have and divide it by the degrees of freedom.
That gives us our first mean square, which is our mean square among groups. And to calculate our mean square within groups, we take our sum of squares within groups and divide that by the degrees of freedom within groups to give us that mean square. Finally, these two values of mean square are our two estimates of variance. If we take one and divide it by the other, so we take the among-group variance and divide that by the within-group variance, which is what we have listed here, this gives us our value F. That summarizes everything we've talked about so far. So now, when you look at the output in an ANOVA table, you have the knowledge that's required to understand where every single one of these numbers came from. Nice. Okay?
So the last thing we need to understand is where our p-value will come from, because that's the last term that we will get in an ANOVA table. So where does the p-value come from? Well, to calculate a p-value, I'll just remind you that what we do, or what the computer does for us, is take our test statistic and compare it against an appropriate null distribution. The shape of a null distribution for analysis of variance depends on two things. It depends on the degrees of freedom among groups and the degrees of freedom within groups. In other words, these two values, these two degrees of freedom for our among-treatment and within-treatment variance, together determine the shape of the null distribution that our F value is compared against.
So here's an example. This is just one example of an infinite number of F distributions. An F distribution is just the name of this null distribution. And this particular shape was determined by those two values of our degrees of freedom. Okay? So that's where this particular null distribution came from. Suppose we'd run a different experiment that had different degrees of freedom.
So let's imagine that we had four treatments instead of three. That null distribution would have a different shape from this one, assuming that I've taken this one from a case where we have three treatments, and so two degrees of freedom for our among-group variance. Okay? The point is, just to reiterate, that the shape of this null distribution is determined by the combination of our degrees of freedom for within versus among variance.
So how do we calculate our p-value? Well, we just take our F statistic. Let's imagine that our F statistic happened to be equal to 3.75, which would lie somewhere about there. What the computer does for us is make this comparison and calculate the area under the curve, the area under the null distribution, that's more extreme than this test statistic. That gives us our p-value, and that will appear there. With that knowledge in hand, you now have the understanding that you need to be able to interpret every single component of this output. You now understand the basic mechanics of every single calculation that goes into every number in this output. Nice job. Okay?
So at this stage, what we would do is use our p-value to make a decision. Okay? And what I'm going to describe now as a next step is the standard convention, which has existed for a very long time. We say that our null hypothesis is that rock type does not affect growth rate. Another way of expressing this is to say that our null hypothesis is that the mean growth rate on schist is equal to the mean growth rate on granite and is equal to the mean growth rate on sandstone, I believe it was. Okay, so there are two different ways of formulating this null hypothesis for analysis of variance. One way is to simply say that rock type, or our treatment effect, does not influence growth rate. Another way is to state our null hypothesis in terms of the mean of whatever it is we're measuring.
So the mean growth rate, in this case, is the same for each of our three treatments, okay? By convention, we say that if the p-value that we got there is less than 0.05, then we reject our null hypothesis. We say that we have sufficient evidence to conclude that there are differences in growth rate among our rock types. In other words, we say that rock type has a statistically significant effect.
In other videos, you will see that that type of thinking is problematic. It's a type of thinking that has existed for a very long time, and the field is now shifting away from it. I still keep this type of thinking available within my teaching because a transition to a new type of thinking can't happen just like that. For one thing, we're all going to be reading papers that were produced some years ago, and those previously published papers are still going to have this old way of thinking, where we simply decide whether or not something is statistically significant. So for that reason, I do continue to teach this convention, because I want all of my students to be familiar with it.
What I'm trying to say at this point in the video is that, by convention, what we would normally do is say that if the p-value that we got there is less than 0.05, we have a statistically significant effect of rock type, okay? And we would reject our null hypothesis. In a little bit, though, you're going to see how we can expand upon that fairly limited perspective on the analysis, which I'll get to in just a moment. But at this point, what I want to say is that if our p-value is less than 0.05, by convention, we would say that we reject the null hypothesis, and we would say that we have sufficient evidence that rock type does seem to influence growth rate.
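As a rough sketch of that last step, the snippet below computes the area under the F null distribution beyond an observed F value and then applies the 0.05 convention. It relies on one simplification that should be stated loudly: with three treatments the among-group degrees of freedom are two, and for that special case the upper-tail area has a simple closed form, so no statistics library is needed. The within-group degrees of freedom used here (27 and 6) are hypothetical, since the videos don't give a sample size, and the function name is made up for this example. For other numbers of treatments you would normally lean on a library routine such as SciPy's `scipy.stats.f.sf`.

```python
ALPHA = 0.05  # the conventional threshold described in the video


def p_value_three_treatments(f_value, df_within):
    """Upper-tail area of the F null distribution when df_among is 2.

    For 2 numerator degrees of freedom, the area beyond f_value equals
    (1 + 2 * F / df_within) ** (-df_within / 2). This shortcut does NOT
    hold for other among-group degrees of freedom.
    """
    if f_value < 0:
        raise ValueError("an F value cannot be negative")
    if df_within <= 0:
        raise ValueError("df_within must be positive")
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


f_observed = 3.75  # the example F value used in the video

# Hypothetical within-group degrees of freedom: n = 30 and n = 9
# observations spread over three treatments (n minus three).
for df_within in (27, 6):
    p = p_value_three_treatments(f_observed, df_within)
    decision = "reject" if p < ALPHA else "do not reject"
    print(f"F = {f_observed}, df = (2, {df_within}): "
          f"p = {p:.4f} -> {decision} the null hypothesis")
```

The same F value leads to a different p-value for each pair of degrees of freedom, which is the practical consequence of the null distribution changing shape.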
That doesn't tell us, however, which rock types differ from which. For example, it could be that the growth rates on schist and granite are completely similar, with no difference between them, whereas sandstone might strongly influence growth rate. We can't gain that type of insight from the p-value that we get here. Okay? In order to get that type of insight, insight into which treatments differ from which, we need to go on to a next stage of analysis.
So once we've got this p-value and we've got this table, we've basically finished the first stage of our analysis. Our second stage will involve moving on to what are called post hoc tests, which we use to better understand the differences between our groups, or between our treatments. So for example, we can get p-values to give us a sense of our certainty about differences between the means of our various groups or treatments. But also, really importantly, we can look at effect sizes. And it's really this focus on effect sizes that I think, and many people think, provides much more insight into our data than simply using this dichotomous decision of significant or not. Okay?
But that perspective on effect sizes really isn't the main point that I want to be making here. The main point that I want to make is that once we have this p-value here, we've finished the first stage of the analysis. This p-value, by convention, tells us whether or not we want to reject the null hypothesis. What we'll typically go on to do is perform post hoc tests in the second stage, to understand effect sizes and to get further p-values for specific comparisons among our groups. I am not going to create a video that focuses specifically on these post hoc tests. The way in which I'm going to teach them is by working through a specific example, or several examples, of this type of analysis.
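To give a feel for what looking at effect sizes can mean in the simplest sense, here is a small sketch that lists the difference in mean growth rate for every pair of rock types. This is only an assumption about the simplest possible version: it is not one of the named post hoc tests, it produces no p-values, and the data and function name are invented for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical growth-rate measurements, invented for illustration only.
growth = {
    "schist": [1.0, 2.0, 3.0],
    "granite": [2.0, 3.0, 4.0],
    "sandstone": [4.0, 5.0, 6.0],
}


def pairwise_mean_differences(groups):
    """Return {(group_a, group_b): mean(group_b) - mean(group_a)} for all pairs."""
    means = {name: mean(values) for name, values in groups.items()}
    return {
        (a, b): means[b] - means[a]
        for a, b in combinations(means, 2)
    }


for (a, b), diff in pairwise_mean_differences(growth).items():
    print(f"{b} minus {a}: {diff:+.2f}")
```

Raw differences like these are in the units of the measurement itself, which is a large part of why they can be more informative than a yes-or-no verdict on significance.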
So if you're looking for a specific video on post hoc tests, at the moment I haven't created one. I'm expecting that most of what you'll learn, you'll learn on the fly, through our analyses of data. Okay.
So we're going to stop this video there, and just say that through our last series of videos, you've learned the guts of how analysis of variance, or one-factor general linear models, work. What we're going to go on to now in future videos is to talk about the assumptions that are associated with these tests, and how we can test those assumptions. And then we'll actually go and do some of these analyses, walking through them together to show exactly how we can put these principles into practice. I hope this video has been helpful, and I'll say thank you very much.