Okay. In our previous video, we considered a scenario where a biologist might start collecting some data for an experiment, stop to analyze those data, become aware of what the p-value is, and then, after learning what that p-value is, continue to collect more data. I mentioned in that previous video that this kind of practice tends to increase the rate of false positives; it increases the type I error rate. What we're going to do in this video is talk about why that is, and I'm going to use some simulations to demonstrate why the false positive rate increases under that practice. I'll give you a heads-up before we get deep into this video: the explanation for why this practice increases the rate of false positives appears at the very end of the video, and most of the video is going to involve my explaining how the simulations work. I should say that learning how the simulations work is not the point of this video. The point of this video is to understand why this practice of collecting data, analyzing it, and then continuing to collect more can increase the rate of false positives, and that's going to come at the end. So why am I going to spend time explaining how the simulations work? Basically, it comes from my own personal bias and my own experience. I really don't like it when someone provides a figure, whether produced from a simulation, from mathematical theory, or from an experiment, without explaining how that figure was generated. So the reason we're going to walk through this simulation in the first place is just to remove the magic from the figure that you're going to see at the end, because that's what we're going to produce: the simulation is going to produce a figure that illustrates a particular scenario, and we're going to discuss that figure in order to understand the issues. If you want to skip the first part, where I explain how to perform the simulations, you can do that really easily: just keep skipping ahead until you see a figure appear in the bottom right-hand panel of the screen. Okay? So what are we going to simulate? We are going to simulate a process where an imaginary scientist, an imaginary biologist, collects some data, analyzes those data, and keeps the p-value; in fact, we're going to store it somewhere so we can look at it again later. Then they're going to continue to collect more data, add those new data to the original data, and analyze that whole larger dataset again. They'll see the p-value, and we'll store that p-value and keep it again so we can look at it later on. The scientist is going to continue by collecting more data and adding those new data to the data that were collected previously. So now we have an even larger dataset, and we'll analyze those data, collect the p-value, et cetera. And we're just going to continue that process for 100 steps of collect, analyze, repeat. Okay, so we're going to repeat that process 100 times, and at the end we're going to produce a figure that will show us how the p-values have changed through time throughout that process of collecting data, analyzing, collecting more data, analyzing. Okay, so we're going to simulate this process of collecting data and analyzing it 100 times.
Some people might call this one simulation with 100 different steps; that's just semantics. We're going to start by creating a counter, a variable that will allow us to specify the number of times we want the scientist to collect data and analyze it. I'm calling that nSims and setting it equal to 100. The next thing we're going to need, since we're going to be collecting data so many times, is a place to store those data. So here, these two lines of code create vectors, which are called treatment_a and treatment_b. This notation on the right, where we have c() with nothing within it, allows us to create a vector that has nothing in it. We're basically just creating the name of a vector where the vector is empty. So if we run this, I can show you what these look like: treatment_a, you can see, is empty, and treatment_b is empty, and you'll see later on how we can fill them up. So we've now created a place to store the data that will accumulate over time. We also need a place to store the p-values that we're going to accumulate from our many analyses, so I'm creating an array here, which I've called pvals, and this is going to store the p-values from our many analyses. When creating the array, I've set dim = nSims, which was 100. One way of thinking about this, in very cartoonish terms, is to imagine a large egg carton. A typical egg carton might have just six places to hold eggs, or maybe twelve. What I want you to imagine is that this array is like an egg carton which is empty: it's a device with places to hold lots of individual observations, where each of those placeholders is empty when we first create it. Okay? So we're creating an array that has space to hold 100 bits of information. We'll run this and I'll show you what it looks like: you can see we have 100 spaces in our array, and they're currently all populated with the value NA, which means missing. Okay. Now what we're going to do is start to collect our data. I'm going to skip down a little bit; I'm going to skip this line for now, and we'll come back to explain what it does in a little bit. What I'm going to focus on now is the part of this next line where I'm using the function rnorm. If you've seen our previous videos on what standard errors are, then you'll have seen the rnorm function in action in that context. The rnorm function allows us to select numbers at random (that's the "r") from a normal distribution (the "norm"), and rnorm takes three arguments. The first argument here, where we have a five, specifies the number of random numbers that we want to draw from a normal distribution. The next two arguments specify the qualities of the normal distribution that the numbers are being drawn from. The middle value, where we have 0, specifies the mean of our normal distribution, so our normal distribution will have a mean of 0. And the last number specifies the standard deviation of the normal distribution; in other words, in very general terms, it specifies how wide the normal distribution will be. I have this bit highlighted now, so if I just hit run, you can see what this function returns.
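If it helps to see it all in one place, here's a minimal sketch of the setup described so far. It's a reconstruction from the narration, so the exact names (nSims, treatment_a, treatment_b, pvals) are my guesses at what's on screen rather than verbatim code.

```r
# Number of collect-analyze steps the scientist will perform
nSims <- 100

# Empty vectors that will accumulate the data for each treatment
treatment_a <- c()
treatment_b <- c()

# The "egg carton": 100 slots for p-values, all NA (missing) for now
pvals <- array(dim = nSims)

# Draw five random numbers from a normal distribution
# with mean 0 and standard deviation 10
rnorm(5, 0, 10)
```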
So we now have five random numbers. If I run it again, we get five more, and more, and more; every time, we get five new random numbers drawn from a normal distribution with a mean of 0 and a standard deviation of 10. So that's where we're going to get our data from. We're going to imagine that when the scientist is collecting data, they are collecting data that are normally distributed, with the particular mean and the particular standard deviation we've specified here. Okay, so what else do we have in this line of code? You'll see that right beside rnorm, I've listed the vector that we created, which at the moment is empty; this is where we said we're going to keep our data for treatment A. So we have that vector and rnorm sitting side by side, and you'll notice that these two parts, the vector and rnorm, are both within the brackets of the function append. What's happening here? What's happening is that the append function will take the numbers that were produced by rnorm and append them, add them, to the vector that we've listed first, treatment_a, and that output is going to be saved to a vector with the exact same name we started with. So overall, what we're doing in this line of code is taking our original vector, which at the moment is empty, drawing five random numbers, adding those random numbers to the end of the vector, and then saving that output in an object with the same name as the original vector. In essence, we're just taking our five new data points and adding them onto the original vector; that's one way of thinking about this. So let's see what this looks like. We'll run that, then we'll type treatment_a, and you can see that treatment_a is no longer empty: it now has five random numbers within it. For this line down here, we're going to do the exact same thing, except for treatment B. There's something I really want to highlight, however. Have a look at the way I've specified the rnorm function for treatment B, and compare that to what we have for treatment A. What I want you to notice is that these values are specified identically. In both cases we're drawing five random numbers, and, especially important, the distributions that we're drawing those numbers from are identical for treatment A and treatment B: both distributions have a mean of 0 and a standard deviation of 10. What this means is that we're creating a situation where we know for certain that the null hypothesis is true. In other words, we're creating a situation where we know for certain that there truly is no difference between the data for treatment A and the data for treatment B, or at least between the populations from which they came. What that means is that if we get a small p-value, a p-value that might lead us to think we have evidence to reject the null hypothesis, doing so would be incorrect: we would be reaching a false conclusion, and we would have a false positive. Okay? In other words, because we've set the distributions to be identical for treatment A and treatment B, and we've made sure that there is truly no difference between treatment A and treatment B, any time we get a small p-value in our simulations, that is a false positive. Okay, so that's a really important feature that I want you to remember.
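Here's what those two data-collection lines look like in my reconstruction. The key feature is visible right in the code: both treatments draw from the same distribution, so the null hypothesis is true by construction.

```r
# Add five new observations to each treatment. Both treatments are
# drawn from the SAME normal distribution (mean 0, sd 10), so there is
# truly no difference between the populations: any small p-value we
# obtain later is, by construction, a false positive.
treatment_a <- append(treatment_a, rnorm(5, 0, 10))
treatment_b <- append(treatment_b, rnorm(5, 0, 10))
```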
I'll say it again later, however. So let's just run this line of code and see what we get. Looking at treatment_b, you can see it now has its own five numbers. What we'd like to do now is compare the data between treatment A and treatment B, and that happens in the next line of code, where I'm running a t-test. I've written t.test and then just listed our two vectors. If you've watched the previous videos on how to perform t-tests, you'll have seen in those videos that I provided a slightly different way of running a t-test. Specifically, in those previous videos I used a formula, where I specified the column that contains the data on the left, then a tilde, that little squiggly line, and then the name of another column which indicated which group each data point came from: did it come from treatment A or treatment B? What I just want to point out here is that there's more than one way of submitting data to a t-test, so if this looks unfamiliar to you, that may just be because you were unaware of that. I'm using this approach in this video because it's a little bit simpler given the kind of code we're producing. So let's run our t-test; that was our goal, and I got slightly off track there for a moment. Here is our t-test, and you can see that R has run a Welch's two-sample t-test, which is perfect for our purposes, and you can see here we have our p-value, which is equal to 0.8017. What we'd like to do is pull that p-value out of this overall output, and we can do that by adding a dollar sign to the output, which allows us to specify one piece of the output that we'd like to extract. The piece of output that we want to obtain is the p-value, which is specified by p.value. So if I just highlight this code, you'll see that now we can pull out only the p-value, which is pretty cool. We want to store that p-value somewhere; specifically, we want to store it in the array that we've called pvals, and since this is the first p-value that we've created, we want to store it in the first location. You can see that I've added square brackets after the array name; with these square brackets, we can go into the guts of our array. And you can see we've got this index, i, which indicates the position this p-value will go to within the array. We want it to go in the first position, so I'm just going to set i equal to 1 and then submit this line of code. Now we can look at pvals, and you can see we still have lots of empty spaces that are waiting to be filled, but our first p-value is in there. Okay? Now what we want to do is repeat this process. Actually, before we do that, I want to remind you of what treatment_a looks like: there are the original five data points in treatment_a. If we run the collection code again and look at treatment_a, you can see that the original five data points are still there; all we've done is add five new ones, one, two, three, four, five. We can do the same thing for treatment B, though we won't bother looking at those. And now we can run a t-test on those larger datasets, and we're going to store that p-value, but this time we want to store it in the second position of our array. So I'm going to set i equal to 2, and then we'll run this code again.
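For reference, the analyze-and-store step we've now run twice (first with i set to 1, then to 2) looks like this in my reconstruction; for the moment the index i is set by hand, and the for loop described below will take over that job.

```r
i <- 2  # position in pvals where this p-value will be stored

# Welch's two-sample t-test on the accumulated data;
# $p.value extracts just the p-value from the test output
pvals[i] <- t.test(treatment_a, treatment_b)$p.value
```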
And then we'll look at pvals, and you can see now we have our second p-value stored. So that's pretty straightforward, except for the fact that it would be a complete pain in the butt if I had to keep doing all this by hand. What we'd like is a way to automate this process, and to do that, I've decided to use a for loop. A for loop has two basic parts. I've highlighted the first part here, which is where we essentially create a counter, which I'll explain in a moment. The second part involves curly brackets: we open the curly brackets there and close them down there, and the second part of this for loop is just basically all the code we've already looked at. What a for loop does is simply repeat: it will continue to run the code that's within the curly brackets as many times as is specified by the counter that we have at the beginning. Okay? So I'm just going to explain how this counter works. We've created a variable, i; this is a new variable that we've created. With this code, we're setting i to be equal to a series of integers, where i is first going to equal 1. Okay? So first, when i is equal to 1, this code will run with i being equal to 1. Then i will count up to the next integer and will be equal to 2, and the code will run with i being equal to 2; then it will count up to 3, and the code will run with i being equal to 3, and so on. This process will continue all the way up until i is equal to nSims, which is 100 in our case, if you remember, and when i is equal to nSims, that's the last time this code will run. So by setting up the for loop in this way, we've been able to run the code within the curly brackets 100 times. There's also a really neat little feature here. Remember how I kept having to reset the value of i when I was running this by hand? You'll notice that since i keeps ticking through different values, by including i as part of this for loop, each time this code runs, our p-value is going to be put into the appropriate location in our array. This is a very common trick, and it's very useful. So that's basically how the simulation works.
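Putting the pieces together, the whole simulation as described boils down to something like this sketch (again, my reconstruction of the on-screen code, with my guessed variable names):

```r
nSims <- 100
treatment_a <- c()
treatment_b <- c()
pvals <- array(dim = nSims)

for (i in 1:nSims) {
  # Collect five more observations per treatment; both treatments
  # come from the same distribution, so the null hypothesis is true
  treatment_a <- append(treatment_a, rnorm(5, 0, 10))
  treatment_b <- append(treatment_b, rnorm(5, 0, 10))

  # Re-analyze all the data collected so far and store the p-value
  # in slot i; i ticks up automatically on each pass through the loop
  pvals[i] <- t.test(treatment_a, treatment_b)$p.value
}
```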
What I'm going to show you now is how we're going to plot our data, so we can actually reach the conclusions that we so desperately want. Okay? I'm just going to plot the data using the plot function. What I have on the left of the squiggly line, what's going to be plotted on the y-axis, is just the p-values that we've been storing in our array pvals. And remember, these p-values have been stored in sequential order, and what we want to do is plot these p-values in sequential order. To do that, we're just going to specify the x-axis as being a sequence of numbers from 1 to 100; that's what this bit of code, seq with 1 colon nSims, does. To show you how that works, if I just highlight that, you can see we get a sequence of 100 numbers. So when we run plot, it is going to plot our 100 p-values, where each p-value is paired with the sequence number at which that particular p-value was generated. The rest of this code is just things to make the plot look a bit nicer. I've set type = "b", which just means that the plot function will plot both; that's what the "b" stands for: it will plot both dots and the lines that join the dots. And I've also set the y-axis to range between 0 and 1, because that's the full range of possible values for a p-value. The last line of our code here uses the function abline, which just allows us to plot a straight line on a figure, and abline really only has to take two arguments: one is the y-intercept and the other is the slope. I've specified the y-intercept as 0.05 and the slope as 0. Since the slope is 0, that means we're going to plot a horizontal line that will have a height of 0.05. That'll be a handy thing for us, because by keeping that line there, we'll be able to tell when the p-values that result from our simulations are below or above a value of 0.05. The last option here is just to make this line look a little bit nicer: I've set the line type, or lty, equal to 3, which will produce a dotted line. And that's it.
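Here's my reconstruction of those two plotting lines; they assume the pvals array and nSims from the simulation loop above.

```r
# Plot the stored p-values (y-axis) against the order in which they
# were generated (x-axis); type = "b" draws both points and joining
# lines, and ylim spans the full range of possible p-values
plot(pvals ~ seq(1, nSims), type = "b", ylim = c(0, 1))

# Dotted horizontal reference line at the traditional 0.05 threshold
abline(a = 0.05, b = 0, lty = 3)
```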
So we're now finally at the point where we can actually reach some conclusions, which was really the point of this video, not explaining how this code works; hopefully you've picked up a trick or two along the way. So let's run this code now and see what we get. Okay. I just want to orient you to this plot to begin with, so I'll basically just be repeating what I've previously said. Along the y-axis we have our series of p-values, okay? And along the x-axis we basically have the numbers from 1 to 100, which indicate the order in which these p-values were generated. So this was the first p-value generated, which was maybe 0.75, something like that. The next p-value was around 0.4, and the next one looks like it was just above 0.2, and then the p-values started getting higher again, and lower, and higher. Okay? So this figure really just summarizes the story of our p-values through the process of the scientist collecting data, analyzing it, collecting more data, analyzing it, collecting more data, analyzing it. Okay. What I'd really like you to notice here, and this is perhaps the most important point to take from this video, because appreciating this point allows you to appreciate the other points I'm going to make later on, is just how much the p-values randomly wander from high values to low values and back up to high values when the null hypothesis is true. So when the null hypothesis is true, we fully expect that p-values will just wander all around, from low values to high values, completely at random. Okay, that's an essential thing to understand and appreciate. Okay? I'm just going to run this a couple more times until we get to a particular scenario that I'd like to talk about. Okay? Here we go. Here's a scenario that we can discuss, without my trying to say too much. Let's imagine that we are the kind of scientist we've been talking about all along: a scientist who has experienced the scenario where we have collected some data, analyzed our data, and then decided to continue collecting data. When we started this video, I told you that our goal was to understand why that practice tends to increase false positive rates. Okay, so here's a situation that I want to use to illustrate that process. Let's imagine ourselves as that scientist, where we collected data up until maybe time point 40. Okay. So we collected data up until this point, and we examined our data there, and you can see that at that point our p-value is relatively large. At this point, with a p-value of maybe 0.15, something like that, this would not be strong evidence to reject our null hypothesis. In some cases, a scientist might decide to continue collecting data, because they might think that they have not gotten a significant p-value yet simply because their experiment doesn't have enough power, and so they decide to collect more data to increase the power of their experiment. Okay? So the scientist has collected data up to this point, checked the p-value, found that they did not have strong evidence to reject the null hypothesis, and decided to collect more data. So they collect more data, maybe up until this point here, where they analyze their data again, and this time they find the p-value is less than 0.05. Based on this, they decide that they now have sufficient evidence to reject the null hypothesis. But we know for a fact that in this scenario, a scientist who concludes at this point that they can reject the null hypothesis has reached a false positive. We know that's true because we set up our data in such a way that we know there's no difference between treatment A and treatment B. So really, what's the underlying cause of this kind of mistake? The underlying feature that's so important here is the fact that the scientist has given themselves multiple opportunities to analyze their data. I've mentioned in previous videos that the more times you analyze your data, the more times you give yourself an opportunity to identify a false positive. Every time you run a statistical test, you open yourself to the possibility of finding a false positive. If you run your test just once, that gives you a 5% chance of reaching a false positive, if you're making your decisions based on the traditional threshold of 0.05; we'll assume for the sake of this video that that's what you're doing. If you run another test, then you've given yourself another chance of obtaining a false positive, and if you run it again, you give yourself yet another chance of obtaining a false positive. So the more times you analyze your data, the more opportunities there are to obtain false positives. In this scenario, we've only talked about the scientist analyzing their data twice, once at time 40 and once again later, but still, twice is more than once. When you analyze your data multiple times, that tends to increase the number of opportunities for you to get a false positive, and so that's going to increase the rate of false positives that get reported in the literature. So that is the real reason why this practice tends to increase the false positive rate.
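To put a rough number on that, here's a small follow-up sketch of my own; it isn't code from the video. It repeats the entire collect-analyze process many times over and asks how often a scientist who checks the p-value after every batch would see p < 0.05 at least once, even though the null hypothesis is always true.

```r
set.seed(1)    # for reproducibility
nRuns <- 1000  # number of complete simulated experiments

false_positive <- replicate(nRuns, {
  a <- c()
  b <- c()
  any_sig <- FALSE
  for (i in 1:100) {
    a <- append(a, rnorm(5, 0, 10))
    b <- append(b, rnorm(5, 0, 10))
    # Did this look at the data cross the 0.05 threshold?
    if (t.test(a, b)$p.value < 0.05) any_sig <- TRUE
  }
  any_sig
})

# Proportion of experiments with at least one "significant" result;
# this comes out far above the nominal 5%
mean(false_positive)
```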
What I'd like to do now is think about a kind of reverse scenario. Let's imagine that a scientist had originally planned to collect data all the way up until time 100. Okay? Actually, I'm going to find another scenario to illustrate this; I'll just keep pressing this until we get a scenario that matches the case I want to talk about. All right, here we go; here's a case we can talk about. Let's imagine that you are a scientist who had planned to collect data all the way up until this point. Okay. Partway through your experiment, you got curious. Let's say that around time 18 here, you had been collecting data for a while and you wanted to get some sense of what your data looked like, so you decided to analyze your data here. And lo and behold, you got a low p-value, which you felt gave you good evidence to reject the null hypothesis at this point. As a scientist, you might incorrectly convince yourself that you probably got this small p-value because the effect that you observed in your experiment is really strong, and so you don't actually need as much data as you originally thought in order to reject the null hypothesis. Okay? That might be the kind of thinking that would lead someone to believe they have a reliable result this early on in their experiment. So what a scientist might do in that case, because they obtained a low p-value early on, is say, okay, I've got enough data now; I don't need to continue running my experiment. They just stop their experiment, and they end up concluding that they can reject the null hypothesis. Notice, however, that if they had continued collecting data all the way until the time they had originally planned, they would have reached a different conclusion. So what's the problem here? The problem is that when the scientist gave themselves an additional opportunity to look at their data, by deciding to have a sneak peek early on, they gave themselves more than one opportunity to analyze their data. And when they gave themselves more opportunities to analyze their data, they increased the probability of getting a false positive. Okay? So now we have these two complementary scenarios. One is where you analyze your data, check a p-value, and then continue to collect data; that scenario can increase the rate of false positives. And the reverse scenario can as well, where you decide to check your p-value earlier than you'd originally planned, find a small p-value, and so decide to cut off your experiment at that point. The thing that both of those situations have in common is that in both cases, the scientist gives themselves the opportunity to analyze their data more than once, and when they do that, they increase the opportunity to obtain false positives. I'm going to end the video there. It's been fairly long, apologies for that, but hopefully this has been a useful video. And on that note, I'll say thank you very much.