In this post I am going to show you how to test a hypothesis against the null hypothesis in R, that is, how scientists can act like skeptics when analysing data.

We will perform a hypothesis test using R, and we will also learn what a p-value is.
The built-in R data set ToothGrowth (remember to load tidyverse) has a total of 60 observations (rows) and 3 variables (columns): len (tooth length), supp (supplement type) and dose (vitamin C dose).
The outputs below show dose as a factor, so they assume it was converted beforehand, for example with ToothGrowth$dose <- factor(ToothGrowth$dose).
These data come from a study that I explained in more detail in this post.
In general, it shows the effect of vitamin C on the odontoblasts (the cells responsible for tooth growth) of guinea pigs.
ggplot(ToothGrowth) +
  # factor(dose) gives one line per dose, even if dose is stored as a number
  geom_freqpoly(aes(x = len, color = factor(dose)), binwidth = 1) +
  theme_classic()

It seems like the vitamin dose given to the guinea pigs has a clear influence on the size of their odontoblasts, but we need to be skeptical and gather some evidence for this claim before taking the hypothesis seriously.
To start understanding the relation between vitamin dose and tooth growth, we can compare the group means using summarise() from the dplyr package.
ToothGrowth %>%
  group_by(dose) %>%
  summarise(mean = mean(len))
# A tibble: 3 × 2
dose mean
<fct> <dbl>
1 0.5 10.6
2 1 19.7
3 2 26.1
It seems like the mean differs for each dose of vitamin applied. Let’s also look at the differences between the means:
observable_diff1 <- 19.735 - 10.605
observable_diff2 <- 26.100 - 19.735
observable_diff3 <- 26.100 - 10.605
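The three differences above rely on means typed in by hand. As a minimal sketch, the same numbers can be computed directly from the data with base R, so they stay correct if the data change. The names dose_means, diff_1_vs_05, diff_2_vs_1 and diff_2_vs_05 are hypothetical and not part of the original analysis.

```r
# Mean tooth length per dose, computed from the data (base R only)
dose_means <- sapply(split(ToothGrowth$len, ToothGrowth$dose), mean)

# The same three differences, without typing the means by hand
# (hypothetical names, chosen only for this sketch)
diff_1_vs_05 <- dose_means[["1"]] - dose_means[["0.5"]]
diff_2_vs_1  <- dose_means[["2"]] - dose_means[["1"]]
diff_2_vs_05 <- dose_means[["2"]] - dose_means[["0.5"]]

print(c(diff_1_vs_05, diff_2_vs_1, diff_2_vs_05))
```

The group names "0.5", "1" and "2" come from the dose values themselves, so the lookup works whether dose is stored as a number or as a factor.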
So far we have already answered some questions, like:
# Another way of obtaining the variation per dose
dose_05 <- ToothGrowth %>%
  filter(dose == 0.5)
dose_1 <- ToothGrowth %>%
  filter(dose == 1)
dose_2 <- ToothGrowth %>%
  filter(dose == 2)
We can check whether this information is TRUE or FALSE with the following code:
# observable_diff1 holds the difference between the 0.5 and 1 doses
# all.equal() compares with a small tolerance, which is safer than == for decimals
isTRUE(all.equal(mean(dose_1$len) - mean(dose_05$len), observable_diff1))
[1] TRUE
ToothGrowth %>%
  group_by(dose) %>%
  summarise(min = min(len), max = max(len))
# A tibble: 3 × 3
dose min max
<fct> <dbl> <dbl>
1 0.5 4.2 21.5
2 1 13.6 27.3
3 2 18.5 33.9
So this is it: it really seems that the dose of vitamin C has a direct effect on tooth growth. Well, actually, not so fast. If we repeated the experiment with different groups of guinea pigs, we would probably obtain a different answer from the one we obtained previously.
For this reason, these variables are known as random variables: they can take several different values.
That means that every time we repeat the experiment, we will get different means for the groups we have access to (dose), and so the variable len is considered random.
Let’s get a better understanding of random variables.
Let’s see how the means change when we collect different samples from our data set:
control <- ToothGrowth$len %>% sample(12)
We can do this repeatedly to build a random distribution based on our main data. We do it this way mainly because we don’t have the means to repeat the real experiment, so we model what the population would look like without doses of vitamin C being applied.
Notice how the mean changes with each sample. Repeat the sampling command in your own R session using sample().
mean_control <- ToothGrowth$len %>% sample(12) %>% mean()
print(mean_control)
mean_control <- ToothGrowth$len %>% sample(12) %>% mean()
print(mean_control)
[1] 15.40833
[1] 16.31667
Notice how the mean varies in each measurement. We can do this many times to learn even more about the distribution of this random variable.
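To make "many times" concrete, here is a small sketch that repeats the sampling 1000 times and summarises the resulting means. It uses base R's replicate() instead of the pipe; the seed and the name sample_means are assumptions added only so the sketch is reproducible.

```r
set.seed(1)  # hypothetical seed, only for reproducibility

# Draw 1000 samples of 12 tooth lengths and keep the mean of each sample
sample_means <- replicate(1000, mean(sample(ToothGrowth$len, 12)))

# The sample means scatter around the overall mean of len
summary(sample_means)
mean(ToothGrowth$len)
```

Calling hist(sample_means) afterwards shows the shape of this distribution.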
As said before, we have to be skeptical. For that, we will assume that the observed difference might not mean much: it could be something that happened by chance.
To gather evidence and reduce this uncertainty, we will use what is called the null hypothesis.
Let’s look again at the observed differences from before. Since we need to be skeptical, we have to ask ourselves:
At this point we are working with what is called the skeptic mindset, or the null hypothesis, whose name reminds us that we are acting as skeptics: we give credit to the possibility that our observations could have happened just by chance.
It is called null because we are asking what would happen to the observed difference if there were no treatment in the experiment. In other words, if nothing had been done, what would we have seen?
We can check, as many times as we like, the differences between tooth sizes without comparing doses. We do that by randomly sampling 12 guinea pigs from our original data set, twice, and recording the difference between the means of the two random groups. This script comes from Prof. Rafa, in the book “Data analysis for the life sciences”:
mean_control <- ToothGrowth$len %>% sample(12) %>% mean()
mean_treatment <- ToothGrowth$len %>% sample(12) %>% mean()
print(mean_treatment - mean_control)
It’s also possible to do this many times using loops in R:
n <- 10000
null <- vector("numeric", n) # first we create a vector with 10000 slots
for (i in 1:n) { # loop to fill the null vector
  mean_control <- ToothGrowth$len %>% sample(12) %>% mean()
  mean_treatment <- ToothGrowth$len %>% sample(12) %>% mean()
  null[i] <- mean_treatment - mean_control # difference between two random groups
}
print(null)
The values in null are what we call a null
distribution.
What proportion of the values in null is equal to or higher than the observed differences?
mean(null >= observable_diff1)
mean(null >= observable_diff2)
mean(null >= observable_diff3)
[1] 2e-04
[1] 0.0109
[1] 0
Only a small fraction of our 10,000 simulations are equal to or higher than the observed differences. As skeptics, what do we conclude? That when there is no dose effect, we see a difference as big as observable_diff1 only about 0.02% of the time (2e-04), and even for observable_diff2, which has the largest of the three proportions, only about 1.09% of the time (0.0109). This proportion is known as the p-value.
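A closely related way to get a p-value is a permutation test, which shuffles the dose labels instead of drawing two independent samples. The sketch below is not the method used above; it is an alternative shown for comparison, restricted to the 0.5 and 1 doses. The seed and the names two_doses, observed, perm_diffs and p_value are hypothetical.

```r
set.seed(42)  # hypothetical seed, only for reproducibility

# Keep only the two doses being compared (20 animals each)
two_doses <- ToothGrowth[ToothGrowth$dose %in% c(0.5, 1), ]
observed <- mean(two_doses$len[two_doses$dose == 1]) -
  mean(two_doses$len[two_doses$dose == 0.5])

n_perm <- 10000
perm_diffs <- replicate(n_perm, {
  shuffled <- sample(two_doses$dose)  # break any link between dose and len
  mean(two_doses$len[shuffled == 1]) - mean(two_doses$len[shuffled == 0.5])
})

# One-sided p-value; the +1 keeps the estimate from being exactly zero
p_value <- (sum(perm_diffs >= observed) + 1) / (n_perm + 1)
print(p_value)
```

Because every shuffle keeps the group sizes at 20 and 20, the simulated differences are directly comparable with the observed one.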
We performed a simulation to build the null data set. Let’s now look at the null distribution:
hist(null, freq=TRUE)

We can also see where our observed differences fall in this distribution:
hist(null, freq=TRUE)
abline(v=observable_diff1, col="red", lwd=2)
abline(v=observable_diff2, col="blue", lwd=2)
abline(v=observable_diff3, col="yellow", lwd=2)

Values higher than the ones we observed are relatively rare. This gives us evidence to support our hypothesis that the dose of vitamin C given to guinea pigs affects the size of their odontoblasts.
The largest difference, between the doses 0.5 and 2, didn’t even appear in our null distribution plot. This shows that a difference this large is very unlikely to have happened by chance alone, and that the null hypothesis can be rejected.
But even so we should always be careful!
To create a control group with only one dose, we can filter the data set:
ToothGrowth_dot5 <- ToothGrowth %>%
  filter(dose == 0.5)
Now we can create our data sets to build the null hypothesis:
control_ToothGrowth_dot5 <- ToothGrowth_dot5$len %>% sample(12) %>% mean()
treatment_ToothGrowth_dot5 <- ToothGrowth_dot5$len %>% sample(12) %>% mean()
print(treatment_ToothGrowth_dot5 - control_ToothGrowth_dot5)
[1] 0.9333333
Here we repeat the difference between means 10,000 times to create a null distribution:
n <- 10000
null_dose <- vector("numeric", n)
for (i in 1:n) {
  controle_ToothGrowth_dot5 <- ToothGrowth_dot5$len %>% sample(12) %>% mean()
  tratamento_ToothGrowth_dot5 <- ToothGrowth_dot5$len %>% sample(12) %>% mean()
  null_dose[i] <- tratamento_ToothGrowth_dot5 - controle_ToothGrowth_dot5
}
print(null_dose)
hist(null_dose, freq = TRUE)

Another very interesting way to compare our observed difference with the null hypothesis is by using the infer package.
The infer package helps us test hypotheses quickly. Install the package and load it:
install.packages("infer")
library(infer)
To calculate the observed statistic in our data set:
F_hat <- ToothGrowth %>%
  specify(len ~ dose) %>%
  calculate(stat = "F")
Generating a null distribution:
null_dist <- ToothGrowth %>%
  specify(len ~ dose) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "F")
Visualising the observable difference with the calculated null distribution:
visualize(null_dist) +
  shade_p_value(obs_stat = F_hat, direction = "greater")
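To see roughly what the infer pipeline does, here is a base R sketch of the same idea: compute the F statistic of len against dose, then rebuild it many times after shuffling dose so that it is independent of len. This is an illustration under assumptions, not infer's actual implementation; the helper f_stat, the seed and the variable names are hypothetical.

```r
set.seed(123)  # hypothetical seed, only for reproducibility

# F statistic of a one-way ANOVA of len on dose (dose treated as a factor)
f_stat <- function(len, dose) {
  anova(lm(len ~ factor(dose)))[1, "F value"]
}

f_observed <- f_stat(ToothGrowth$len, ToothGrowth$dose)

# Null distribution: shuffle dose, then recompute F each time
f_null <- replicate(1000, f_stat(ToothGrowth$len, sample(ToothGrowth$dose)))

# Share of permuted F values at least as large as the observed one
p_value <- mean(f_null >= f_observed)
print(c(F = f_observed, p = p_value))
```

Within infer itself, get_p_value(null_dist, obs_stat = F_hat, direction = "greater") returns the corresponding p-value for the null distribution built above.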

In the next post we are going to analyse a data set of smartphones from India. See you there!