In this notebook, we begin working with hypothesis testing.

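In a one-sample t-test, we test the null hypothesis H0: μ = μ0 against the two-sided alternative H1: μ ≠ μ0, where μ is the population mean and μ0 is a hypothesized value.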

The code to do this is:

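t.test(data, mu = mu0)

Here, data is the numeric vector that contains the sample and mu0 is the hypothesized value of the population mean under the null hypothesis. If mu is not supplied, t.test uses mu = 0.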
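To make the call less of a black box, the sketch below computes the t statistic and the two-sided p-value by hand and compares them with what t.test returns. The data are simulated and the names x, mu0, t_manual, and p_manual are illustrative assumptions, not part of the notebook's dataset.

```r
# Simulated sample (hypothetical data, used only for illustration)
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
mu0 <- 9  # hypothesized mean under the null (arbitrary illustrative value)

# t statistic: (sample mean - hypothesized mean) / standard error
n <- length(x)
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Compare with the built-in test
res <- t.test(x, mu = mu0)
all.equal(unname(res$statistic), t_manual)
all.equal(res$p.value, p_manual)
```

Both comparisons should agree up to floating-point tolerance, since t.test uses the same formula for the default two-sided alternative.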

Example:

The global mean height of adult men is 171 cm. We will examine whether the mean height of adult men in the U.S. differs from the global mean.

df <- read.csv('Data/NHANES.csv')
# Get male data
df <- df[df$Gender == 'male',]
# Get adult data
df <- df[df$Age >= 20,]
head(df)

Plot data:

hist(df$Height, main='Height in the U.S.', xlab='Height (cm)', breaks=20, col='cyan')

Conduct the one-sample t-test:

t.test(df$Height, mu = 171)
## 
##  One Sample t-test
## 
## data:  df$Height
## t = 38.014, df = 3523, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 171
## 95 percent confidence interval:
##  175.5423 176.0363
## sample estimates:
## mean of x 
##  175.7893

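Since the p-value is less than 0.05, we reject the null hypothesis at the 5% significance level and conclude that the mean height of adult men in the U.S. differs from 171 cm. Because the sample mean (175.79 cm) and the entire 95% confidence interval, [175.54, 176.04], lie above 171 cm, the data indicate that the U.S. mean is greater than the global mean.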
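As a rough check on how the numbers in the output fit together, the sketch below works backwards from the printed confidence interval to the standard error and then to the t statistic. It uses only the values shown in the output above. The variable names are illustrative, and because the printed values are rounded, the result should be close to the reported t = 38.014 rather than identical.

```r
# Values copied from the t.test output above (rounded as printed)
ci_lower <- 175.5423
ci_upper <- 176.0363
x_bar    <- 175.7893
mu0      <- 171
deg_f    <- 3523

# The 95% interval is x_bar +/- t_crit * se, so se can be recovered
# from the half-width of the interval
t_crit <- qt(0.975, df = deg_f)
se     <- (ci_upper - ci_lower) / (2 * t_crit)

# t statistic and two-sided p-value implied by the printed values
t_check <- (x_bar - mu0) / se
p_check <- 2 * pt(-abs(t_check), df = deg_f)

# Sample size implied by the degrees of freedom (df = n - 1)
n <- deg_f + 1

print(c(n = n, se = se, t = t_check))
p_check < 2.2e-16
```

The interval is centered on the sample mean, and the distance from 171 cm to that mean is many standard errors, which is why the p-value is reported only as smaller than 2.2e-16.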

©2021 by Daiki Tagami. All rights reserved.