Readings

This assignment is based on the following readings:

YaRrr: 13

Assignment Goals

Conduct hypothesis tests such as t-tests, correlation tests and chi-square tests
Know how to access specific elements of hypothesis test objects (like p-values and test statistics)

Examples

library(yarrr) # Load yarrr for the pirates dataframe

# 1 sample t-test

# Are pirate ages different than 30 on average?
t.test(x = pirates$age, 
       mu = 30)

# 2 sample t-test

# Do females and males have different numbers of  tattoos?
t.test(formula = tattoos ~ sex,
       data = pirates, 
       subset = sex %in% c("male", "female"))

# Correlation test

# Is there a relationship between age and height?
 cor.test(formula = ~ age + height,
          data = pirates)

# Chi-Square test
 
# Is there a relationship between college and favorite pirate?
chisq.test(table(pirates$college,
                 pirates$favorite.pirate))

# Assign the results of a test to a new htest object and then acces specific outputs

age.ttest <- t.test(x = pirates$age, 
                   mu = 30)

class(age.ttest)    # sex.ttest is an htest object
names(age.ttest)      # What elements are in the object?

# Now you can access specific named elements with $

age.ttest$statistic   # The test statistic
age.ttest$p.value     # The p-value
age.ttest$conf.int    # Confidence interval for mean

Analyzing results from a survey

knitr::include_graphics(path = "http://assets.careerspot.com.au/files/news/mischevioussurvyepenguins.jpg")

In this WPA, you’ll analyze data from a fictional survey of 100 students (in fact, you can even see the code I used to generate the data here: code to generate wpa6 data). In the survey, students were asked a variety of demographic questions (e.g.; sex, age, and country of origin), behavioral self-report questions (e.g.; have you smoked marijuana?), and completed a few cognitive tasks (e.g.; completing a logic problem). Your task is to conduct several hypothesis tests on these data.

Data Description

The data are located in a tab-delimited text file at https://raw.githubusercontent.com/ndphillips/IntroductionR_Course/master/assignments/wpa/data/studentsurvey.txt. The data has 100 rows and 12 columns. Here are descriptions of the columns:

Name	Class	Description
`sex`	character	Sex of the participant. “m” = male, “f” = female.
`age`	integer	Age of the participant.
`major`	character	Major of the participant.
`haircolor`	character	Color of the participant’s hair
`iq`	integer	Score on an IQ (intelligence) test
`country`	character	Country of origin
`logic`	numeric	Amount of time it took the participant to complete a logic problem (smaller is better)
`siblings`	integer	Number of siblings the participant has
`multitasking`	integer	Participant’s score on a multitasking task (higher is better)
`partners`	integer	Number of sexual partners that the participant has had.
`marijuana`	binary	Whether or not the participant has ever smoked marijuana (0 = “never”, 1 = “at least once”)
`risk`	binary	Would the person play a gamble with a 50% chance of losing 20CHF and a 50% chance of earning 20CHF?(0 means the participant would not play the gamble, 1 means they would)

Table 1: Description of columns in the student survey data

Load the data into R and saving a local copy as a .txt file

Hint: Look at the answers to WPA #4 at https://ndphillips.github.io/IntroductionR_Course/assignments/wpa/wpa_4_answers.html if you get stuck

Open your class R project. Open a new script and enter your name, date, and the wpa number at the top. Save the script in the R folder in your project working directory as wpa_6_LastFirst.R, where Last and First are your last and first names.
The data are stored in a tab–delimited text file located at https://raw.githubusercontent.com/ndphillips/IntroductionR_Course/master/assignments/wpa/data/studentsurvey.txt. Using read.table() load this data into R as a new object called survey.

# Read tab-separated data from the web and store as a new object called survey

survey <- read.table(file = "https://raw.githubusercontent.com/ndphillips/IntroductionR_Course/master/assignments/wpa/data/studentsurvey.txt",            # Where is the file located?
                 header = TRUE,           # Is there a header row?
                 sep = "\t")            # How are columns separated?

Using head(), str(), and View() look at the dataset and make sure that it was loaded correctly. If the data don’t look correct (i.e; if there isn’t a header row and 100 rows and 12 columns), then something must have gone wrong when you loaded the data. Try and fix it!
Save a local copy of the data as a tab-separated text file called studentsurvey.txt to the data folder of your project using write.table().

# Write survey to a tab-delimited text file in my data folder.

write.table(survey,                        # Object to be written
            file = "data/studentsurvey.txt",        # File location and name
            sep = "\t")               # How should columns be separated?

How to report hypothesis tests

For the rest of the assignment you will be conducting, and reporting, several hypothesis tests. Please write your answers to all hypothesis test questions in proper American Pirate Association (APA) style! If your p-value is less than .01, just write p < .01.

Here is the basic structure of APA style for reporting hypothesis tests

Name	Skeleton	Example
Chi-square	X(_, N = _) = _, p = _	X(25, N = 100) = 15.89, p = 0.92
T-test	t(_) = _, p = __ (_-tailed), 95% CI = (_, __)	t(99) = 3.47, p < .01 (2-tailed), 95% CI = (0.14, 0.53)
Correlation test	r = _, t(_) = __, p = __ (_-tailed), 95% CI = (_, __)	r = 0.09, t(98) = 0.89, p = 0.37 (2-tailed), 95% CI = (-0.11, 0.28)

For example, here is some output with the appropriate apa conclusion:

# Do pirates with headbands have different numbers of tattoos than those
#  who do not wear headbands?

t.test(formula = tattoos ~ headband, 
       data = pirates,
       alternative = "two.sided",
       mu = 0,
       conf.level = .95)

## 
##  Welch Two Sample t-test
## 
## data:  tattoos by headband
## t = -19.313, df = 146.73, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -5.878101 -4.786803
## sample estimates:
##  mean in group no mean in group yes 
##          4.699115         10.031567

Answer: Pirates with headbands have significantly more tattoos on average than those who do not wear headbands: t(146.73) = -19.31, p < .01 (2-tailed), 95% CI = (-5.88, -4.79)

t-test

The average global IQ is known to be 100. Do the participants have an IQ different from the general population? Answer this with a one-sample t-test.

t.test(x = survey$iq,
       mu = 100)

## 
##  One Sample t-test
## 
## data:  survey$iq
## t = 20.847, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
##  109.0935 111.0065
## sample estimates:
## mean of x 
##    110.05

I do find significant evidence that these participants have a mean IQ larger than the general population, t(99) = 20.85, p < .01, 95% CI = (109.09, 111.01)

A friend of yours claims that students have 2.5 siblings on average. Test this claim with a one-sample t-test.

t.test(x = survey$siblings,
       mu = 2.5)

## 
##  One Sample t-test
## 
## data:  survey$siblings
## t = -0.9083, df = 99, p-value = 0.3659
## alternative hypothesis: true mean is not equal to 2.5
## 95 percent confidence interval:
##  2.181545 2.618455
## sample estimates:
## mean of x 
##       2.4

I do not find significant evidence that these participants have an average number of siblings different from 2.5, t(99) = -0.91, p = 0.37, 95% CI = (2.18, 2.62)

Do students that have smoked marijuana have different IQ levels than those who have never smoked marijuana? Test this claim with a two-sample t-test (you can either use the vector or the formula notation for a t-test)

t.test(iq ~ marijuana, 
       data = survey)

## 
##  Welch Two Sample t-test
## 
## data:  iq by marijuana
## t = -0.72499, df = 96.078, p-value = 0.4702
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.610103  1.213545
## sample estimates:
## mean in group 0 mean in group 1 
##        109.6939        110.3922

I do not find significant evidence that the mean IQ of people who have ever smoked marijuana is different from those who have smoked marijuana, t(96.08) = -0.72, p = 0.47, 95% CI = (-2.61, 1.21)

correlation test

Do students with higher multitasking skills tend to have more romantic partners than those with lower multitasking skills? Test this with a correlation test:

cor.test(formula = ~ multitasking + partners,
       data = survey)

## 
##  Pearson's product-moment correlation
## 
## data:  multitasking and partners
## t = -0.79444, df = 98, p-value = 0.4289
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2721353  0.1182836
## sample estimates:
##         cor 
## -0.07999296

I do not find significant evidence that there is a correlation between multitasking and the number of romantic partners one has had, t(98) = -0.79, p = 0.43, 95% CI = (-0.27, 0.12)

Do people with higher IQs perform faster on the logic test? Answer this question with a correlation test.

cor.test(formula = ~ iq + logic,
         data = survey)

## 
##  Pearson's product-moment correlation
## 
## data:  iq and logic
## t = 1.2065, df = 98, p-value = 0.2305
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.07727287  0.31003231
## sample estimates:
##       cor 
## 0.1209815

I do not find significant evidence that people with higher IQs perform faster on a logic test than those with lower IQs, t(98) = 1.21, p = 0.23, 95% CI = (-0.08, 0.31)

chi-square test

Are some majors more popular than others? Answer this question with a one-sample chi-square test.

chisq.test(table(survey$major))

## 
##  Chi-squared test for given probabilities
## 
## data:  table(survey$major)
## X-squared = 130.96, df = 3, p-value < 2.2e-16

I do find significant evidence that some majors are more common than others, X(3, N = 100) = 130.96, p < 0.01

In general, were students more likely to take a risk than not? Answer this question with a one-sample chi-square test

chisq.test(table(survey$risk))

## 
##  Chi-squared test for given probabilities
## 
## data:  table(survey$risk)
## X-squared = 17.64, df = 1, p-value = 2.669e-05

table(survey$risk) # To see see which value was more likely!

## 
##  0  1 
## 71 29

I do find significant evidence that students were LESS likely to take a risk than to not take a risk, X(1, N = 100) = 17.64, p < 0.01

Is there a relationship between hair color and students’ academic major? Answer this with a two-sample chi-square test

chisq.test(table(survey$major, survey$haircolor))

## 
##  Pearson's Chi-squared test
## 
## data:  table(survey$major, survey$haircolor)
## X-squared = 5.9227, df = 9, p-value = 0.7476

I do not find significant evidence that there is a relationship between hair color and students’ academic major, X(9, N = 100) = 5.92, p = 0.75

You pick the test!

Is there a relationship between whether a student has ever smoked marijuana and his/her decision to accept or reject the risky gamble?

chisq.test(table(survey$marijuana, survey$risk))

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(survey$marijuana, survey$risk)
## X-squared = 0.56829, df = 1, p-value = 0.4509

I do NOT find significant evidence that there is a relationship between marijuana use and people’s decision to accept a risky gamble, X(1, N = 100) = 0.57, p = 0.45

Do males and females have different numbers of sexual partners on average?

t.test(formula = partners ~ sex,
       data = survey)

## 
##  Welch Two Sample t-test
## 
## data:  partners by sex
## t = 0.9219, df = 93.765, p-value = 0.3589
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.7615073  2.0815073
## sample estimates:
## mean in group f mean in group m 
##            7.54            6.88

I do NOT find significant evidence that men and women have different numbers of sexual partners on average, t(93.77) = 0.92, p = 0.36

Do males and females differ in how likely they are to have smoked marijuana?

chisq.test(table(survey$marijuana, survey$sex))

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(survey$marijuana, survey$sex)
## X-squared = 0.16006, df = 1, p-value = 0.6891

I do NOT find significant evidence that men and women differ in their likelihood to smoke marijuana, X(1, N = 100) = 0.16, p = 0.69

Do people who have smoked marijuana have different logic scores on average than those who never have smoked marijuana?

t.test(formula = logic ~ marijuana,
       data = survey)

## 
##  Welch Two Sample t-test
## 
## data:  logic by marijuana
## t = 0.95997, df = 90.88, p-value = 0.3396
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4373672  1.2554465
## sample estimates:
## mean in group 0 mean in group 1 
##        9.945510        9.536471

I do NOT find significant evidence that people who have smoked marijuana have different logic scores than those who have never smoked marijuana, t(90.88) = 0.96, p = 0.34

Checkpoint!!!

Anscombe’s Famous data quartet

In the next few questions, we’ll explore Anscombe’s famous data quartet. This famous dataset will show you the dangers of interpreting statistical tests (like a correlation test), without first plotting the data!

Run the following code to create the anscombe dataframe. This dataframe contains 11 pairs of data for four different datasets: A, B, C and D

# JUST COPY, PASTE, AND RUN!

anscombe <- data.frame(x = c(c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                             c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                             c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                             c(8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8)),
                       y = c(c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 4.68),
                             c(9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74),
                             c(7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73),
                             c(6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89)),
                       set = rep(c("A", "B", "C", "D"), each = 11))

Look at the anscombe dataframe to see how it is structured. Then, calculate the correlation between x and y separately for each dataset. That is, what is the correlation between x and y for dataset A? What about for datasets B, C and then D? You don’t need to report full APA style for these tests, just make a note of the correlation coefficients. What do you notice about the correlation coefficients for each dataset? Would you conclude that these datasets are very similar or very different?

# Correlation for A
cor.test(formula = ~ x + y,
         data = anscombe,
         subset = set == "A")

## 
##  Pearson's product-moment correlation
## 
## data:  x and y
## t = 4.4844, df = 9, p-value = 0.001523
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4612713 0.9549196
## sample estimates:
##       cor 
## 0.8311601

# Correlation for B
cor.test(formula = ~ x + y,
         data = anscombe,
         subset = set == "B")

## 
##  Pearson's product-moment correlation
## 
## data:  x and y
## t = 4.2386, df = 9, p-value = 0.002179
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4239389 0.9506402
## sample estimates:
##       cor 
## 0.8162365

# Correlation for C
cor.test(formula = ~ x + y,
         data = anscombe,
         subset = set == "C")

## 
##  Pearson's product-moment correlation
## 
## data:  x and y
## t = 4.2394, df = 9, p-value = 0.002176
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4240623 0.9506547
## sample estimates:
##       cor 
## 0.8162867

# Correlation for D
cor.test(formula = ~ x + y,
         data = anscombe,
         subset = set == "D")

## 
##  Pearson's product-moment correlation
## 
## data:  x and y
## t = 4.243, df = 9, p-value = 0.002165
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4246394 0.9507224
## sample estimates:
##       cor 
## 0.8165214

All of the correlations are identical at r = 0.83!

Now it’s time to actually look at the data and see if your prediction holds up! Run the following code to generate a scatterplot of each data pair, what do you find?

# JUST COPY, PASTE, AND RUN!
library(ggplot2)    # Load ggplot2

ggplot(data = anscombe,
       aes(x = x, y = y, col = set)) + 
  geom_point(size = 2) + 
  facet_wrap(~set) + 
  labs(title = "Anscombe's Quartet", 
       subtitle = "Always look at your data before conducting hypothesis tests!",
       caption = "Anscombe Wikipedia page: https://en.wikipedia.org/wiki/Anscombe%27s_quartet") +
  theme_bw() +
  guides(colour = "none")

What you have just seen is the famous Anscombe’s quartet a dataset designed to show you how important is to always plot your data before running a statistical test!!! You can see more at the wikipedia page here: https://en.wikipedia.org/wiki/Anscombe%27s_quartet

You pick the test!

Do people with higher iq scores tend to perform better on the logic test that those with lower iq scores? (Note: this appears to be the same as question 9)

cor.test(~ iq + logic,
         data = survey)

## 
##  Pearson's product-moment correlation
## 
## data:  iq and logic
## t = 1.2065, df = 98, p-value = 0.2305
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.07727287  0.31003231
## sample estimates:
##       cor 
## 0.1209815

I do not find significant evidence that people with higher IQs perform better on a logic test than those with lower IQs, t(98) = 1.21, p = 0.23, 95% CI = (-0.08, 0.31)

Are Germans more likely than not to have tried marijuana? (Hint: this is a one-sample chi-square test on a subset of the original data)

survey_germans <- subset(survey, country == "germany")

chisq.test(table(survey_germans$marijuana))

## 
##  Chi-squared test for given probabilities
## 
## data:  table(survey_germans$marijuana)
## X-squared = 0.034483, df = 1, p-value = 0.8527

I do NOT find significant evidence that Germans are more likely than not to have smoked marijuana, X(1, N = 29) = 0.03, p = 0.85

Does the IQ of people with brown hair differ from blondes? (Hint: This is a two-sample t-test that requires you to use the subset() argument to tell R which two groups you want to compare)

t.test(formula = iq ~ haircolor,
       data = survey,
       subset = haircolor %in% c("blonde", "brown"))

## 
##  Welch Two Sample t-test
## 
## data:  iq by haircolor
## t = 0.50574, df = 59.053, p-value = 0.6149
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.757470  2.946358
## sample estimates:
## mean in group blonde  mean in group brown 
##             109.7500             109.1556

I do NOT find significant evidence that the mean IQ of people with blonde hair is different from those with brown hair, t(59.1) = 0.51, p = 0.61, 95% CI = (-1.76, 2.95)

Only for men from Switzerland, is there a relationship between age and IQ?

cor.test(formula = ~age + iq,
         data = survey,
         subset = country == "switzerland")

## 
##  Pearson's product-moment correlation
## 
## data:  age and iq
## t = 1.1545, df = 58, p-value = 0.253
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1081605  0.3890006
## sample estimates:
##       cor 
## 0.1498806

I do NOT find evidence for a correlation between age and IQ in students from Switzerland, t(58) = 1.15, p = 0.25, 95% CI = (-0.11, 0.39)

Only for people who chose the risky gamble and have never tried marijuana, is there a relationship between iq and performance on the logic test?

cor.test(formula = ~ iq + logic,
         data = survey,
         subset = risk == 1 & marijuana == 0)

## 
##  Pearson's product-moment correlation
## 
## data:  iq and logic
## t = 1.4249, df = 10, p-value = 0.1846
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2134024  0.7968450
## sample estimates:
##       cor 
## 0.4108122

I do NOT find evidence for a correlation between IQ and logic for people who chose the risky gambe and have never tried marijuana, t(10) = 1.42, p = 0.18, 95% CI = (-0.21, 0.80)

Predict the correlation

What do you predict the correlation will be between the following two vectors x and y? Test your prediction by conducting the appropriate test (hint: you may want to put the vectors into a dataframe by running df <- data.frame(x = x, y = y) before running the correlation test),

x <- rnorm(n = 100, mean = 10, sd = 10)
y <- rnorm(n = 100, mean = 50, sd = 10)

df <- data.frame(x = x, y = y)

cor.test(formula = ~ y + x,
         data = df)

## 
##  Pearson's product-moment correlation
## 
## data:  y and x
## t = -0.28626, df = 98, p-value = 0.7753
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2240506  0.1684700
## sample estimates:
##         cor 
## -0.02890455

# Nope not significant!

What about the correlation between the following two vectors? Test your prediction by conducting the appropriate test.

x <- rnorm(n = 100, mean = 10, sd = 10)
y <- -5 * x

df <- data.frame(x = x, y = y)

cor.test(formula = ~ y + x,
         data = df)

## 
##  Pearson's product-moment correlation
## 
## data:  y and x
## t = -Inf, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -1 -1
## sample estimates:
## cor 
##  -1

# A perfect negative correlation!

What about these? Test your prediction by conducting the appropriate test.

x <- rnorm(n = 100, mean = 10, sd = 10)
y <- x + rnorm(n = 100, mean = 50, sd = 10)

df <- data.frame(x = x, y = y)

cor.test(formula = ~ y + x,
         data = df)

## 
##  Pearson's product-moment correlation
## 
## data:  y and x
## t = 9.0889, df = 98, p-value = 1.148e-14
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5534003 0.7703865
## sample estimates:
##       cor 
## 0.6763054

# A large positive correlation!

Conducting a simulation of p-values

Let’s run some simulations to see what p-values really mean. We’ll do this by drawing repeated samples from a world when the null hypothesis H0 is True, and other samples from a world when the null hypothesis is False, that is, when the alternative hypothesis H1 is True).

First, let’s draw a random sample from the world when the null hypothesis H0 is True. We’ll save the results as sample.H0. Then, we’ll conduct a one-sample t-test to test if the true population mean is really 0:

# Generate a vector of 20 samples from a normal distribution with mu = 0
#  Here, the null hypothesis H0: Mu = 0 IS true

sample.H0 <- rnorm(n = 20, mean = 0, sd = 1)


# Now look at the p-values from one-sample t-tests of each object
t.test(x = sample.H0, mu = 0, alternative = "two.sided")$p.value

## [1] 0.8422217

Copy and paste your code above several times and see what happens! As you’ll see, the value keeps changing, this is because you’re getting new samples each time. How often do you get a ‘significant’ p-value?
Now, let’s repeat the process in a world where the null hypothesis is False (that is, the true mean is 0.5). Run the following code several times to see what the p-values look like

# Generate a vector of 20 samples from a normal distribution with mu = 0.5
#  Here, the null hypothesis H0: Mu = 0 is FALSE

sample.H1 <- rnorm(n = 20, mean = 0.5, sd = 1)

# Now look at the p-values from one-sample t-tests of each object
t.test(x = sample.H1, mu = 0, alternative = "two.sided")$p.value

## [1] 0.007164242

It’s time to put it all together. In the code below, you’ll run a simulation where you repeat the process above 100 times for the H0 world, and 100 times for the H1 world
Visualize the distribution of p values from the simulation separately for each world (for example, you could create a boxplot)

ggplot(data = simulation,
       aes(x = world, y = p)) + 
  geom_boxplot()

Is the mean p-value from the H0 world significantly different from the mean p-value from the H1 world? Conduct the appropriate test!

t.test(formula = p ~ world,
       data = simulation)

## 
##  Welch Two Sample t-test
## 
## data:  p by world
## t = 10.353, df = 182.65, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.3093399 0.4550016
## sample estimates:
## mean in group H0 mean in group H1 
##        0.5098306        0.1276598

I do find evidence for a significant difference in p-values in the H0 world and the H1 world, t(182.65) = 10.35, p < .01, 95% CI = (0.31, 0.46)

Use your simulation to answer the following question: “Given that the null hypothesis is true, what is the probability of obtaining a p-value less than 0.05?” In other words, for all of your simulations in the H1 world, what percent resulted in a p-value less than 0.05? (Hint: use aggregate() or dplyr).

library(dplyr)

simulation %>% 
  filter(world == "H0") %>%
  summarise(mean(p < .05))

##   mean(p < 0.05)
## 1           0.03

# About 3% of the time (pretty close to 5%)

Now use your simulation to answer a slightly different question: “Given that a p-value is less than .05, what is the probability that the null hypothesis is really False?” In other words, for all the simulations that resulted in a p-value less than 0.05, what proportion were in the H1 world (and not in the H0 world)?

simulation %>% 
  filter(p < .05) %>%
  summarise(mean(world == "H1"))

##   mean(world == "H1")
## 1           0.9491525

# About 95% of the time

Now, run the simulation again, but this time instead of performing 100 simulations from both worlds, do 1,000 simulations from the H0 world, and 100 from the H1 world. Based on this simulation, now make the same calculations as the previous two questions. That is, “Given that the null hypothesis is true, what is the probability of obtaining a p-value less than 0.05?” and “Given that a p-value is less than 0.05, what is the probability that the null hypothesis is False?” Are you answers the same or different from before?

simulation %>% 
  filter(p < .05) %>%
  summarise(mean(world == "H1"))

##   mean(world == "H1")
## 1           0.5789474

# Only about half the time!

Based on what you’ve learned, what is the correct answer to the general question: “Given that the p-value from a hypothesis test is less than 0.05, what is the probability that the null hypothesis is True?”

It depends on the prior probability that the null hypothesis was true in the first place!

Submit!

Save and email your wpa_6_LastFirst.R file to me at nathaniel.phillips@unibas.ch.
Go to https://goo.gl/forms/b9dcRH6Ud3pDagOr1 to confirm your assignment submission.

WPA #6 – Hypothesis tests

Readings

Assignment Goals

Examples

Analyzing results from a survey

Data Description

Load the data into R and saving a local copy as a .txt file

How to report hypothesis tests

t-test

correlation test

chi-square test

You pick the test!

Checkpoint!!!

Anscombe’s Famous data quartet

You pick the test!

Predict the correlation

Conducting a simulation of p-values

Submit!