An unidentified man drinking a non-alcoholic beer.

Readings

This assignment is based on the following readings:

Assignment Goals

Examples

# Vector of specific values

c(37, 45, 23, 54, 66)   # Numeric
c("A", "B", "C", "D")   # Character

# Vector of integers 1 to 5

c(1, 2, 3, 4, 5)                       # using c()
1:5                                    # using a:b
seq(from = 1, to = 5, by = 1)          # using seq()
seq(from = 1, to = 5, length.out = 5)  # same as above using length.out

# Vector of multiples of 10 from 10 to 50

c(10, 20, 30, 40, 50)
seq(from = 10, to = 50, by = 10)
seq(from = 10, to = 50, length.out = 5)

# Assign vectors to objects

data_A <- c(37, 45, 23, 54, 66)
data_B <- seq(from = 1, to = 100, by = 2)

# Calculate descriptive statistics

mean(c(37, 45, 23, 54, 66))
mean(data_A)

median(seq(from = 1, to = 100, by = 2))
median(data_B)

# Vector arithmetic

a <- c(1, 2, 3, 4, 5)
a * 10   # Multiply all elements by 10
a + .5   # Add .5 to all elements

b <- c(10, 20, 30, 40, 50)
a + b    # Add a and b element-wise

# Generate samples from distributions

sample(1:10, size = 2, replace = FALSE) # 2 values from the integers from 1 to 10
rnorm(n = 10, mean = 50, sd = 2)        # 10 values from Normal(mean = 50, sd = 2)
rbinom(n = 100, size = 1, prob = .5)    # 100 coin flips (binomial with size = 1, prob = .5)

Get started

  1. Open RStudio. Open a new R script (File – New File – R Script), and save it as wpa_1_LastFirst.R (where Last and First are your last and first names). At the top of your script, write the assignment number, your name, and the date (as comments!). For the rest of the assignment, when you answer a task, indicate which task you are answering with an appropriate comment.

Here is an example of how your wpa_1_LastFirst.R file could look:

# Assignment: WPA X
# Name: LAST, FIRST
# Date: DAY MONTH YEAR


# TASK 1
1 + 1

# TASK 2
2 + 2

# ...

Does drinking non-alcoholic beer affect cognitive performance?

A psychologist has a theory that some of the negative cognitive effects of alcohol are the result of psychological rather than physiological processes. To test this, she has 12 participants perform a cognitive test before and after drinking non-alcoholic beer which was labelled to contain 5% alcohol. Results from the study, including some demographic data, are presented in the following table. Note that higher scores on the test indicate better performance.

id before after age sex eye_color
1 45 43 20 male blue
2 49 50 19 female blue
3 40 61 22 male brown
4 48 44 20 female brown
5 44 45 27 male blue
6 70 20 22 female blue
7 90 85 22 male brown
8 75 65 20 female brown
9 80 72 25 male blue
10 65 65 22 female blue
11 80 70 24 male brown
12 52 75 22 female brown

Creating vectors from scratch

We’ll start by creating a vector object for each column of data in the table above.

  1. Create a vector of the id data called id using the c() function.
id <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
  1. Now, create the id vector again, but this time use the a:b notation.
id <- 1:12
  1. Now create the id vector again! But this time use the seq() function. To get help on this function, look at the help menu with ?seq
id <- seq(from = 1, to = 12, by = 1)

# or

id <- seq(from = 1, to = 12, length.out = 12)
  1. Create a vector of the before drink data called before using c().
before <- c(45, 49, 40, 48, 44, 70, 90, 75, 80, 65, 80, 52)
  1. Create a vector of the after drink data called after using c().
after <- c(43, 50, 61, 44, 45, 20, 85, 65, 72, 65, 70, 75)
  1. Create a vector of the age data called age using c().
age <- c(20, 19, 22, 20, 27, 22, 22, 20, 25, 22, 24, 22)
  1. Create a vector of the sex data called sex but don’t use just the c() function (that would be a lot of typing…). Instead, just repeat the vector c("male", "female") several times using the rep() function.
sex <- rep(c("male", "female"), times = 6)

# or

sex <- rep(c("male", "female"), length.out = 12)

sex
##  [1] "male"   "female" "male"   "female" "male"   "female" "male"  
##  [8] "female" "male"   "female" "male"   "female"
  1. Create a vector of the eye color data called eye_color using the rep() function.
eye_color <- rep(c("blue", "brown"), each = 2, times = 3)

# or

eye_color <- rep(c("blue", "brown"), each = 2, length.out = 12)

eye_color
##  [1] "blue"  "blue"  "brown" "brown" "blue"  "blue"  "brown" "brown"
##  [9] "blue"  "blue"  "brown" "brown"
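The times, each, and length.out arguments of rep() are easy to mix up. The short sketch below uses a made-up two-element vector (x is purely illustrative and not part of the study data) to show how each argument changes the result.

```r
# Hypothetical two-element vector, used only for illustration
x <- c("a", "b")

rep(x, times = 3)                 # Whole vector repeated: a b a b a b
rep(x, each = 3)                  # Each element repeated: a a a b b b
rep(x, each = 2, times = 2)       # 'each' is applied first, then 'times': a a b b a a b b
rep(x, each = 2, length.out = 5)  # Result cut off at 5 elements: a a b b a
```

This is why each = 2 combined with times = 3 produces the blue, blue, brown, brown pattern three times over.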

Combining and changing vectors

  1. Create a new vector called age_months that shows the participants’ age in months instead of years. (Hint: Just multiply each age value by 12)
age_months <- age * 12

age_months
##  [1] 240 228 264 240 324 264 264 240 300 264 288 264
  1. Oops! It turns out that the watch used to measure time was off. All the before times are 1 second too fast, and all the after times are 1 second too slow. Correct these values by using simple arithmetic and then (re)assigning the objects with <-!
before <- before + 1

after <- after - 1
  1. Create a new vector called change that shows the change in participants’ scores from before to after (Hint: Just subtract one vector from the other)
change <- after - before

change
##  [1]  -4  -1  19  -6  -1 -52  -7 -12 -10  -2 -12  21
  1. Create a new vector called average that shows the participants’ average score across both tests. That is, the first element of average should be the average of the first participant’s two scores, and the second element should be the average of the second participant’s two scores…(Hint: Don’t use the mean() function! Instead, use basic arithmetic with + and /. That is, the elements of average should be before plus after divided by 2.)
average <- (before + after) / 2

average
##  [1] 44.0 49.5 50.5 46.0 44.5 45.0 87.5 70.0 76.0 65.0 75.0 63.5
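If you want to double-check an element-wise calculation like this one, a possible approach is to compute the same quantity a second way and compare. The sketch below assumes the corrected before and after scores and uses cbind() and rowMeans(), which are not covered in this assignment; average_check is a name introduced here for illustration.

```r
# Scores from the table, with the +1 / -1 correction from the earlier task applied
before <- c(45, 49, 40, 48, 44, 70, 90, 75, 80, 65, 80, 52) + 1
after  <- c(43, 50, 61, 44, 45, 20, 85, 65, 72, 65, 70, 75) - 1

# Element-wise average using basic arithmetic
average <- (before + after) / 2

# cbind() puts the two vectors side by side as columns;
# rowMeans() then averages across each row (that is, each participant)
average_check <- rowMeans(cbind(before, after))

all.equal(average, average_check)  # TRUE if both approaches agree
```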

Applying functions to vectors

  1. How many elements are in each of the original data vectors? (Hint: use length()). If the number of elements in each is not the same, you typed something in wrong!
length(id)
## [1] 12
length(before)
## [1] 12
length(after)
## [1] 12
length(age)
## [1] 12
length(sex)
## [1] 12
length(eye_color)
## [1] 12
  1. What was the standard deviation of ages? Assign the result to a scalar object called age_sd.
age_sd <- sd(age)

age_sd
## [1] 2.314316
  1. What is the median age? Assign the result to a scalar object called age_median.
age_median <- median(age)

age_median
## [1] 22
  1. How many people were there of each sex? (Hint: use table())
table(sex)
## sex
## female   male 
##      6      6
  1. What proportion of people were of each sex? (Hint: use table() then divide by its sum with sum())
table(sex) / sum(table(sex))
## sex
## female   male 
##    0.5    0.5
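As an alternative to dividing by the sum yourself, base R offers prop.table(), which should give the same proportions. The sketch below also multiplies by 100 in case you want percentages rather than proportions.

```r
sex <- rep(c("male", "female"), times = 6)

# prop.table() divides each count by the total, giving proportions
prop.table(table(sex))

# Multiply by 100 to express the proportions as percentages
prop.table(table(sex)) * 100
```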

  1. Calculate the mean of the sex column. What happens and why?

# mean(sex)

# Returns NA with a warning ("argument is not numeric or logical") because sex is a character vector, not a numeric one!
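In current versions of R, calling mean() on a character vector gives NA together with a warning. The sketch below shows that result and one possible way to get a meaningful number out of a character vector: comparing it to a value first. The name result is introduced here only for illustration.

```r
sex <- rep(c("male", "female"), times = 6)

# mean() on a character vector gives NA (the warning is silenced here)
result <- suppressWarnings(mean(sex))
is.na(result)

# A comparison produces TRUE/FALSE values, which R treats as 1/0 in arithmetic,
# so the mean of the comparison is the proportion of "male" entries
mean(sex == "male")
```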
  1. What was the mean before time? Assign the result to a scalar object called before_mean.
before_mean <- mean(before)

before_mean
## [1] 62.5
  1. What was the mean after time? Assign the result to a scalar object called after_mean.
after_mean <- mean(after)

after_mean
## [1] 56.91667
  1. What was the difference in the mean before times and the mean after times? Calculate this in two ways: once using the change vector, and once using the before_mean and after_mean objects. You should get the same answer for both!
after_mean - before_mean
## [1] -5.583333
mean(change)
## [1] -5.583333
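The two answers agree because, for vectors of equal length, the mean of the differences equals the difference of the means. The sketch below checks this with the corrected scores, using all.equal() rather than == since decimal results can differ by tiny rounding amounts. The names diff_of_means and mean_of_diffs are introduced here for illustration.

```r
# Corrected scores (table values with the +1 / -1 adjustment applied)
before <- c(46, 50, 41, 49, 45, 71, 91, 76, 81, 66, 81, 53)
after  <- c(42, 49, 60, 43, 44, 19, 84, 64, 71, 64, 69, 74)
change <- after - before

diff_of_means <- mean(after) - mean(before)
mean_of_diffs <- mean(change)

# all.equal() compares with a small tolerance, which is safer than ==
# when working with decimal numbers
all.equal(diff_of_means, mean_of_diffs)
```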

CHECKPOINT!

Standardizing (z-scores) vectors

  1. Create a vector called before_z showing a standardized version of before. (Hint: Standardizing a variable means subtracting the mean, and then dividing the result by the standard deviation.)
before_z <- (before - mean(before)) / sd(before)

before_z
##  [1] -0.9624484 -0.7291275 -1.2540994 -0.7874577 -1.0207786  0.4958067
##  [7]  1.6624108  0.7874577  1.0791088  0.2041557  1.0791088 -0.5541369
  1. Create a vector called after_z showing a standardized version of after.
after_z <- (after - mean(after)) / sd(after)

after_z
##  [1] -0.8288297 -0.4398817  0.1713223 -0.7732657 -0.7177017 -2.1068017
##  [7]  1.5048584  0.3935783  0.7825263  0.3935783  0.6713983  0.9492183
  1. What was the largest before score? What was the largest before_z score?
max(before)
## [1] 91
max(before_z)
## [1] 1.662411
  1. What was the smallest after score? What was the smallest after_z score?
min(after)
## [1] 19
min(after_z)
## [1] -2.106802
  1. What should the mean and standard deviation of before_z and after_z be? Test your predictions by making the appropriate calculations.
# The mean should be 0, the sd should be 1

mean(before_z)
## [1] 1.619979e-17
sd(before_z)
## [1] 1
mean(after_z)
## [1] 1.526557e-16
sd(after_z)
## [1] 1
# The standard deviations are exactly 1 while the means are so small they are almost 0. The reason they aren't exactly 0 is for technical reasons: our computers can't calculate numbers with infinite precision, so numbers that "should" be 0 sometimes will just be very, very, very small.
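Base R also has a scale() function that centers and scales a vector in one step. The sketch below, which assumes the corrected before scores, uses it as a cross-check on the manual calculation and shows one way to test "is this basically 0?" with a tolerance instead of ==. The threshold 1e-12 is an arbitrary choice made for this example.

```r
# Corrected before scores (table values + 1)
before <- c(46, 50, 41, 49, 45, 71, 91, 76, 81, 66, 81, 53)

# Manual standardization
before_z <- (before - mean(before)) / sd(before)

# scale() returns a one-column matrix, so as.vector() turns it back
# into a plain vector before comparing
before_z_check <- as.vector(scale(before))
all.equal(before_z, before_z_check)

# Because of limited numerical precision, test for "close to 0" rather than "== 0"
abs(mean(before_z)) < 1e-12
```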

Random samples from distributions

R has lots of functions for drawing random samples from probability distributions. For example, you can draw random samples from a vector with sample(), or draw random samples of values from a Normal distribution using the rnorm(n, mean, sd) function. Here are some examples:

# Draw 10 random numbers from the integers 1 to 100
sample(x = 1:100, size = 10)

# Simulate 10 flips from a fair coin
sample(x = c("H", "T"), size = 10, replace = TRUE)

# Random sample of 50 values from a Normal distribution with mean = 20 and sd = 10
rnorm(n = 50, mean = 20, sd = 10)
  1. Create a vector called samp_10 that contains 10 samples from a Normal distribution with a mean of 100 and a standard deviation of 10.
samp_10 <- rnorm(n = 10, mean = 100, sd = 10)

# Note: Due to random sampling, your results will not be identical to mine!
samp_10
##  [1] 101.86530  92.49926  79.05425 109.60317  94.36042 106.82970  95.47122
##  [8] 102.60231  98.64575  91.36954
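If you want your random results to be repeatable (for example, so that a classmate gets the same numbers), one option is to call set.seed() before sampling. In the sketch below the seed value 123 is arbitrary, and samp_a and samp_b are names made up for illustration.

```r
# The seed value 123 is arbitrary; any integer works
set.seed(123)
samp_a <- rnorm(n = 10, mean = 100, sd = 10)

# Resetting the same seed...
set.seed(123)
samp_b <- rnorm(n = 10, mean = 100, sd = 10)

# ...reproduces exactly the same "random" values
identical(samp_a, samp_b)
```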
  1. Create a vector called samp_1000 that contains 1,000 samples from the same Normal distribution as above (that is, also with a mean of 100 and standard deviation of 10).
samp_1000 <- rnorm(n = 1000, mean = 100, sd = 10)


# Note: Due to random sampling, your results will not be identical to mine!
samp_1000
##    [1] 106.10323 105.93540 100.29741  84.63443 105.76263  93.48527
##    [7]  80.95175  83.23115  92.32436 107.61754  94.19625  87.50555
##   [13]  99.80036 113.53397 116.49275 107.09329  86.87670 117.99554
##   [19]  97.63814  70.94461  97.64652 102.21451 101.62072 111.36955
##   [25]  90.71639 104.59006 103.66282  87.73348  87.40298  87.93346
##   [31]  92.35675  92.38747 109.00134 103.64349 109.36908 102.09110
##   [37] 105.03837  99.03862  92.75377  91.73709 106.91420  91.57118
##   [43] 100.57870  97.44799 114.29874 105.15441 106.40582 114.33046
##   [49] 111.30193  92.30755 108.37519  95.90231 105.41046 109.02793
##   [55] 105.86668  94.53300 103.18786 110.63025  82.92348 102.50831
##   [61] 105.16749 106.30249 108.13226 104.93909  99.54090 107.25326
##   [67]  87.22345 103.51628 108.82640 105.80219  97.39663 100.23845
##   [73] 106.28535 112.88290 103.68617  98.22018 106.89751 101.88763
##   [79] 106.27912  95.13972  97.69682  92.35489  99.27746  88.09492
##   [85] 104.23274  98.02754 110.84488  92.84229  96.63422  97.36863
##   [91] 113.38256  88.94129 105.22016 101.21855 119.65877  99.02649
##   [97]  92.18805  77.23145  96.94514 102.20820  96.71574  97.70806
##  [103]  89.32546 111.30810 108.46394 100.05855 101.70399 103.70630
##  [109] 105.97805 103.05225  97.42114 117.38009 108.52275  95.32831
##  [115] 106.03146 106.05432  87.75405 111.73360 102.34626  95.85129
##  [121]  93.52320  81.67281  98.37480  99.74882 100.55950  94.38135
##  [127] 111.06870 106.14082 103.13248 103.51887  88.44248  77.46527
##  [133] 107.10072 112.55028  96.33560 100.27805  85.70857 104.81754
##  [139] 110.07568 117.41311 113.36252 111.13684  79.63479 108.41148
##  [145]  89.08466  92.73196  99.76181  84.95839 104.68219 105.76990
##  [151] 111.32841 107.42008 109.78811  95.73439  98.78405 110.93099
##  [157] 109.39764 111.48889  96.62822 107.13779 107.84483  91.87643
##  [163] 114.49616 108.59800  90.47515  99.69274 114.38006 100.38087
##  [169] 107.19916  97.94950 108.03935  91.69772 106.20529  89.76287
##  [175] 105.38464  96.59816 113.87800 105.30245 105.58271  86.88872
##  [181] 104.04235  93.73459 104.62292  75.02585 114.09559 108.32437
##  [187]  80.22825 101.13090  77.32026  96.65156 108.32510  91.28177
##  [193] 113.92189  93.98437  92.57978 110.57007  96.04472 112.78758
##  [199]  83.49367 102.06754  98.58368  83.56192 105.46828 100.17617
##  [205]  99.79147  95.84566 104.60788  96.22392  96.48078  92.68538
##  [211] 126.20176 102.26742  98.41212  96.87007  99.70214 110.67796
##  [217]  98.23324 114.63518 113.52475 101.64243  99.27392  85.58734
##  [223]  93.85687 106.13806  85.08658 109.05566  97.08902  96.71626
##  [229]  94.98968 101.44552  94.09575 100.25048 100.97170 106.40768
##  [235]  97.36814 109.96020  87.33743 107.54978  87.17180  96.99032
##  [241] 102.41996 102.77369  94.80404  95.07521 109.47989 107.22658
##  [247]  93.92107  88.49080  87.98180 107.89043 106.61689 109.20577
##  [253] 103.26173  99.83419 100.09445  95.24328 106.07271  91.20573
##  [259]  97.68751  91.30521  96.65842 113.63922  97.39561  98.19088
##  [265]  72.66705 101.42312  77.89193 105.20669 111.60948 110.81373
##  [271]  96.68852  99.97041  94.71642  95.09083  99.44871 109.87327
##  [277]  91.32585 100.81786  93.44969  98.01592  91.32036  79.06785
##  [283]  92.02772 105.55429 107.23734  88.70071 100.11423  88.49030
##  [289] 100.43834 104.01325 110.57639 106.70966 114.18410 108.82337
##  [295] 113.08981  97.02426 107.37671 104.54172 102.74156 103.87062
##  [301]  97.90165  90.03304 104.50547  93.20163  95.28943  97.87551
##  [307] 100.87126 112.64925 105.08617  98.10377 102.84932 105.31679
##  [313] 105.73396  91.70311 108.71093  99.10153 117.13014 110.19215
##  [319]  82.00907  87.82914  89.61135 107.06787 105.22299 105.36453
##  [325]  95.47945  81.79006 110.07331 103.91020  93.85654  87.40605
##  [331] 101.71446  87.37370 100.53821  91.94221 107.17534  99.18375
##  [337] 102.08030  93.89270 109.03097 104.18927  90.83122 105.55957
##  [343] 108.30417 120.92038 105.76051 100.90715  98.73180  97.41194
##  [349] 113.72039 106.07517 104.44453  93.49271 102.18325 103.09444
##  [355]  98.41039 112.36254 102.69622 105.06533  93.74878  84.86026
##  [361] 104.05595  99.16207  83.46937 104.07917  90.87741  92.21363
##  [367]  96.77714  88.88396  98.41792 124.03495 108.30800  89.08660
##  [373]  98.53664  99.44067 102.38856  94.93748  85.93496  93.70561
##  [379] 122.14446  99.08599 102.65587 101.14667 109.19181 104.12177
##  [385] 114.54830  98.36635  96.12531  86.19623 109.80307  89.77474
##  [391]  91.15027 103.08609  76.61928 110.04514 102.31528 103.76854
##  [397]  90.45334  94.95973 111.98815  98.93303 110.07565  91.47001
##  [403] 114.36148 117.76776  99.19158 105.01023  92.60128  91.54272
##  [409] 111.27838  99.34479 112.16548 104.95611  93.00541  87.16218
##  [415]  98.29108 111.44830  84.71769  94.37296  92.58234  79.94886
##  [421] 106.94234  94.47604  99.42594 107.06680 116.61063  97.27046
##  [427]  83.70544  79.14431 105.42406  97.51763 118.08934  93.31012
##  [433] 114.94981 104.83027  89.52679 102.03454 114.04272  95.31351
##  [439] 104.06150 102.81272 103.06825  95.50898 100.68632 103.02042
##  [445]  97.86373 107.30881 114.22208 109.77367  96.02742  93.47601
##  [451] 102.89301 102.21590 110.78570  87.67579 115.39868  98.31871
##  [457]  97.34557 104.92340 104.36034 104.43194  88.46675  98.92928
##  [463] 102.18476 105.54179  95.72521  89.05612 102.63799 107.04770
##  [469]  95.08472  94.14670 107.57537 118.17892  92.21475 107.93722
##  [475]  89.78810 102.11526  98.19159  89.27461  86.92348  96.50745
##  [481] 109.65565 101.58066 100.54129  92.95656  93.33751 110.71320
##  [487]  95.09997 107.27365  95.62389 115.19742 109.35798  94.91373
##  [493]  96.39865 105.00722  93.82434 113.55584 108.07636  89.13240
##  [499]  89.45799 109.45812 116.51125  85.72515  96.52861 108.40154
##  [505] 106.37012 112.06702 107.03513  87.77145 107.84652 102.08602
##  [511]  89.50844  85.09280  97.13447 116.16664 104.21302  88.81930
##  [517] 101.07139 104.45716 112.91353  97.89700 100.05444  90.73139
##  [523]  86.95246 117.26574  97.08648  96.70704  88.92870  99.24536
##  [529]  99.15045  87.83919  72.53743 100.66686 106.56667 103.01308
##  [535]  85.77664  98.45309  96.54194 109.75989 112.61038  97.94212
##  [541] 109.16483 100.58479  95.06912  99.84306 104.15889 110.15287
##  [547]  99.33607  94.34424 113.05914 103.73358  85.31484  96.56245
##  [553] 101.88111 105.81077 102.29443  96.37860  94.85444 103.49077
##  [559] 104.99230 102.86745  84.95201  89.05721  95.32759  92.23157
##  [565]  97.18610 107.20007 105.35475  95.22637 115.48921 110.89853
##  [571]  94.62221  75.06035 120.88800  80.20150  99.87542  95.28560
##  [577] 104.11566 100.20686  96.23683 100.62013  87.80051 115.99562
##  [583]  97.07429  90.68018  90.97462 109.99422  91.46700  95.02327
##  [589]  97.33342  94.15205  99.97933 102.82063  95.10011  96.12397
##  [595] 100.95089  98.20455  95.54690  88.37604 100.52324 110.93911
##  [601]  99.56808 103.94640  95.44332 100.47898 108.18316 109.87993
##  [607]  88.02589  95.57294  98.59581  92.07658 100.07147 118.73774
##  [613]  95.79643  91.25343 113.29952 103.48964  97.89626  97.96140
##  [619] 100.26500 110.07924  98.95816  99.97566 114.80486  96.03190
##  [625]  94.96591 102.14531 107.80425  91.12687  94.89826 107.13138
##  [631] 107.96561  83.82382  94.85209  83.59123 112.36382 113.84126
##  [637] 112.60970  94.53718 101.94093  88.82681  98.47328  87.20764
##  [643] 101.17270  95.95942 104.21303 107.02119 111.75366  84.82362
##  [649]  73.71078  97.87584  98.79454 110.46064  94.07142  90.80321
##  [655]  81.89602 102.80908 105.78633  92.63374  98.44605 106.39942
##  [661]  91.72770 112.74781 105.17923 102.16362 101.15082 106.67827
##  [667] 106.82148 101.05633  94.98896 101.69620 102.47274 102.02268
##  [673] 107.05768  81.72910  93.04299  90.14376  95.10706 103.78870
##  [679] 110.12925  96.42172  89.30264  97.42617 106.75603 102.14605
##  [685]  90.70277  81.51805 103.89727 110.47586 112.80673 106.51041
##  [691]  93.54477 100.84408 107.71021  96.69719  93.37370  98.37468
##  [697] 123.64880 110.17358 112.94895  92.88651  96.44435  99.88899
##  [703]  93.38678  98.69196 113.89844 104.13395 103.67689 104.03416
##  [709] 108.01002  83.43191  99.89503  97.12731 115.21185 103.69499
##  [715] 119.94794 115.21923 102.94736 123.22322  85.68248  95.01880
##  [721]  93.44883  85.65642  91.33100  88.85013  96.60334 108.93377
##  [727]  90.64473 108.71217  91.76471  99.76521  91.37159 101.16251
##  [733] 105.75850 106.65232 116.75578  95.60873  95.87540  94.79502
##  [739]  96.66324 101.00168  95.71493  98.58164 107.03953 115.11477
##  [745]  99.21551  84.87076 123.89076 110.32125 101.67963 107.81301
##  [751]  88.08018 107.90979 107.38755 101.62025  90.89897 117.65833
##  [757] 111.14791 103.37189  96.41002  94.97868  87.30584  97.99323
##  [763] 102.85534 101.75863  85.72899 106.90862  91.75616 107.82262
##  [769]  84.83290  94.85330 105.97196  99.80232  93.68991  85.48891
##  [775]  89.58428  90.02835 111.21903 113.05724  99.08551 106.80445
##  [781]  88.95122  96.13628 109.78946  89.15774  62.54458  90.81345
##  [787]  98.53445  98.19076 103.10027 106.54530  88.88862  95.29535
##  [793]  90.30512  98.54305  95.19473  83.88163  87.94524  92.62788
##  [799] 104.27869  99.28754  96.86805 101.82959 107.52903  93.35269
##  [805] 106.41228 100.11820 102.48841  93.58471 103.47083 111.09802
##  [811]  94.14031  88.98698  89.90367  90.98467 101.24115 100.02536
##  [817] 112.51429 102.77307 104.21548 103.01230  71.90583 100.12601
##  [823] 131.56333  93.66857  99.07401  95.89005 107.98621  95.18859
##  [829] 108.67077  93.87066 109.09362 111.19760 100.66119  94.34462
##  [835] 107.70798  80.29972  99.52791  94.50355 111.65860  90.95046
##  [841]  93.21756 100.07376 100.35730  93.93458 104.34069  90.12275
##  [847]  74.53327 103.07963  91.02121 101.79373 109.63584  92.01718
##  [853] 110.92999  98.60941  97.36787  95.35979  98.81270  87.99608
##  [859]  86.47870  90.79713 108.23742 109.91838 108.75025 102.22932
##  [865]  99.18025  98.72041  92.49482 100.51223  89.00730 101.01744
##  [871] 103.71809  97.61022  76.59015  85.91915  87.48143  90.71922
##  [877] 117.54066  78.53025  90.28049  93.54765 107.45858  94.35606
##  [883]  80.02726 105.04882 106.47922 108.06118  95.04314 105.17478
##  [889] 101.40060  78.56790  91.14300  96.59432 118.11513  77.44749
##  [895]  92.80151  95.25889  91.46644  97.37139 106.88478  94.09061
##  [901]  90.01145  89.30300  77.35003  92.61591  86.92305  93.71598
##  [907]  89.55979 101.74293 105.85062  99.62268  98.64101 101.18876
##  [913] 119.52338  95.38601  89.51540  94.60488  96.51319  88.81678
##  [919] 106.23562 106.78903  94.03144 102.18041 112.24108  98.39054
##  [925] 106.84535 108.96451 112.67666  91.53740 110.09033 110.79811
##  [931]  83.68760  89.25613 103.06442  98.09487  97.21835 109.73094
##  [937]  94.68664 100.36444  96.67225  93.90149  94.00781  91.54598
##  [943]  94.53407  95.75691 106.62788  99.31083  98.53024  94.83721
##  [949]  90.39057  83.22747 105.18696  89.74320 104.52980 107.72122
##  [955]  97.58577 100.52888 104.33428 105.79831 108.36562 101.29283
##  [961] 108.19890 113.42559 112.55292 106.99181 102.11400 102.00406
##  [967] 113.30867  93.51107 108.29185 119.05476 101.68837  87.07109
##  [973] 104.49484  93.01161  84.90096  94.96558 101.27622 102.37744
##  [979]  92.96502  95.85629  92.47784 101.84452 117.47772  84.04540
##  [985]  97.62419 105.08796  98.42869  94.01004 110.01948  72.83519
##  [991]  97.38935  94.70860 110.44819 107.19172 105.28741  84.95692
##  [997] 105.86379 100.57024 103.67009  99.16341
  1. Before making any calculations, what would you guess the mean and standard deviations of samp_10 and samp_1000 are? If your predictions are the same, which vector’s mean and standard deviation do you expect to be closer to your predictions?
# The means "should" be 100 and the standard deviations "should" be 10.
# Because the sample size for samp_1000 is larger, I am more confident that its mean and standard deviation
#  will be closer to my expectation than those of samp_10
  1. Calculate the mean and standard deviations of samp_10 and samp_1000 separately. Was your prediction correct?
# Note: Due to random sampling, your results will not be identical to mine!
mean(samp_10)
## [1] 97.23009
sd(samp_10)
## [1] 8.794015
mean(samp_1000)
## [1] 99.73403
sd(samp_1000)
## [1] 9.279045
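A single pair of samples cannot show that larger samples are more reliable in general. One way to see the pattern is to repeat the sampling many times and compare how much the sample means vary. The sketch below uses replicate(), which has not been covered yet; the seed, the number of repetitions (500), and the names means_10 and means_1000 are assumptions made for this example.

```r
set.seed(1)  # Arbitrary seed so the simulation is reproducible

# Draw 500 samples of each size and store the mean of every sample
means_10   <- replicate(500, mean(rnorm(n = 10,   mean = 100, sd = 10)))
means_1000 <- replicate(500, mean(rnorm(n = 1000, mean = 100, sd = 10)))

# Spread of the sample means: a smaller spread means the sample means
# tend to land closer to the true mean of 100
sd(means_10)     # Theory suggests about 10 / sqrt(10), roughly 3.16
sd(means_1000)   # Theory suggests about 10 / sqrt(1000), roughly 0.32
```

The means based on 1,000 observations should cluster much more tightly around 100 than the means based on 10 observations.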
  1. Simulate 100 flips from a fair coin using sample() (Hint: include the arguments x = c("H", "T"), size = 100, replace = TRUE)
# Note: Due to random sampling, your results will not be identical to mine!
sample(x = c("H", "T"), size = 100, replace = TRUE)
##   [1] "H" "H" "H" "T" "H" "H" "T" "H" "T" "H" "T" "T" "H" "T" "T" "T" "H"
##  [18] "H" "H" "H" "H" "H" "T" "T" "T" "T" "H" "T" "T" "T" "T" "H" "T" "H"
##  [35] "T" "T" "T" "H" "T" "T" "H" "H" "T" "H" "H" "H" "H" "T" "H" "H" "T"
##  [52] "T" "T" "T" "H" "T" "H" "H" "T" "H" "H" "T" "H" "H" "T" "H" "H" "H"
##  [69] "T" "H" "T" "T" "H" "H" "T" "H" "H" "H" "H" "T" "H" "T" "H" "H" "T"
##  [86] "H" "H" "T" "H" "T" "T" "T" "H" "T" "T" "H" "H" "T" "H" "H"
  1. Simulate 100 flips from a biased coin where the probability of heads is 0.8 and the probability of tails is 0.2 (Hint: You can do this in two ways, either by including more heads than tails in the x argument, or by using the prob argument. Look at the help menu for the sample function for help.)
# Note: Due to random sampling, your results will not be identical to mine!
sample(x = c("H", "T"), size = 100, replace = TRUE, prob = c(.8, .2))
##   [1] "H" "T" "T" "H" "H" "H" "H" "H" "T" "T" "H" "H" "H" "H" "H" "H" "T"
##  [18] "T" "H" "H" "T" "T" "H" "H" "H" "H" "H" "H" "T" "H" "H" "H" "H" "T"
##  [35] "H" "H" "H" "H" "H" "H" "H" "H" "H" "T" "H" "H" "T" "H" "H" "H" "T"
##  [52] "T" "H" "H" "H" "H" "T" "H" "T" "T" "H" "H" "H" "H" "H" "H" "H" "H"
##  [69] "H" "H" "H" "H" "H" "H" "H" "H" "T" "H" "H" "H" "T" "H" "H" "H" "T"
##  [86] "H" "H" "T" "T" "H" "T" "T" "H" "H" "H" "H" "H" "H" "H" "H"
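The hint for the biased coin mentions two approaches. The sketch below shows both side by side and then checks the proportion of heads in each, which should land somewhere near 0.8. The seed and the names flips_prob and flips_x are assumptions made for this example.

```r
set.seed(42)  # Arbitrary seed for reproducibility

# Approach 1: use the prob argument
flips_prob <- sample(x = c("H", "T"), size = 100, replace = TRUE, prob = c(.8, .2))

# Approach 2: put four heads and one tail in x, so that P(H) = 4/5 = 0.8
flips_x <- sample(x = c("H", "H", "H", "H", "T"), size = 100, replace = TRUE)

# Proportion of heads in each simulation
mean(flips_prob == "H")
mean(flips_x == "H")
```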

Bonus: The Room with 100 Boxes

Here is a fun little risky decision-making game you can program in R using the sample() function. Imagine the following: there is a room with 100 boxes. 99 of the 100 boxes each contain 10 Thousand EUR, while 1 of the boxes contains a bomb.

Here’s the question…if you walked into the room with 100 boxes, how many would you want to open? If you don’t get the bomb, you can keep all of the money in the boxes you open. If you get the bomb, you get nothing (and die).

The code below will create a plot of the boxes game. If you’d like to, you could try running it in your R session to see the result.

# Plot of the Boxes Game

# Plotting space
plot(1, 
     xlim = c(0, 11), 
     ylim = c(0, 11),
     xlab = "", ylab = "", main = "The 100 Boxes Game!", 
     type = "n", xaxt = "n", yaxt = "n")

text(x = 5.5,
     y = 11, 
     labels = "There are 100 boxes\n99 / 100 contain 10,000 EUR (each) and 1 / 100 contains a bomb! How many will you open?", 
     font = 3,    # Italic font
     cex = .8)    # Slightly smaller font size

# Boxes
points(x = rep(1:10, times = 10), 
       y = rep(1:10, each = 10), 
       pch = 22, 
       cex = 4, 
       bg = sample(c(rep("green", 99), "red")))

# Labels
points(x = rep(1:10, times = 10), 
       y = rep(1:10, each = 10), 
       pch = "?")

Here’s how you can play the boxes game in R. First, create the room as an object room_100 which contains a vector with 99 values of 10 (representing 10 Thousand Euros) and one value of negative infinity (-Inf) which represents the bomb.

# This vector represents the room of 100 boxes
room_100 <- c(rep(10, 99), -Inf)

Next, put the number of boxes you want to open in a new scalar object called open:

open <- 0  # How many do you want to open?

Now run the following code to see what you get!

# Play the Room with 100 Boxes Game!

result <- sample(x = room_100,  # Sample from the room...
                 size = open)

# Print what you got!

result       # Show what's in each box (10 means 10,000 EUR)
sum(result)  # Your total winnings!
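Before playing, it can help to work out the odds. Because the single bomb is equally likely to be in any box, the chance that it is not among the boxes you open is (100 - k) / 100 when you open k boxes. The sketch below turns this into expected winnings, under the simplifying assumption that hitting the bomb is worth 0 rather than -Inf. The names n_open, p_survive, winnings, and expected are introduced here for illustration.

```r
n_open <- 0:100                      # Every possible number of boxes to open

p_survive <- (100 - n_open) / 100    # Chance the bomb is NOT among the opened boxes
winnings  <- 10 * n_open             # Thousand EUR earned if you survive

# Expected winnings, assuming the bomb simply pays 0
expected <- p_survive * winnings

n_open[which.max(expected)]          # Number of boxes with the highest expected winnings
max(expected)                        # Expected winnings (in Thousand EUR) at that point
```

Under this assumption the expected winnings peak at 50 boxes, but that choice carries a 50% chance of hitting the bomb, so how many boxes you would actually open depends on how you weigh the risk.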

You can also represent the boxes game by writing your own custom function in R. Run the following chunk to create the new function boxes_game. The code uses advanced functions like if() and function() that we haven’t learned yet, but feel free to take a closer look to try to understand the logic.

When you run the following code, ‘nothing’ will happen. But in fact, you are defining a new function called boxes_game that you can use later to actually play the game.

# Run this chunk to create the function
boxes_game <- function(open, room) {

  # Outcome if no boxes are opened
  if (open == 0) {
    print("You didn't open any boxes! You earned nothing but are still alive")
  }

  # If at least 1 box is opened...
  if (open > 0) {

    # Calculate the result
    result <- sample(x = room, size = open)

    # If -Inf (the bomb) is in the result...
    if (-Inf %in% result) {
      print(paste("You're dead!!! You opened ", open,
                  " boxes and got the bomb!!!", sep = ""))
    }

    # If -Inf (the bomb) is NOT in the result...
    if ((-Inf %in% result) == FALSE) {
      print(paste("Congratulations!!! You opened ", open,
                  " boxes and earned ", sum(result),
                  " Thousand Euros! Don't you want to play again? :)", sep = ""))
    }
  }
}

Now you have defined the new function boxes_game(). To play the game, call the function with its two arguments: open is the number of boxes you want to open, and room defines the room! For example, here’s how you’d play the game by opening 5 boxes in the room with 100 boxes:

# Play boxes game with 5 boxes in room_100

boxes_game(open = 5, 
           room = room_100)  

Play the game a few times and see how you do. When you are done, try creating another room called room_risky that contains only 10 values: 9 values of 1000 Thousand (aka, 1 Million) and 1 bomb. Try playing the game in this room a few times and see how your results change.
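One possible way to set up the riskier room follows the same pattern as room_100. The sketch below builds the vector and opens a few boxes directly with sample(); the choice of 3 boxes and the seed are arbitrary values picked for this example.

```r
# 9 boxes worth 1000 (Thousand EUR, i.e., 1 Million each) and 1 bomb
room_risky <- c(rep(1000, 9), -Inf)

length(room_risky)   # 10 boxes in total

# Open 3 boxes at random (3 is just an example value)
set.seed(7)          # Arbitrary seed for reproducibility
result <- sample(x = room_risky, size = 3)

result               # What's in each opened box
sum(result)          # Total winnings; -Inf if the bomb was among the opened boxes
```

With only 10 boxes, each box you open carries a much higher chance of hitting the bomb than in room_100.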

Submit!