Power and Sample Size Through Simulation

Erin M. Buchanan

Harrisburg University

Libraries

library(faux) # cran
library(dplyr) # cran 
library(ggplot2) # cran
library(tidyr) # cran 

Power

  • Power is the probability of detecting a specific effect, given that the effect exists
  • Power is a complex topic affected by:
    • Effect size
    • Sample size
    • Study design
    • Alpha
    • And more!

Calculating Power

  • Popular programs like G*Power or Shiny apps
    • https://arcaldwell49.shinyapps.io/anova-exact/
    • https://sempower.shinyapps.io/sempower/
    • https://designingexperiments.com/
  • R packages like pwr and many others for specific design types
  • But what if you want to control for complex factors?
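For simple designs, the pwr package gives a quick analytic answer. A minimal sketch, assuming a one-way ANOVA with three groups, a medium effect (Cohen's f = 0.25), and the usual alpha and power targets:

```r
library(pwr) # cran

# Analytic power for a one-way ANOVA:
# k = number of groups, f = Cohen's f effect size
# Leaving n unspecified asks pwr to solve for sample size per group
pwr.anova.test(k = 3, f = 0.25, sig.level = .05, power = .80)
```

This works well when the design matches a textbook test, but it cannot account for the covariates and correlations simulated below.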

The faux Package

  • https://debruine.github.io/faux/ (click me)
  • So many great examples and excellent documentation
  • You can completely simulate data or simulate based on current data
  • Great for teaching examples as well!

The Mandela Effect

The Mandela Effect

  • A recent(ish) publication in Psychological Science examined the Mandela effect
  • “The Mandela effect is an Internet phenomenon describing shared and consistent false memories for specific icons in popular culture. The visual Mandela effect is a Mandela effect specific to visual icons (e.g., the Monopoly Man is falsely remembered as having a monocle) and has not yet been empirically quantified or tested. In Experiment 1 (N = 100 adults), we demonstrated that certain images from popular iconography elicit consistent, specific false memories.”
  • https://psyarxiv.com/nzh3s/

Simulate a Study

  • Dependent variable: Accuracy in guessing the right picture
  • Independent variable: Confidence in your choice
  • Independent variable: Familiarity with the choice
  • Independent variable: Test group

Starting Numbers

  • This is the hard part: figure out what numbers sound reasonable for results in your study
  • Find a starting set of numbers and then vary around them

Starting Numbers

  • Groups:

    • No knowledge of effect: M Accuracy = 50% (SD = 10%)
    • Heard of the effect: M Accuracy = 55% (SD = 10%)
    • Have read too many Reddit threads: M Accuracy = 60% (SD = 10%)
    • These means correspond to d = 0.5 for the adjacent group comparisons: \(\frac{55-50}{10} = 0.5\)
    • You have to determine if you think these groups vary on the other variables
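The effect sizes implied by these starting numbers can be checked directly in R:

```r
# Cohen's d: difference in means over the (common) standard deviation
(55 - 50) / 10 # None vs. Medium
#> 0.5
(60 - 50) / 10 # None vs. Reddit
#> 1
```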

Starting Numbers

  • Confidence:
    • Overall mean: 60% (SD = 10%)
    • Correlation with Accuracy: r = .60
  • Familiarity
    • Overall mean: 40% (SD = 10%)
    • Correlation with Accuracy: r = .40
  • Inter-correlation of confidence and familiarity: r = .60

Make the data with faux

df <- sim_design(
  # make between subjects groups 
  between = list(group = c("None", "Medium", "Reddit")),
  # make continuous variables everyone sees
  # always list DV first 
  within = list(vars = c("Accuracy", "Confidence", "Familiarity")), 
  # make the means for each group 
  mu = list(None = c(Accuracy = 50, Confidence = 60, Familiarity = 40),
            Medium = c(Accuracy = 55, Confidence = 60, Familiarity = 40),
            Reddit = c(Accuracy = 60, Confidence = 60, Familiarity = 40)), 
  # make the standard deviations for each group
  sd = list(None = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
            Medium = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
            Reddit = c(Accuracy = 10, Confidence = 10, Familiarity = 10)), 
  # make the correlations, based on "vars"
  # accuracy-con, accuracy-fam, con-fam
  r = list(None = c(.6, .4, .6),
           Medium = c(.6, .4, .6),
           Reddit = c(.6, .4, .6)), 
  n = 10, # sample size per group
  plot = TRUE
)
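Before fitting any models, it helps to confirm that the simulated data reproduce the intended parameters. A quick check (with only n = 10 per group, estimates will be noisy):

```r
# Compare simulated means and SDs to the specified parameters
df %>% 
  group_by(group) %>% 
  summarize(M_acc = mean(Accuracy), SD_acc = sd(Accuracy),
            M_conf = mean(Confidence), M_fam = mean(Familiarity))

# Compare the simulated correlation matrix to the specified r values
cor(df[ , c("Accuracy", "Confidence", "Familiarity")])
```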

Make the data with faux

(The design plot produced by plot = TRUE appears here, displaying the specified group means and correlations.)

Write Your Analysis

overall_model <- lm(Accuracy ~ group + Confidence + Familiarity,
                    data = df)
summary(overall_model)

Call:
lm(formula = Accuracy ~ group + Confidence + Familiarity, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-14.5205  -5.2440  -0.0029   4.0435  14.4801 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.8332    10.1243   0.082 0.935067    
groupMedium  10.0005     3.4335   2.913 0.007442 ** 
groupReddit  13.7969     3.3853   4.076 0.000408 ***
Confidence    0.9112     0.1637   5.565  8.7e-06 ***
Familiarity  -0.1748     0.1675  -1.044 0.306611    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.533 on 25 degrees of freedom
Multiple R-squared:  0.6683,    Adjusted R-squared:  0.6152 
F-statistic: 12.59 on 4 and 25 DF,  p-value: 9.558e-06
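The simulations below only need the p-values from this coefficient table, which can be pulled out of the model summary directly:

```r
# The fourth column of the coefficient table holds the p-values,
# in order: intercept, groupMedium, groupReddit, Confidence, Familiarity
summary(overall_model)$coefficients[ , 4]
```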

Simulate Starting Numbers

  • Note: There are lots of ways to do this that are potentially faster
    • You could generate all the datasets and then loop over that list
    • Check out parallel functions in R
    • But I think well in loops so :shrug:

Simulate Starting Numbers

# set the number of sims
number_sims <- 100
# a place to hold the data 
results <- data.frame(
  "intercept" = 1:number_sims,
  "medium" = 1:number_sims, 
  "reddit" = 1:number_sims,
  "confidence" = 1:number_sims,
  "familiar" = 1:number_sims
)

Simulate Starting Numbers

# build the simulation loop
for (i in 1:number_sims){
  
  df <- sim_design(
  # make between subjects groups 
  between = list(group = c("None", "Medium", "Reddit")),
  # make continuous variables everyone sees
  # always list DV first 
  within = list(vars = c("Accuracy", "Confidence", "Familiarity")), 
  # make the means for each group 
  mu = list(None = c(Accuracy = 50, Confidence = 60, Familiarity = 40),
            Medium = c(Accuracy = 55, Confidence = 60, Familiarity = 40),
            Reddit = c(Accuracy = 60, Confidence = 60, Familiarity = 40)), 
  # make the standard deviations for each group
  sd = list(None = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
            Medium = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
            Reddit = c(Accuracy = 10, Confidence = 10, Familiarity = 10)), 
  # make the correlations, based on "vars"
  # accuracy-con, accuracy-fam, con-fam
  r = list(None = c(.6, .4, .6),
           Medium = c(.6, .4, .6),
           Reddit = c(.6, .4, .6)), 
  n = 10, 
  plot = FALSE)
  
  overall_model <- lm(Accuracy ~ group + Confidence + Familiarity,
                    data = df)
  
  results[i, ] <- summary(overall_model)$coefficients[ , 4]
}

View Results

alpha <- .05
results %>% 
  summarize(medium_p = sum(medium <= alpha), 
            reddit_p = sum(reddit <= alpha), 
            confidence_p = sum(confidence <= alpha), 
            familiar_p = sum(familiar <= alpha))
  medium_p reddit_p confidence_p familiar_p
1       23       71           79          6
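Because number_sims is 100, these counts read directly as percentages. With any other number of simulations, mean() on the logical comparison returns power as a proportion:

```r
# mean() of a logical vector gives the proportion of significant results
results %>% 
  summarize(medium_p = mean(medium <= alpha), 
            reddit_p = mean(reddit <= alpha), 
            confidence_p = mean(confidence <= alpha), 
            familiar_p = mean(familiar <= alpha))
```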

Expand to Sample Size

# loop over sample sizes
sample_sizes <- seq(from = 10, to = 100, by = 10)
summary_by_sample <- data.frame(
  "medium_p" = 1:length(sample_sizes),
  "reddit_p" = 1:length(sample_sizes),
  "confidence_p" = 1:length(sample_sizes),
  "familiar_p" = 1:length(sample_sizes)
)

# build the sample loop
for (p in 1:length(sample_sizes)){
  # build the simulation loop
  for (i in 1:number_sims){
    
    df <- sim_design(
    # make between subjects groups 
    between = list(group = c("None", "Medium", "Reddit")),
    # make continuous variables everyone sees
    # always list DV first 
    within = list(vars = c("Accuracy", "Confidence", "Familiarity")), 
    # make the means for each group 
    mu = list(None = c(Accuracy = 50, Confidence = 60, Familiarity = 40),
              Medium = c(Accuracy = 55, Confidence = 60, Familiarity = 40),
              Reddit = c(Accuracy = 60, Confidence = 60, Familiarity = 40)), 
    # make the standard deviations for each group
    sd = list(None = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
              Medium = c(Accuracy = 10, Confidence = 10, Familiarity = 10),
              Reddit = c(Accuracy = 10, Confidence = 10, Familiarity = 10)), 
    # make the correlations, based on "vars"
    # accuracy-con, accuracy-fam, con-fam
    r = list(None = c(.6, .4, .6),
             Medium = c(.6, .4, .6),
             Reddit = c(.6, .4, .6)), 
    n = sample_sizes[p], 
    plot = FALSE)
    
    overall_model <- lm(Accuracy ~ group + Confidence + Familiarity,
                      data = df)
    
    results[i, ] <- summary(overall_model)$coefficients[ , 4]
  }
  
  summary_by_sample[p , ] <- results %>% 
    summarize(medium_p = sum(medium <= alpha), 
              reddit_p = sum(reddit <= alpha), 
              confidence_p = sum(confidence <= alpha), 
              familiar_p = sum(familiar <= alpha))
}

summary_by_sample$sample_n <- sample_sizes 

View Results

summary_by_sample
   medium_p reddit_p confidence_p familiar_p sample_n
1        30       72           74          7       10
2        41       97           99          3       20
3        64      100          100          6       30
4        83      100          100         16       40
5        88      100          100         10       50
6        93      100          100         14       60
7        97      100          100         17       70
8        99      100          100         15       80
9        99      100          100         14       90
10       98      100          100         21      100

View Results

summary_by_sample %>% 
  pivot_longer(cols = -sample_n, 
              names_to = "Variable", 
              values_to = "Power") %>% 
ggplot(., aes(sample_n, Power, color = Variable)) + 
  geom_point() + 
  theme_classic() + 
  xlab("Sample Size") + 
  geom_hline(yintercept = 80) +
  theme(text = element_text(size = 20))

View Results

(The ggplot of power by sample size appears here, with the horizontal line marking 80% power.)

Final Thoughts

  • These effect sizes are large; consider cutting them all in half to account for the tendency of published effect sizes to shrink on replication
  • You can also extend these ideas to test various combinations of means, standard deviations, and correlations
  • Use this type of code to pre-register your power analysis
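As a sketch of the "cut in half" suggestion, only the mu list needs to change; halving each group difference halves d (here, d = 0.25 for adjacent groups):

```r
# Halved effect sizes: group differences of 2.5 instead of 5
# (d = 2.5 / 10 = 0.25 for adjacent group comparisons)
mu_half <- list(None   = c(Accuracy = 50,   Confidence = 60, Familiarity = 40),
                Medium = c(Accuracy = 52.5, Confidence = 60, Familiarity = 40),
                Reddit = c(Accuracy = 55,   Confidence = 60, Familiarity = 40))

# Pass mu = mu_half to sim_design() and rerun the sample size loop
```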