Skip to contents

Introduction

This vignette verifies the accuracy of the sample_size_nbinom function by comparing its theoretical predictions for average exposure, statistical information, and power against results from a large-scale simulation study.

We specifically test a scenario with:

  • Piecewise constant accrual rates.
  • Piecewise exponential dropout (constant in this example).
  • Negative binomial outcomes.
  • Fixed follow-up design.

Simulation design

The study sample size suggested by the Friede method is used to evaluate the accuracy of that method. The parameters for both the theoretical calculation and the simulation study are as follows:

Parameters

  • Rates: λ1=0.4\lambda_1 = 0.4 (Control), λ2=0.3\lambda_2 = 0.3 (Experimental).
  • Dispersion: k=0.5k = 0.5.
  • Power: 90%.
  • Alpha: 0.025 (one-sided).
  • Dropout: 10% per year adjusted to monthly rate (δ=0.1/12\delta = 0.1 / 12).
  • Trial Duration: 24 months.
  • Max Follow-up: 12 months.
  • Event Gap: 30 days (approx 0.082 years).
  • Accrual: Piecewise linear ramp-up over 12 months (Rate RR for 0-6mo, 2R2R for 6-12mo).

Theoretical calculation

First, we calculate the required sample size and expected properties using sample_size_nbinom.

# Parameters
lambda1 <- 0.4
lambda2 <- 0.3
dispersion <- 0.5
power <- 0.9
alpha <- 0.025
dropout_rate <- 0.1 / 12
max_followup <- 12
trial_duration <- 24
event_gap <- 20 / 30.42 # 20 days

# Accrual targeting 90% power
# We provide relative rates (1:2) and the function scales them to achieve power
accrual_rate_rel <- c(1, 2)
accrual_duration <- c(6, 6)

design <- sample_size_nbinom(
  lambda1 = lambda1, lambda2 = lambda2, dispersion = dispersion,
  power = power,
  alpha = alpha, sided = 1,
  accrual_rate = accrual_rate_rel,
  accrual_duration = accrual_duration,
  trial_duration = trial_duration,
  dropout_rate = dropout_rate,
  max_followup = max_followup,
  event_gap = event_gap,
  method = "friede"
)

# Extract calculated absolute accrual rates
accrual_rate <- design$accrual_rate

print(design)
#> Sample size for negative binomial outcome
#> ==========================================
#> 
#> Method:          friede
#> Sample size:     n1 = 211, n2 = 211, total = 422
#> Expected events: 1366.9 (n1: 763.1, n2: 603.8)
#> Power: 90%, Alpha: 0.025 (1-sided)
#> Rates: control = 0.4000, treatment = 0.3000 (RR = 0.7500)
#> Dispersion: 0.5000, Avg exposure (calendar): 11.42
#> Avg exposure (at-risk): n1 = 9.04, n2 = 9.54
#> Event gap: 0.66
#> Dropout rate: 0.0083
#> Accrual: 12.0, Trial duration: 24.0
#> Max follow-up: 12.0

Simulation results

We simulated 3,600 trials using the parameters defined above from the Friede method. This number of simulations was chosen to achieve a standard error for the power estimate of approximately 0.005 when the true power is 90% (0.9×0.1/3600=0.005\sqrt{0.9 \times 0.1 / 3600} = 0.005). The simulation script is located in data-raw/generate_simulation_data.R.

# Load pre-computed simulation results
results_file <- system.file("extdata", "simulation_results.rds", package = "gsDesignNB")

if (results_file == "" && file.exists("../inst/extdata/simulation_results.rds")) {
  results_file <- "../inst/extdata/simulation_results.rds"
}

if (results_file != "") {
  sim_data <- readRDS(results_file)
  results <- sim_data$results
  design_ref <- sim_data$design
} else {
  # Fallback if data is not available (e.g. not installed yet)
  # This block allows the vignette to build without the data, but warns.
  warning("Simulation results not found. Skipping verification plots.")
  results <- NULL
  design_ref <- design
}

1. Average exposure verification

We compare the theoretical average exposure calculated by sample_size_nbinom with the observed average exposure in the simulation.

# Theoretical Exposure
theo_exposure <- design_ref$exposure

# Observed Exposure (Average across all trials and arms)
# Note: Exposure is the same for both arms in this design (randomized)
obs_exposure <- mean(c(results$exposure_control, results$exposure_experimental))

comparison_exp <- data.frame(
  Metric = "Average Exposure",
  Theoretical = theo_exposure,
  Simulated = obs_exposure,
  Difference = obs_exposure - theo_exposure,
  Rel_Diff_Pct = 100 * (obs_exposure - theo_exposure) / theo_exposure
)

knitr::kable(comparison_exp, digits = 4, caption = "Comparison of Average Exposure")
Comparison of Average Exposure
Metric Theoretical Simulated Difference Rel_Diff_Pct
Average Exposure 11.4195 11.3598 -0.0597 -0.5232

The theoretical exposure should closely match the simulated average exposure.

2. Statistical information and variance

The theoretical variance of the log rate ratio estimator (Wald test) is given by:

Vtheo=1/μ1+kadjn1+1/μ2+kadjn2 V_{theo} = \frac{1/\mu_1 + k_{adj}}{n_1} + \frac{1/\mu_2 + k_{adj}}{n_2}

where μi=λi,efft\mu_i = \lambda_{i,eff} \bar{t} is the expected number of events per subject in group ii (using effective rates adjusted for event gaps), and kadj=kQk_{adj} = k \cdot Q is the dispersion parameter inflated for variable follow-up.

The dispersion parameter kk directly increases the variance of the estimator. In a standard Poisson model, k=0k=0, and the variance depends only on the expected number of events. With negative binomial dispersion (k>0k > 0), the variance is inflated, reflecting the extra variability (overdispersion) in the data.

We compare this with the variance of the estimates from the simulation.

# Theoretical Variance (from design object)
theo_var <- design_ref$variance

# Empirical Variance of the Log Hazard Ratio estimates
emp_var <- var(results$estimate, na.rm = TRUE)

# Average Squared Standard Error (mean of the estimated variances)
avg_se_sq <- mean(results$se^2, na.rm = TRUE)

comparison_var <- data.frame(
  Metric = c("Variance of Estimator"),
  Theoretical = theo_var,
  Empirical_Var = emp_var,
  Avg_Estimated_Var = avg_se_sq
)

knitr::kable(comparison_var, digits = 5, caption = "Comparison of Variance")
Comparison of Variance
Metric Theoretical Empirical_Var Avg_Estimated_Var
Variance of Estimator 0.00786 0.00755 0.00728
  • Empirical Var: The actual variability of the estimated log rate ratios across 3,600 trials.
  • Avg Estimated Var: The average of the variance estimates (SE2SE^2) produced by the Wald test in each trial.

While there is reasonable agreement, the sample size formula may slightly underestimate the variability of the test statistic and thus may overstate power.

3. Power verification

Finally, we compare the theoretical power with the empirical power (proportion of trials rejecting the null hypothesis).

# Theoretical Power
theo_power <- design_ref$power

# Empirical Power
emp_power <- mean(results$p_value < design_ref$inputs$alpha, na.rm = TRUE)

comparison_pwr <- data.frame(
  Metric = "Power",
  Theoretical = theo_power,
  Simulated = emp_power,
  Difference = emp_power - theo_power
)

knitr::kable(comparison_pwr, digits = 4, caption = "Comparison of Power")
Comparison of Power
Metric Theoretical Simulated Difference
Power 0.9 0.8739 -0.0261

A 95% confidence interval for the empirical power can be calculated to check if the theoretical power falls within the simulation error bounds.

binom.test(sum(results$p_value < design_ref$inputs$alpha, na.rm = TRUE), nrow(results))$conf.int
#> [1] 0.8626014 0.8845653
#> attr(,"conf.level")
#> [1] 0.95

Conclusion

The simulation results confirm that sample_size_nbinom reasonably predicts average exposure, variance (information), and power for this complex design with piecewise accrual and dropout. However, given the slight underpowering that the simulation study suggests, it may be useful to consider a larger sample size than the Friede approximation suggests, with power verified by simulation.