
Introduction

This vignette demonstrates a group sequential trial with a single interim analysis where unblinded sample size re-estimation (SSR) is performed.

We simulate a scenario where the control event rate is lower than assumed and the dispersion is higher than assumed. Both factors lead to slower information accumulation (or higher variance) than planned. We will show how to:

  1. Monitor information accumulation at an interim analysis.
  2. Re-estimate the required sample size (or duration) to achieve the target power.
  3. Adjust the final analysis bounds if the final information differs from the plan.

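The examples below assume the following packages are attached. The negative binomial design and re-estimation helpers (sample_size_nbinom(), gsNBCalendar(), nb_sim(), cut_data_by_date(), blinded_ssr(), unblinded_ssr()) are assumed here to come from gsDesignNB; the remaining functions are provided by gsDesign, gt, and MASS.

library(gsDesignNB) # assumed home of the negative binomial design/SSR helpers
library(gsDesign)   # gsDesign(), gsBoundSummary(), sfHSD, sfLDOF
library(gt)         # gt(), tab_header() for table formatting
library(MASS)       # glm.nb()
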
Trial setup and initial design

Planned parameters:

  • Control rate (\(\lambda_1\)): 0.1 events/month
  • Experimental rate (\(\lambda_2\)): 0.075 events/month (Hazard Ratio = 0.75)
  • Dispersion (\(k\)): 0.5
  • Power: 90%
  • One-sided Type I error (\(\alpha\)): 0.025
  • Enrollment: 20 patients/month for 12 months (Total N = 240)
  • Study duration: 24 months

Actual parameters (simulation truth):

  • Control rate (\(\lambda_1\)): 0.08 events/month (Lower than planned)
  • Experimental rate (\(\lambda_2\)): 0.06 events/month (HR = 0.75 maintained)
  • Dispersion (\(k\)): 0.65 (Higher than planned)

Initial sample size calculation

# Planned parameters
lambda1_plan <- 0.1
lambda2_plan <- 0.075
k_plan <- 0.5
power_plan <- 0.9
alpha_plan <- 0.025
accrual_rate_plan <- 20
accrual_dur_plan <- 12
trial_dur_plan <- 24

# Calculate sample size
design_plan <- sample_size_nbinom(
  lambda1 = lambda1_plan,
  lambda2 = lambda2_plan,
  dispersion = k_plan,
  power = power_plan,
  alpha = alpha_plan,
  accrual_rate = accrual_rate_plan,
  accrual_duration = accrual_dur_plan,
  trial_duration = trial_dur_plan,
  max_followup = 12
)
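
As a rough hand check of the magnitude (not the package's exact calculation, which accounts for the accrual and follow-up pattern), the standard large-sample approximation for comparing two negative binomial rates, assuming every patient contributes the planned average exposure of 12 months, gives about 374 patients per arm:

# Rough hand check (not the package's exact calculation): large-sample
# approximation assuming every patient has the planned average exposure of
# 12 months. The per-patient-pair variance of the log rate ratio is
# 1/(lambda1*t) + 1/(lambda2*t) + 2*k.
t_bar <- 12
v_plan <- 1 / (lambda1_plan * t_bar) + 1 / (lambda2_plan * t_bar) + 2 * k_plan
n_per_arm <- (qnorm(1 - alpha_plan) + qnorm(power_plan))^2 * v_plan /
  log(lambda2_plan / lambda1_plan)^2
ceiling(2 * n_per_arm) # approximately 748, in line with design_plan$n_total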

# Convert to group sequential design
gs_plan <- design_plan |> 
   gsNBCalendar(
     k = 2,
     test.type = 4, # Non-binding futility
     alpha = alpha_plan,
     sfu = sfHSD,
     sfupar = -2, # Moderately aggressive alpha-spending
     sfl = sfHSD,
     sflpar = 1, # Pocock-like futility spending
     analysis_times = c(accrual_dur_plan - 2, trial_dur_plan)
   ) |> gsDesignNB::toInteger() # Round to integer sample size
summary(gs_plan)
#> Asymmetric two-sided with non-binding futility bound group sequential design
#> for negative binomial outcomes, 2 analyses, total sample size 882.0, 90 percent
#> power, 2.5 percent (1-sided) Type I error. Control rate 0.1000, treatment rate
#> 0.0750, risk ratio 0.7500, dispersion 0.5000. Accrual duration 12.0, trial
#> duration 24.0, max follow-up 12.0, average exposure 12.00. Randomization ratio
#> 1:1. Upper spending: Hwang-Shih-DeCani (gamma = -2) Lower spending:
#> Hwang-Shih-DeCani (gamma = 1)
gsBoundSummary(gs_plan,
    deltaname = "RR",
    logdelta = TRUE,
    Nname = "Information",
    timename = "Month",
    digits = 4,
    ddigits = 2) |> gt() |>
  tab_header(
    title = "Group Sequential Design Bounds for Negative Binomial Outcome",
    subtitle = paste0(
      "N = ", ceiling(gs_plan$n_total[gs_plan$k]),
      ", Expected events = ", round(gs_plan$nb_design$total_events, 1)
    )
  )
Group Sequential Design Bounds for Negative Binomial Outcome
N = 882, Expected events = 785.4

Analysis              Value                Efficacy  Futility
IA 1: 43%             Z                      2.5498    0.7234
  Information: 64.85  p (1-sided)            0.0054    0.2347
  Month: 10           ~RR at bound           0.7286    0.9141
                      P(Cross) if RR=1       0.0054    0.7653
                      P(Cross) if RR=0.75    0.4077    0.0556
Final                 Z                      2.0152    2.0152
  Information: 149.77 p (1-sided)            0.0219    0.0219
  Month: 24           ~RR at bound           0.8481    0.8481
                      P(Cross) if RR=1       0.0219    0.9781
                      P(Cross) if RR=0.75    0.9027    0.0973

Simulation

We simulate the trial with the same rate ratio, but lower actual event rates and higher dispersion. To avoid an unbounded increase, we cap the re-estimated sample size at 40% above the planned size.
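
For concreteness, such a cap could be written as follows (an illustrative sketch only; n_max is a hypothetical object and is not used elsewhere in this vignette):

# Illustrative cap at 40% above the planned sample size (hypothetical;
# the re-estimation below is not constrained by it)
n_max <- ceiling(1.4 * design_plan$n_total)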

set.seed(1234)

# Actual parameters
lambda1_true <- 0.08 # Lower event rates, same rr
lambda2_true <- 0.06
k_true <- 0.65 # Higher dispersion

# Enrollment and rates for simulation
# We simulate a larger pool to allow for potential sample size increase
enroll_rate <- data.frame(rate = accrual_rate_plan, duration = accrual_dur_plan * 2) 
fail_rate <- data.frame(
  treatment = c("Control", "Experimental"),
  rate = c(lambda1_true, lambda2_true),
  dispersion = k_true
)
dropout_rate <- data.frame(
  treatment = c("Control", "Experimental"),
  rate = c(0, 0),
  duration = c(100, 100)
)

sim_data <- nb_sim(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  dropout_rate = dropout_rate,
  max_followup = trial_dur_plan, 
  n = 600 
)

# Limit to planned enrollment for the initial cut
# We will "open" more enrollment if needed
sim_data_planned <- sim_data[1:ceiling(design_plan$n_total), ]

Interim analysis

We perform an interim analysis at Month 10, which is 2 months prior to the end of planned enrollment (Month 12).

interim_time <- 10
interim_data <- cut_data_by_date(sim_data_planned, cut_date = interim_time)

# Summary
table(interim_data$treatment)
#> 
#>      Control Experimental 
#>           99           98
sum(interim_data$events)
#> [1] 69
mean(interim_data$tte)
#> [1] 5.086404

Information computation

We calculate the statistical information accumulated so far.

  • Blinded information: calculated assuming a common event rate and dispersion across arms; often used for monitoring without unblinding treatment assignments.
  • Unblinded information: calculated from the observed rates and dispersion in each group. This is the estimated Fisher information for the log rate ratio (computed directly in the sketch below),

\[ \mathcal{I} = \frac{1}{\text{Var}(\hat{\beta}_{\text{trt}})} \]
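
One way to compute the unblinded information directly is to fit the same negative binomial model used at the final analysis and invert the variance of the treatment coefficient (a sketch; the unblinded_ssr() helper below performs this, or an equivalent, calculation internally):

# Sketch: unblinded information from a negative binomial fit to the interim data
fit_ia <- MASS::glm.nb(events ~ treatment + offset(log(tte)), data = interim_data)
1 / vcov(fit_ia)["treatmentExperimental", "treatmentExperimental"]
# compare with ssr_res$unblinded_info reported below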

# Blinded SSR
blinded_res <- blinded_ssr(
  data = interim_data,
  ratio = 1,
  lambda1_planning = lambda1_plan,
  lambda2_planning = lambda2_plan,
  power = power_plan,
  alpha = alpha_plan,
  accrual_rate = accrual_rate_plan,
  accrual_duration = accrual_dur_plan,
  trial_duration = trial_dur_plan
)

# Unblinded SSR
ssr_res <- unblinded_ssr(
  data = interim_data,
  ratio = 1,
  lambda1_planning = lambda1_plan,
  lambda2_planning = lambda2_plan,
  power = power_plan,
  alpha = alpha_plan,
  accrual_rate = accrual_rate_plan,
  accrual_duration = accrual_dur_plan,
  trial_duration = trial_dur_plan
)

cat("Blinded SSR N:", ceiling(blinded_res$n_total_blinded), "\n")
#> Blinded SSR N: 914
cat("Unblinded SSR N:", ceiling(ssr_res$n_total_unblinded), "\n")
#> Unblinded SSR N: 888
print(ssr_res)
#> $n_total_unblinded
#> (Intercept) 
#>         888 
#> 
#> $dispersion_unblinded
#> [1] 0.8891071
#> 
#> $lambda1_unblinded
#> (Intercept) 
#>  0.07878905 
#> 
#> $lambda2_unblinded
#> (Intercept) 
#>  0.05726336 
#> 
#> $info_fraction
#> [1] 0.0950141
#> 
#> $unblinded_info
#> [1] 12.06309
#> 
#> $target_info
#> [1] 126.9611

The estimated control rate is 0.079, lower than the planned 0.1, and the estimated dispersion is 0.89, higher than the planned 0.5. The information fraction is only 0.095, far lower than planned at this stage of the trial.
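
Up to rounding, the reported target information appears to be the fixed-design information required for 90% power against the planned rate ratio of 0.75 at one-sided \(\alpha = 0.025\):

# Fixed-design information needed for the planned effect (about 126.96)
((qnorm(1 - alpha_plan) + qnorm(power_plan)) / log(lambda2_plan / lambda1_plan))^2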

Sample size re-estimation

Since the control rate is lower and the dispersion is higher than planned, information accumulates more slowly than anticipated. To maintain power, we need to increase the sample size, the follow-up duration, or both.

The unblinded_ssr function estimates the required total sample size to be 888. We also ensure the re-estimated sample size is never smaller than the planned sample size.
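
Applying the same rough approximation as before, but with the re-estimated nuisance parameters and the observed rate ratio, lands close to the reported figure; the helper's exact calculation may differ, for example in how exposure is projected:

# Rough check with the re-estimated parameters (12 months average exposure assumed)
v_new <- 1 / (ssr_res$lambda1_unblinded * 12) +
  1 / (ssr_res$lambda2_unblinded * 12) + 2 * ssr_res$dispersion_unblinded
ceiling(2 * (qnorm(1 - alpha_plan) + qnorm(power_plan))^2 * v_new /
  log(ssr_res$lambda2_unblinded / ssr_res$lambda1_unblinded)^2)
# close to, though not necessarily identical to, the 888 reported above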

n_planned <- ceiling(design_plan$n_total)
n_estimated <- ceiling(ssr_res$n_total_unblinded)

# Ensure we don't decrease sample size
n_final <- max(n_planned, n_estimated)

if (n_final > n_planned) {
  cat("Increasing sample size from", n_planned, "to", n_final, "\n")
  
  # Calculate new durations based on constant accrual rate
  # We extend enrollment to reach the new target N
  new_accrual_dur <- n_final / accrual_rate_plan
  
  # We maintain the planned max follow-up of 12 months
  # So the trial duration extends by the same amount as the accrual duration
  new_trial_dur <- new_accrual_dur + 12 
  
  cat("New Accrual Duration:", round(new_accrual_dur, 2), "months\n")
  cat("New Trial Duration:", round(new_trial_dur, 2), "months\n")
} else {
  cat("No sample size increase needed.\n")
  new_accrual_dur <- accrual_dur_plan
  new_trial_dur <- trial_dur_plan
  n_final <- n_planned
}
#> Increasing sample size from 748 to 888 
#> New Accrual Duration: 44.4 months
#> New Trial Duration: 56.4 months

Final analysis

We proceed to the final analysis at Month 56.4. We include the additional patients if the sample size was increased.

# Update the dataset to include the additional patients
sim_data_final <- sim_data[1:n_final, ]

# Cut at final analysis time
final_data <- cut_data_by_date(sim_data_final, cut_date = new_trial_dur)

# Average exposure
mean(final_data$tte)
#> [1] 23.95368

# Fit final model
fit_final <- glm.nb(events ~ treatment + offset(log(tte)), data = final_data)
summary(fit_final)
#> 
#> Call:
#> glm.nb(formula = events ~ treatment + offset(log(tte)), data = final_data, 
#>     init.theta = 1.782870921, link = log)
#> 
#> Coefficients:
#>                       Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)           -2.58248    0.08222 -31.411   <2e-16 ***
#> treatmentExperimental -0.14688    0.11826  -1.242    0.214    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for Negative Binomial(1.7829) family taken to be 1)
#> 
#>     Null deviance: 364.75  on 330  degrees of freedom
#> Residual deviance: 363.20  on 329  degrees of freedom
#> AIC: 1166
#> 
#> Number of Fisher Scoring iterations: 1
#> 
#> 
#>               Theta:  1.783 
#>           Std. Err.:  0.316 
#> 
#>  2 x log-likelihood:  -1159.959

# Calculate final information
var_beta <- vcov(fit_final)[2, 2]
final_info <- 1 / var_beta
cat("Final Information:", final_info, "\n")
#> Final Information: 71.50829
cat("Target Information:", ssr_res$target_info, "\n")
#> Target Information: 126.9611
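
Because the achieved information falls well short of the target, power against the planned rate ratio of 0.75 is reduced. A quick approximation that ignores the interim look:

# Approximate power at the achieved information for RR = 0.75 (ignoring the
# interim look); roughly 0.68 here, well below the planned 90%
pnorm(sqrt(final_info) * abs(log(lambda2_plan / lambda1_plan)) - qnorm(1 - alpha_plan))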

Updating bounds

If the final information differs from the target (here it is lower), we need to adjust the critical values so that exactly \(\alpha\) is spent. We use the gsDesign package with the usTime argument (user-specified spending time) to update the bounds. For illustration, the update below uses Lan-DeMets O'Brien-Fleming spending (sfLDOF) with an efficacy bound only (test.type = 1), rather than the Hwang-Shih-DeCani spending and non-binding futility bound of the original plan.

# Information fraction at final analysis
# If final_info < target_info, fraction < 1. 
# However, for the final analysis, we typically want to spend all remaining alpha.
# We can treat the current info as the "maximum" info for the spending function.

# Example group sequential design (O'Brien-Fleming spending)
# We assume the interim was at the observed information fraction
info_fractions <- c(ssr_res$info_fraction, final_info / ssr_res$target_info)

# If final fraction < 1, we can force it to 1 to spend all alpha, 
# effectively accepting the lower power.
# Or we can use the actual information fractions in a spending function approach.

# Let's assume we want to spend all alpha at the final analysis regardless of info.
# We set the final timing to 1.
info_fractions_spending <- c(ssr_res$info_fraction, 1)

# Update bounds
gs_update <- gsDesign(
  k = 2,
  test.type = 1,
  alpha = alpha_plan,
  sfu = sfLDOF,
  usTime = info_fractions_spending,
  n.I = c(ssr_res$unblinded_info, final_info)
)

gsBoundSummary(gs_update,
    deltaname = "RR",
    logdelta = TRUE,
    Nname = "Information",
    timename = "Month",
    digits = 4,
    ddigits = 2) |> gt() |>
  tab_header(
    title = "Updated Group Sequential Design Bounds",
    subtitle = paste0(
      "Final Information = ", round(final_info, 2)
    )
  )
Updated Group Sequential Design Bounds
Final Information = 71.51

Analysis              Value                Efficacy
IA 1: 17%             Z                      7.1773
  Information: 12.06  p (1-sided)            0.0000
                      ~RR at bound           1.8918
                      P(Cross) if RR=1       0.0000
                      P(Cross) if RR=2.72    1.0000
Final                 Z                      1.9600
  Information: 71.51  p (1-sided)            0.0250
                      ~RR at bound           1.0741
                      P(Cross) if RR=1       0.0250
                      P(Cross) if RR=2.72    1.0000

# Test statistic
# gsDesign assumes Z > 0 favors efficacy, while the model estimates
# log(RR) < 0 under a treatment benefit, so we flip the sign:
# Z = -log(RR) / SE
z_stat <- -coef(fit_final)["treatmentExperimental"] / sqrt(var_beta)

cat("Z-statistic:", z_stat, "\n")
#> Z-statistic: 1.24208
cat("Final Bound:", gs_update$upper$bound[2], "\n")
#> Final Bound: 1.959964

if (z_stat > gs_update$upper$bound[2]) {
  cat("Reject Null Hypothesis\n")
} else {
  cat("Fail to Reject Null Hypothesis\n")
}
#> Fail to Reject Null Hypothesis
