5 Fixed design
We will briefly review fixed designs based on the weighted logrank test and the MaxCombo test.
For fixed designs, npsurvSS is the primary tool used in this training.
The fixed design part largely follows the concepts described in Yung and Liu (2019).
5.1 Summary of assumptions
For simplicity, we make a few key assumptions.
- Balanced design (1:1 randomization ratio).
- 1-sided test.
- Local alternative: variance under null and alternative are approximately equal.
- Accrual distribution:
  - Piecewise uniform using npsurvSS::create_arm()
  - Poisson process with piecewise uniform enrollment using gsDesign::nSurv() (Lachin and Foulkes 1986)
  - Piecewise uniform using gsdmvn::gs_design_ahr()
- Survival distribution: piecewise exponential
- Loss to follow-up: exponential
- No stratification
- No cure fraction
Some of these assumptions have been generalized in the literature and in the R package gsdmvn. These assumptions are used throughout unless otherwise noted in specific sections.
5.2 Notation
We also define some commonly used notation below.
- \(\alpha\): Type I error
- \(\beta\): Type II error; power is \(1 - \beta\)
- \(z_\alpha\): upper \(\alpha\) percentile of standard normal distribution
- \(z_\beta\): upper \(\beta\) percentile of standard normal distribution
For illustration purposes, we consider a 1-sided test with Type I error at \(\alpha=0.025\) and \(1-\beta=80\%\) power. In R, it is easy to calculate \(z_\alpha\) and \(z_\beta\) as below.
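# One-sided alpha = 0.025 and power = 80% (beta = 0.2)
z_alpha <- qnorm(1 - 0.025)
z_beta <- qnorm(1 - 0.2)
c(z_alpha, z_beta)
#> [1] 1.9599640 0.8416212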
5.3 Sample size calculation
To calculate the sample size, a key step is to define the effect size. For example, the effect size in a two-sample t-test is \((\mu_1 - \mu_2)/\sigma\), where \(\mu_1\) and \(\mu_2\) are the group means and \(\sigma\) is the pooled standard deviation.
- \(\theta\): effect size.
- \(n\): total sample size.
- \(Z\): test statistic, which is asymptotically normal.
  - Under the null hypothesis: \(Z \sim \mathcal{N}(0, \sigma_0^2)\)
  - Under the alternative hypothesis: \(Z \sim \mathcal{N}(\sqrt{n}\theta, \sigma_1^2)\)
By assuming a local alternative, we have
\[\sigma_0^2 \approx \sigma_1^2 = \sigma^2\]
In this simplified case, the sample size can be calculated as
\[ n = \frac{4 (z_{\alpha}+z_{\beta})^{2}}{\theta^2} \]
Here \(\theta\) is the standardized treatment effect.
5.4 Two-sample t-test
Let’s revisit the two-sample t-test to make a connection between the math formula and R code.
In the two-sample t-test, we have
\[\theta = \frac{\Delta}{\sigma}\]
where \(\Delta\) is the difference in means between the two groups and \(\sigma\) is the pooled standard deviation.
Consider a scenario with a treatment effect of 0.5 and a pooled standard deviation of 2. Let’s calculate the sample size using the formula above.
# Effect size
theta <- 0.5 / 2
# Sample size formula
4 * (z_alpha + z_beta)^2 / theta^2
#> [1] 502.3283
The same assumption is used in gsDesign::nNormal().
gsDesign::nNormal(delta1 = 0.5, sd = 2, alpha = 0.025, beta = 0.2)
#> [1] 502.3283
stats::power.t.test() uses the t-distribution for the test statistic, which is recommended in practice. It provides a slightly larger sample size under the same study design scenario.
stats::power.t.test(delta = 0.5, sd = 2, sig.level = 0.05, power = 0.8)$n * 2
#> [1] 504.2562
5.5 Logrank test
Let’s also explore the number of events required for the logrank test under the proportional hazards assumption. It is well known that the sample size calculation is “event-driven” in this special case.
This is a nice feature of the proportional hazards assumption, because the effect size for the number of events depends only on the hazard ratio.
Notation: \(\text{HR}\) is the hazard ratio. \(d\) is the number of events.
The effect size is: \[\theta = \log{\text{HR}} / 2\]
As an exercise, the readers can derive the effect size following Section 9.5 of the NCSU ST520 course notes. The effect size is in formula (9.3) of the course notes.
With the effect size, we can use the same formula to calculate the number of events (\(d\)) as below.
\[ d = \frac{(z_{\alpha}+z_{\beta})^{2}}{(\log{\text{HR}}/2)^2} = \frac{4 (z_{\alpha}+z_{\beta})^{2}}{(\log{\text{HR}})^2} \]
After the total number of events is defined, the sample size (\(n\)) can be determined based on the accrual distribution, survival distribution, loss to follow-up distribution, and study duration.
The sample size calculation has been implemented in npsurvSS and gsDesign::nSurv(). For simplicity, we skip the details of the sample size calculation discussed in Lachin and Foulkes (1986).
Let’s make the connection between the math formula and R code by considering a scenario with a hazard ratio of 0.6.
# Effect size
theta <- log(0.6) / 2
# Number of Events
(z_alpha + z_beta)^2 / theta^2
#> [1] 120.3157
We compare the results using npsurvSS. The key argument in npsurvSS::create_arm() is surv_scale, which defines the hazard rate in each arm.
# Define study design assumption for each arm
arm0 <- npsurvSS::create_arm(
size = 1, accr_time = 6, surv_scale = 1,
loss_scale = 0.1, follow_time = 12
)
arm1 <- npsurvSS::create_arm(
size = 1, accr_time = 6, surv_scale = 0.6,
loss_scale = 0.1, follow_time = 12
)
Then we can use npsurvSS::size_two_arm()
to calculate number of events and sample size.
# Sample size for logrank test
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(
test = "weighted logrank", weight = "1",
mean.approx = "event driven"
)
)
#> n0 n1 n d0 d1 d
#> 68.12167 68.12167 136.24335 61.92878 58.38693 120.31570
The number of events from gsDesign::nSurv() is slightly smaller because gsDesign::nSurv() follows Lachin and Foulkes (1986), which relaxes the local alternative assumption and is recommended in practice.
gsDesign::nSurv(
lambdaC = 1,
hr = 0.6,
eta = 0.1,
alpha = 0.025,
beta = 0.2,
T = 18,
minfup = 12
)
#> Fixed design, two-arm trial with time-to-event
#> outcome (Lachin and Foulkes, 1986).
#> Solving for: Accrual rate
#> Hazard ratio H1/H0=0.6/1
#> Study duration: T=18
#> Accrual duration: 6
#> Min. end-of-study follow-up: minfup=12
#> Expected events (total, H1): 119.7983
#> Expected sample size (total): 135.6574
#> Accrual rates:
#> Stratum 1
#> 0-6 22.6096
#> Control event rates (H1):
#> Stratum 1
#> 0-Inf 1
#> Censoring rates:
#> Stratum 1
#> 0-Inf 0.1
#> Power: 100*(1-beta)=80%
#> Type I error (1-sided): 100*alpha=2.5%
#> Equal randomization: ratio=1
5.6 Non-proportional hazards
Under the proportional hazards assumption, we commonly use an event-driven approach for sample size calculation (i.e., calculate the number of events, then derive the sample size).
- Event (\(d\)) to sample size (\(n\)), or d->n
Under non-proportional hazards, the event-driven approach is not applicable. We need to derive the sample size first.
- Sample size (\(n\)) to event (\(d\)), or n->d
- \(\tau_a\): accrual time
- \(\tau_f\): follow-up time
The two figures above are copied from Yung and Liu (2019).
Let’s consider a delayed effect scenario below to illustrate NPH and sample size calculation.
- Duration of enrollment: 12 months
- Enrollment rate: 500/12 per month
enrollRates <- tibble::tibble(Stratum = "All", duration = 12, rate = 500 / 12)
enrollRates
#> # A tibble: 1 × 3
#> Stratum duration rate
#> <chr> <dbl> <dbl>
#> 1 All 12 41.7
- Failure rate in control group: log(2) / 15
  - Median survival time: 15 months.
- Hazard ratio:
- First 4 months: 1
- After 4 months: 0.6
- Dropout rate: 0.001 per month
failRates <- tibble::tibble(
Stratum = "All",
duration = c(4, 100),
failRate = log(2) / 15, # Median survival 15 months
hr = c(1, .6), # Delay effect after 4 months
dropoutRate = 0.001
)
failRates
#> # A tibble: 2 × 5
#> Stratum duration failRate hr dropoutRate
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 All 4 0.0462 1 0.001
#> 2 All 100 0.0462 0.6 0.001
The figure below illustrates the survival probability over time in two treatment groups.
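A minimal sketch of these curves, drawn directly from the piecewise exponential assumption above (the plotting code here is illustrative and not part of the design calculation):
# Control arm: exponential with hazard log(2) / 15
# Experimental arm: HR = 1 for the first 4 months and HR = 0.6 afterwards
t <- seq(0, 36, 0.1)
lambda0 <- log(2) / 15
s0 <- exp(-lambda0 * t)
s1 <- exp(-(lambda0 * pmin(t, 4) + 0.6 * lambda0 * pmax(t - 4, 0)))
plot(t, s0,
  type = "l",
  xlab = "Month", ylab = "Survival Probability", ylim = c(0, 1)
)
lines(t, s1, lty = 2)
legend("topright", lty = c(1, 2), legend = c("control", "experiment"))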
5.7 Weighted logrank test
For the weighted logrank test, we first illustrate it by using the Fleming-Harrington weight.
\[FH^{\rho, \gamma}(t) = S(t)^\rho(1 - S(t))^\gamma\]
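As a quick illustration of how the weight behaves, fh_weight below is a small helper written only for this illustration (it is not a package function):
fh_weight <- function(s, rho, gamma) s^rho * (1 - s)^gamma
# FH(0, 1): weight grows as the survival probability S(t) decreases (later times)
fh_weight(s = c(0.9, 0.5, 0.2), rho = 0, gamma = 1)
#> [1] 0.1 0.5 0.8
# FH(1, 0): weight is largest when S(t) is high (earlier times)
fh_weight(s = c(0.9, 0.5, 0.2), rho = 1, gamma = 0)
#> [1] 0.9 0.5 0.2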
To demonstrate the weight function, we convert the inputs enrollRates and failRates into the format required by npsurvSS.
# Define study design object in each arm
gs_arm <- gsdmvn:::gs_create_arm(
enrollRates,
failRates,
ratio = 1, # Randomization ratio
total_time = 36 # Total study duration
)
arm0 <- gs_arm[["arm0"]]
arm1 <- gs_arm[["arm1"]]
Note: in npsurvSS, the weight name FH_p1_q0 corresponds to \(\rho=q=0\) and \(\gamma=p=1\); that is, \(p\) plays the role of \(\gamma\) and \(q\) the role of \(\rho\).
- FH(0,1): Place more weights on later time points
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(test = "weighted logrank", weight = "FH_p1_q0")
)
#> n0 n1 n d0 d1 d
#> 138.39353 138.39353 276.78707 102.15617 81.23792 183.39408
- FH(1,1): Place more weights on middle time points
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(test = "weighted logrank", weight = "FH_p1_q1")
)
#> n0 n1 n d0 d1 d
#> 130.75651 130.75651 261.51302 96.51884 76.75493 173.27377
- FH(1,0): Place more weights on earlier time points
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(test = "weighted logrank", weight = "FH_p0_q1")
)
#> n0 n1 n d0 d1 d
#> 237.5982 237.5982 475.1964 175.3848 139.4717 314.8565
- FH(0,0) or logrank test
# Sample size for logrank test
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(test = "weighted logrank", weight = "FH_p0_q0")
)
#> n0 n1 n d0 d1 d
#> 164.97626 164.97626 329.95252 121.77839 96.84215 218.62055
5.7.1 Effect size
In Yung and Liu (2019), Section 2.3.3, it is shown that the test statistic \(Z\rightarrow_d\mathcal{N}(\sqrt{n}\theta, 1)\) approximately, where \(\theta = \Delta/\sigma\) is the effect size. Here the test statistic for the weighted logrank test is:
\[ Z=\sqrt{\frac{n_{0}+n_{1}}{n_{0}n_{1}}}\int_{0}^{\tau}w(t)\frac{\overline{Y}_{0}(t)\overline{Y}_{1}(t)}{\overline{Y}_{0}(t)+\overline{Y}_{1}(t)}\left\{ \frac{d\overline{N}_{1}(t)}{\overline{Y}_{1}(t)}-\frac{d\overline{N}_{0}(t)}{\overline{Y}_{0}(t)}\right\} \]
- Weight \(w(t)\): implemented in gsdmvn.
- Integration up to the total study duration \(\tau\).
weight <- function(x, arm0, arm1) {
gsdmvn::wlr_weight_fh(x, arm0, arm1, rho = 0, gamma = 1)
}
Here \(\Delta\) is a weighted difference of the hazard functions:
\[\Delta=\int_{0}^{\tau}w(s)\frac{p_{0}\pi_{0}(s)p_{1}\pi_{1}(s)}{\pi(s)}\{\lambda_{1}(s)-\lambda_{0}(s)\}ds\]
delta <- abs(gsdmvn:::gs_delta_wlr(
arm0, arm1,
tmax = arm0$total_time,
weight = weight
))
delta
#> [1] 0.02623776
and \(\sigma^2\) is the corresponding variance:
\[\sigma^{2}=\int_{0}^{\tau}w(s)^{2}\frac{p_{0}\pi_{0}(s)p_{1}\pi_{1}(s)}{\pi(s)^{2}}dv(s)\]
sigma2 <- gsdmvn:::gs_sigma2_wlr(
arm0, arm1,
tmax = arm0$total_time,
weight = weight
)
sigma2
#> [1] 0.0242674
Below are definitions of each component.
- \(n = n_0 + n_1\): Total sample size
- \(p_0 = n_0/n\), \(p_1 = n_1/n\): Randomization probability inferred by randomization ratio
- \(\pi_0(t) = E\{N_0(t)\}\), \(\pi_1(t) = E\{N_1(t)\}\): Probability of events in each arm
- \(\pi(t) = p_0\pi_0(t)+p_1\pi_1(t)\): Pooled probability of events
- \(v(t) = p_0E\{Y_0(t)\}+p_1E\{Y_1(t)\}\): Pooled probability of being at risk
5.7.2 Sample size and number of events
We can calculate sample size after deriving the effect size.
\[ n = \frac{\sigma^{2}(z_{\alpha}+z_{\beta})^{2}}{\Delta^2} = \frac{(z_{\alpha}+z_{\beta})^{2}}{\theta^2} \]
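As a check, plugging the delta and sigma2 computed in Section 5.7.1 (which used the FH(0,1) weight) into this formula, together with the z_alpha and z_beta defined earlier, approximately reproduces the total sample size reported below:
# Total sample size implied by the FH(0,1) effect size
sigma2 * (z_alpha + z_beta)^2 / delta^2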
# Sample size for FH(0,1)
npsurvSS::size_two_arm(arm0, arm1,
power = 0.8, alpha = 0.025,
test = list(test = "weighted logrank", weight = "FH_p1_q0")
)
#> n0 n1 n d0 d1 d
#> 138.39353 138.39353 276.78707 102.15617 81.23792 183.39408
The number of events can also be calculated as below. More details can be found in the technical details.
n <- 276.78707 # Total sample size for FH(0,1) from npsurvSS::size_two_arm() above
n0 <- n1 <- n / 2
gsdmvn:::prob_event.arm(arm0, tmax = arm0$total_time) * n0 +
  gsdmvn:::prob_event.arm(arm1, tmax = arm0$total_time) * n1
#> [1] 183.323
5.8 Technical details
5.8.1 Accrual and follow-up time
- \(R\): time of study entry
- \(F_R(\cdot)\): CDF of \(R\)
x <- seq(0, arm0$total_time, 0.1)
# Piecewise Uniform Distribution
plot(x, npsurvSS::daccr(x, arm = arm0),
type = "s",
xlab = "Calendar time (month)",
ylab = "Accrual Probability"
)
- \(\tau_a\): accrual time
arm0$accr_time
#> [1] 12
- \(\tau_f\): follow-up time
arm0$follow_time
#> [1] 24
- \(\tau = \tau_a + \tau_f\): total study duration
arm0$total_time
#> [1] 36
5.8.2 Event time
- \(T\): time to event from study entry.
- \(S_T(\cdot)\): Survival function
- \(F_T(\cdot)\): CDF of \(T\)
- \(f_T(\cdot)\): Density function
# Survival function of time to event from study entry
# Piecewise Exponential distribution
plot(x, 1 - npsurvSS::psurv(x, arm0),
type = "l",
xlab = "Calendar time (month)",
ylab = "Survival Probability",
ylim = c(0, 1)
)
lines(x, 1 - npsurvSS::psurv(x, arm1), lty = 2)
legend("topright", lty = c(1, 2), legend = c("control", "experiment"))
5.8.3 Censoring time
- \(L\): time to loss of follow-up from study entry
- \(S_L(\cdot)\): Survival function of \(L\)
# Survival function of the time to loss of follow-up from study entry
# Exponential distribution with rate arm0$loss_scale
plot(x, 1 - pexp(x, rate = arm0$loss_scale),
  type = "l",
  xlab = "Calendar time (month)",
  ylab = "Loss to Follow-up Survival Probability"
)
- \(C = \min(L, \tau - R)\): time to censoring from study entry.
5.8.4 Observed time
- \(U = \min(T,C)\): observed time.
- \(\delta = I(T<C)\): event indicator
5.8.5 Expected events
- \(N(t) = I(U \le t, \delta = 1)\): counting process
- \(E\{N(t)\}\): expected probability of events
\[E\{N(t)\} = \int_{0}^{t}f_{T}(s)S_{L}(s)F_{R}(\tau_{a}\land(\tau-s))ds\]
# Probability of Events
x_int <- 0:arm0$total_time
plot(x_int, gsdmvn:::prob_event.arm(arm0, tmax = x_int),
type = "l",
xlab = "Calendar time (month)",
ylab = "Event Probability",
ylim = c(0, 1)
)
- \(Y(t) = I(U\ge t)\): at-risk process
- \(E\{Y(t)\}\): expected probability at risk
\[E\{Y(t)\} = S_{T}(t)S_{L}(t)F_{R}(\tau_{a}\land(\tau-t))\]
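As a sketch, \(E\{Y(t)\}\) for the control arm can be evaluated directly from this formula, assuming the single-piece uniform accrual over 12 months used in this example and the exponential dropout rate stored in arm0$loss_scale:
# Probability of being at risk over calendar time (control arm)
s_t <- 1 - npsurvSS::psurv(x_int, arm0) # S_T(t)
s_l <- exp(-arm0$loss_scale * x_int) # S_L(t)
f_r <- pmin(arm0$total_time - x_int, arm0$accr_time) / arm0$accr_time # F_R(min(tau_a, tau - t))
plot(x_int, s_t * s_l * f_r,
  type = "l",
  xlab = "Calendar time (month)",
  ylab = "At-risk Probability",
  ylim = c(0, 1)
)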
5.8.6 Weight function in weighted logrank test
- Define different weight functions in gsdmvn
- Weight function: \(w(t)\)
- Constant weight (logrank test): \(1\)
gsdmvn::wlr_weight_1(x = c(12, 24, 36), arm0, arm1)
#> [1] 1
- Fleming-Harrington weight: \(w(t) = FH^{\rho, \gamma}(t) = S(t)^\rho(1 - S(t))^\gamma\)
weight_legend <- apply(
expand.grid(
c("rho=0", "rho=1"),
c("gamma=0", "gamma=1")
),
1,
paste0,
collapse = "; "
)
plot(
1:36,
gsdmvn::wlr_weight_fh(x = 1:36, arm0, arm1, rho = 0, gamma = 0),
xlab = "Calendar time (month)", col = 1,
ylab = "Weight", type = "l", ylim = c(-0.5, 1.2)
)
lines(
1:36,
gsdmvn::wlr_weight_fh(x = 1:36, arm0, arm1, rho = 1, gamma = 0),
col = 2
)
lines(
1:36,
gsdmvn::wlr_weight_fh(x = 1:36, arm0, arm1, rho = 0, gamma = 1),
col = 3
)
lines(
1:36,
gsdmvn::wlr_weight_fh(x = 1:36, arm0, arm1, rho = 1, gamma = 1),
col = 4
)
legend("bottomright", legend = weight_legend, lty = 1, col = 1:4)
- Modestly weighted logrank test: \(w(t) = S^{-1}(t\land\tau_m)\) (Magirr and Burman 2019).
  - Choose \(\tau_m\) around the change point in the delayed effect scenario.
  - Only down-weights early events; the weight is constant after the change point.
plot(
1:36,
gsdmvn::wlr_weight_fh(
x = 1:36, arm0, arm1, rho = -1, gamma = 0, tau = 4
),
xlab = "Calendar time (month)",
ylab = "Weight",
type = "l"
)
5.8.7 Average hazard ratio
- \(\Delta\) is a weighted average of hazard function difference \(\lambda_{1}(\cdot)-\lambda_{0}(\cdot)\).
\[\Delta=\int_{0}^{\tau}w(s)\frac{p_{0}\pi_{0}(s)p_{1}\pi_{1}(s)}{\pi(s)}\{\lambda_{1}(s)-\lambda_{0}(s)\}ds\]
- Taylor expansion builds a connection between \(\Delta\) and weighted average of log hazard ratio (AHR). It is a generalization of the Schoenfeld (1981) asymptotic expansion.
\[\Delta\approx\int_{0}^{\tau}w(s)\log\left(\frac{\lambda_1(s)}{\lambda_0(s)}\right)\frac{p_{0}\pi_{0}(s)p_{1}\pi_{1}(s)}{\pi(s)^2}v'(s)ds\]
- Log of AHR can be estimated after normalizing the weights in \(\Delta\)
\[ \log{AHR} = \frac{\Delta}{\int_{0}^{\tau}w(s)\frac{p_{0}\pi_{0}(s)p_{1}\pi_{1}(s)}{\pi(s)^2}v'(s)ds} \]
t <- 1:36
log_ahr <- sapply(t, function(t) {
gsdmvn:::gs_delta_wlr(arm0, arm1, tmax = t, weight = weight) /
gsdmvn:::gs_delta_wlr(arm0, arm1,
tmax = t, weight = weight,
approx = "generalized schoenfeld", normalization = TRUE
)
})
plot(t, exp(log_ahr),
ylim = c(0.6, 1),
xlab = "Calendar time (month)",
ylab = "Average Hazard Ratio",
type = "l"
)
- Note the AHR depends on the choice of weight function \(w(\cdot)\).
- Under the logrank test with a piecewise exponential distribution and 1:1 randomization ratio, it can be simplified to
\[\log(AHR) = \frac{\sum d_i \log{HR_i}}{\sum d_i}\]
gsDesign2::AHR(enrollRates, failRates, totalDuration = arm0$total_time)
#> # A tibble: 1 × 5
#> Time AHR Events info info0
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 36 0.683 331. 81.4 82.8
log_ahr <- gsdmvn:::gs_delta_wlr(
arm0, arm1,
tmax = arm0$total_time,
weight = gsdmvn:::wlr_weight_1
) /
gsdmvn:::gs_delta_wlr(
arm0, arm1,
tmax = arm0$total_time,
weight = gsdmvn:::wlr_weight_1,
approx = "generalized schoenfeld",
normalization = TRUE
)
exp(log_ahr)
#> [1] 0.6831735
This value is also computed by gsDesign2::AHR() above.
5.9 MaxCombo test
The MaxCombo test statistic is the maximum of multiple WLR test statistics using different weight functions. We focus on the Fleming-Harrington weight function \(FH(\rho, \gamma)\) defined in gsdmvn::wlr_weight_fh().
To calculate the sample size for the MaxCombo test, we will need to know the variance-covariance of multiple WLR test statistics.
For the Fleming-Harrington weight function, the covariance between different WLR test statistics can be calculated. Based on Karrison (2016) and Wang, Luo, and Zheng (2019), the covariance of two test statistics \(Z_1 = Z(\rho_1, \gamma_1)\) and \(Z_2 = Z(\rho_2, \gamma_2)\) is
\[\text{Cov}(Z_1, Z_2) = \text{Var}\left(Z\left(\frac{\rho_1 + \rho_2}{2}, \frac{\gamma_1 + \gamma_2}{2}\right)\right)\]
- We illustrate the idea based on \(FH(0, 0.5)\) and \(FH(0.5, 0.5)\)
- All weight functions needed to build the variance-covariance matrix; the off-diagonal entries use the averaged \(\rho\) and \(\gamma\)
weight_combo <- list(
function(x, arm0, arm1) {
gsdmvn::wlr_weight_fh(
x, arm0, arm1,
rho = 0, gamma = 0.5
)
},
function(x, arm0, arm1) {
gsdmvn::wlr_weight_fh(
x, arm0, arm1,
rho = 0.25, gamma = 0.5
)
},
function(x, arm0, arm1) {
gsdmvn::wlr_weight_fh(
x, arm0, arm1,
rho = 0.25, gamma = 0.5
)
},
function(x, arm0, arm1) {
gsdmvn::wlr_weight_fh(
x, arm0, arm1,
rho = 0.5, gamma = 0.5
)
}
)
- \(\Delta\)
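A sketch of the computation for the two Fleming-Harrington weights, reusing gsdmvn:::gs_delta_wlr() from Section 5.7.1 (the exact implementation here is an assumption):
# Delta for FH(0, 0.5) and FH(0.5, 0.5)
delta_combo <- sapply(weight_combo[c(1, 4)], function(w) {
  gsdmvn:::gs_delta_wlr(arm0, arm1, tmax = arm0$total_time, weight = w)
})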
- Covariance
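Similarly, a sketch of the 2 x 2 variance-covariance matrix using gsdmvn:::gs_sigma2_wlr(); the off-diagonal entries come from the averaged \(\rho\) and \(\gamma\) weights in weight_combo:
sigma2_combo <- matrix(
  sapply(weight_combo, function(w) {
    gsdmvn:::gs_sigma2_wlr(arm0, arm1, tmax = arm0$total_time, weight = w)
  }),
  nrow = 2
)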
- Correlation
corr_combo <- cov2cor(sigma2_combo)
corr_combo
#> [,1] [,2]
#> [1,] 1.000000 0.989493
#> [2,] 0.989493 1.000000
5.9.1 Type I error and power
The formula for calculating sample size based on effect size does not directly apply here, because the MaxCombo test statistic, being the maximum of correlated normal statistics, is not normally distributed.
The Type I error is:
\[ \alpha = \text{Pr}(\max(Z_1, Z_2) > z_{\alpha} \, | \, H_0) = 1 - \text{Pr}(Z_1 < z_{\alpha}, Z_2 < z_{\alpha} \, | \, H_0) \]
z_alpha <- mvtnorm::qmvnorm(p = 1 - 0.025, corr = corr_combo)
z_alpha$quantile
#> [1] 2.014555
The power is
\[ 1 - \beta = \text{Pr}(\max(Z_1, Z_2) > z_{\alpha} \, | \, H_1) = 1 - \text{Pr}(Z_1 < z_{\alpha}, Z_2 < z_{\alpha} \, | \, H_1) \]
#' @param n Sample size
#' @param power Target power. Use power=0 to calculate the actual power.
power_combo <- function(n, power = 0) {
theta <- abs(delta_combo) / sqrt(diag(sigma2_combo))
power_combo <- 1 - mvtnorm::pmvnorm(
upper = z_alpha$quantile,
mean = sqrt(n) * theta, corr = corr_combo
)
as.numeric(power_combo) - power
}
power_combo(n = 150)
#> [1] 0.5493368
5.9.2 Sample size
To calculate the sample size, we can solve the power function for a given Type I error and power, as shown below.
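One way to do this is to find the root of power_combo() at the target power with uniroot(); the search interval below is an assumption for this example:
# Solve power_combo(n, power = 0.8) = 0 for n
n_combo <- uniroot(power_combo, interval = c(1, 1000), power = 0.8)$root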
c(z = z_alpha$quantile, n = n_combo, power = power_combo(n_combo))
#> z n power
#> 2.014555 271.045320 0.800000
5.9.3 Number of events
The required number of events can then be calculated in the same way as for the weighted logrank test.
n0 <- n1 <- n_combo / 2
gsdmvn:::prob_event.arm(arm0, tmax = arm0$total_time) * n0 +
gsdmvn:::prob_event.arm(arm1, tmax = arm0$total_time) * n1
#> [1] 179.5897