Overview

The acronym NPES is short for non-proportional effect size. While it is motivated primarily by its use in designing a time-to-event trial under non-proportional hazards (NPH), we have simplified and generalized the concept here. The model is likely to be useful for rank-based survival tests beyond the logrank test, such as those considered by Tsiatis (1982). It could also be useful in other situations where the treatment effect may vary over time in a trial for some reason. We generalize the framework of Chapter 2 of Proschan, Lan, and Wittes (2006) to incorporate the possibility of the treatment effect changing during the course of a trial in some systematic way. This vignette addresses distribution theory and initial technical issues around computing

  • boundary crossing probabilities
  • bounds satisfying targeted boundary crossing probabilities

This is then applied to generalize computational algorithms provided in Chapter 19 of Jennison and Turnbull (2000) that are used to compute boundary crossing probabilities as well as boundaries for group sequential designs. Additional specifics around boundary computation, power and sample size are provided in a separate vignette.

The probability model

The continuous model and E-process

We consider a simple example here to motivate distribution theory that is quite general and applies across many situations. For instance, Proschan, Lan, and Wittes (2006) immediately suggest paired observations, time-to-event and binary outcomes as endpoints where the theory is applicable.

We assume for a given integer \(N>0\) that \(X_{i}\), \(i=1,2,\ldots\), are independent. For some integer \(K\le N\) we assume we will perform \(K\) analyses after \(0<n_1<n_2<\cdots<n_K = N\) observations are available for analysis. Note that the general framework does not require confining \(n_k\le N\), but here \(N\) can be considered the final planned sample size. Proschan, Lan, and Wittes (2006) refer to the estimation process, or E-process, which we extend here to

\[\hat{\theta}_k = \frac{\sum_{i=1}^{n_k} X_{i}}{n_k}\equiv \bar X_{k}.\] While Proschan, Lan, and Wittes (2006) use \(\delta\) where our notation has \(\theta\), we stick more closely to the notation of Jennison and Turnbull (2000), where \(\theta\) is used. For our example, we see that \(\hat{\theta}_k\equiv\bar X_k\) represents the sample average at analysis \(k\), \(1\le k\le K\). With a survival endpoint, \(\hat\theta_k\) would typically represent a Cox model coefficient representing the logarithm of the hazard ratio for experimental vs control treatment, and \(n_k\) would represent the planned number of events at analysis \(k\), \(1\le k\le K.\) Denoting \(t_k=n_k/N\), we assume that for some real-valued function \(\theta(t)\) for \(t\ge 0\) we have for \(1\le k\le K\)

\[E(\hat{\theta}_k) =\theta(t_k) =E(\bar X_k).\] In the models of Proschan, Lan, and Wittes (2006) and Jennison and Turnbull (2000) we would have \(\theta(t)\) equal to some constant \(\theta\). We assume further that for \(i=1,2,\ldots\) \[\hbox{Var}(X_{i})=1.\] Under this assumption, the variance of the sample average for \(1\le k\le K\) is

\[\hbox{Var}(\hat\theta(t_k))=\hbox{Var}(\bar X_k) = 1/ n_k.\] The statistical information for the estimate \(\hat\theta(t_k)\) for \(1\le k\le K\) for this case is \[ \mathcal{I}_k \equiv \frac{1}{\hbox{Var}(\hat\theta(t_k))} = n_k.\] We now see that \(t_k\), \(1\le k\le K\) is the so-called information fraction at analysis \(k\) in that \(t_k=\mathcal{I}_k/\mathcal{I}_K.\)

Z-process

The Z-process is commonly used (e.g., Jennison and Turnbull (2000)) and will be used below to extend the computational algorithm in Chapter 19 of Jennison and Turnbull (2000). For \(k=1,\ldots,K\) it is defined by the following equivalent expressions:

\[Z_{k} = \frac{\hat\theta_k}{\sqrt{\hbox{Var}(\hat\theta_k)}}= \sqrt{\mathcal{I}_k}\hat\theta_k= \sqrt{n_k}\bar X_k.\]

The variance for \(1\le k\le K\) is \[\hbox{Var}(Z_k) = 1\] and the expected value is

\[E(Z_{k})= \sqrt{\mathcal{I}_k}\theta(t_{k})= \sqrt{n_k}E(\bar X_k) .\]

B-process

B-values are mnemonic for Brownian motion. For \(1\le k\le K\) we define \[B_{k}=\sqrt{t_k}Z_k\] which implies \[ E(B_{k}) = \sqrt{t_{k}\mathcal{I}_k}\theta(t_k) = t_k \sqrt{\mathcal{I}_K} \theta(t_k) = \mathcal{I}_k\theta(t_k)/\sqrt{\mathcal{I}_K}\] and \[\hbox{Var}(B_k) = t_k.\]

For our example, we have

\[B_k=\frac{1}{\sqrt N}\sum_{i=1}^{n_k}X_i.\] It can be useful to think of \(B_k\) as a sum of independent random variables.

Summary of E-, Z- and B-processes

| Statistic | Example | Expected value | Variance |
|-----------|---------|----------------|----------|
| \(\hat\theta_k\) | \(\bar X_k\) | \(\theta(t_k)\) | \(\mathcal{I}_k^{-1}\) |
| \(Z_k=\sqrt{\mathcal{I}_k}\hat\theta_k\) | \(\sqrt{n_k}\bar X_k\) | \(\sqrt{\mathcal{I}_k}\theta(t_k)\) | \(1\) |
| \(B_k=\sqrt{t_k}Z_k\) | \(\sum_{i=1}^{n_k}X_i/\sqrt N\) | \(t_k\sqrt{\mathcal{I}_K}\theta(t_k)=\mathcal{I}_k\theta(t_k)/\sqrt{\mathcal{I}_K}\) | \(t_k\) |
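
To illustrate the table above, the following minimal R sketch (our own example, assuming a hypothetical constant effect \(\theta(t)\equiv 0.3\) and \(K=4\) equally spaced analyses) simulates the running example and computes the E-, Z- and B-processes:

```r
# Minimal simulation sketch: iid X_i with Var(X_i) = 1 and constant mean 0.3
# (hypothetical values); E-, Z- and B-processes at K = 4 analyses.
set.seed(123)
N <- 400                      # final planned sample size
n_k <- c(100, 200, 300, 400)  # observations available at analyses k = 1, ..., K
t_k <- n_k / N                # information fraction t_k = I_k / I_K

x <- rnorm(N, mean = 0.3)     # X_i with Var(X_i) = 1

theta_hat <- cumsum(x)[n_k] / n_k  # E-process: sample mean at analysis k
z_k <- sqrt(n_k) * theta_hat       # Z-process: Var(Z_k) = 1
b_k <- sqrt(t_k) * z_k             # B-process: Var(B_k) = t_k

# B_k is also the scaled cumulative sum of independent random variables:
all.equal(b_k, cumsum(x)[n_k] / sqrt(N))
```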

Conditional independence, covariance and canonical form

We assume independent increments in the B-process, which we demonstrate below for the simple example above. That is, for \(1\le j < k\le K\) \[\tag{1} B_k - B_j \sim \hbox{Normal} (\sqrt{\mathcal{I}_K}(t_k\theta(t_k)- t_j\theta(t_j)), t_k-t_j)\] independent of \(B_1,\ldots,B_j\). As noted above, for \(1\le j\le K\) we have for our example \[B_j=\frac{1}{\sqrt N}\sum_{i=1}^{n_j}X_i.\] Because of the independence of the sequence \(X_i\), \(i=1,2,\ldots\), we immediately have for \(1\le j\le k\le K\) \[\hbox{Cov}(B_j,B_k) = \hbox{Var}(B_j) = t_j.\] This leads further to \[\hbox{Corr}(B_j,B_k)=\frac{t_j}{\sqrt{t_jt_k}}=\sqrt{t_j/t_k}=\sqrt{\mathcal{I}_j/\mathcal{I}_k}=\hbox{Corr}(Z_j,Z_k)=\hbox{Cov}(Z_j,Z_k),\] which is the covariance structure in the so-called canonical form of Jennison and Turnbull (2000). For our example, the increment \[B_k-B_j=\frac{1}{\sqrt N}\sum_{i=n_j + 1}^{n_k}X_i\] depends only on \(X_i\) for \(i>n_j\), so its independence of \(B_1,\ldots,B_j\), and with it the covariance above, is obvious.
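
To make this covariance structure concrete, the following small R sketch (with hypothetical, equally spaced information fractions) builds the canonical correlation matrix:

```r
# Canonical-form correlation matrix Corr(Z_j, Z_k) = sqrt(t_min / t_max)
# for hypothetical information fractions at K = 4 analyses.
t_k <- c(0.25, 0.50, 0.75, 1.00)
corr <- outer(t_k, t_k, function(tj, tk) sqrt(pmin(tj, tk) / pmax(tj, tk)))
round(corr, 3)
```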

Test bounds and crossing probabilities

In this section we define notation for bounds and boundary crossing probabilities for a group sequential design. We also define an algorithm for computing bounds based on a targeted boundary crossing probability at each analysis. The notation will be used elsewhere for defining one- and two-sided group sequential hypothesis testing. A value of \(\theta(t)>0\) will reflect a positive benefit.

For \(k=1,2,\ldots,K-1\), interim cutoffs \(-\infty \le a_k< b_k\le \infty\) are set; final cutoffs \(-\infty \le a_K\leq b_K <\infty\) are also set. An infinite efficacy bound at an analysis means that bound cannot be crossed at that analysis. Thus, \(3K\) parameters define a group sequential design: \(a_k\), \(b_k\), and \(\mathcal{I}_k\), \(k=1,2,\ldots,K\).

Notation for boundary crossing probabilities

We now apply the above distributional assumptions to compute boundary crossing probabilities. We use a shorthand notation in this section to have \(\theta\) represent \(\theta()\) and \(\theta=0\) to represent \(\theta(t)\equiv 0\) for all \(t\). We denote the probability of crossing the upper boundary at analysis \(k\) without previously crossing a bound by

\[\alpha_{k}(\theta)=P_{\theta}(\{Z_{k}\geq b_{k}\}\cap_{j=1}^{k-1}\{a_{j}\le Z_{j}< b_{j}\}),\] \(k=1,2,\ldots,K.\)

Next, we consider analogous notation for the lower bound. For \(k=1,2,\ldots,K\) denote the probability of crossing a lower bound at analysis \(k\) without previously crossing any bound by

\[\beta_{k}(\theta)=P_{\theta}(\{Z_{k}< a_{k}\}\cap_{j=1}^{k-1}\{ a_{j}\le Z_{j}< b_{j}\}).\] For symmetric testing for analysis \(k\) we would have \(a_k= - b_k\), \(\beta_k(0)=\alpha_k(0),\) \(k=1,2,\ldots,K\). The total lower boundary crossing probability for a trial is denoted by \[\beta(\theta)\equiv\sum_{k=1}^{K}\beta_{k}(\theta).\] Note that we can also set \(a_k= -\infty\) for any or all analyses if a lower bound is not desired, \(k=1,2,\ldots,K\). For \(k<K\), we can set \(b_k=\infty\) where an upper bound is not desired. Obviously, for each \(k\), we want either \(a_k>-\infty\) or \(b_k<\infty\).
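
Since \(Z_1,\ldots,Z_K\) are multivariate normal with the canonical correlation structure above, small cases of these probabilities can be cross-checked by direct multivariate normal integration. The sketch below uses the mvtnorm package with hypothetical bounds for \(K=2\) analyses to evaluate \(\alpha_2(0)\); this is a check, not the recursive algorithm developed in the next section:

```r
# alpha_2(0): probability Z_1 stays in [a_1, b_1) and Z_2 >= b_2 under
# theta = 0. Bounds and information fractions are hypothetical.
library(mvtnorm)
t_k <- c(0.5, 1)
a <- c(-1.00, -Inf)  # lower bounds a_k
b <- c(2.80, 1.98)   # upper bounds b_k
corr <- outer(t_k, t_k, function(tj, tk) sqrt(pmin(tj, tk) / pmax(tj, tk)))
alpha_2 <- pmvnorm(lower = c(a[1], b[2]), upper = c(b[1], Inf),
                   mean = c(0, 0), corr = corr)
alpha_2
```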

Recursive algorithms

We now provide a small update to the algorithm of Chapter 19 of Jennison and Turnbull (2000) to perform the numerical integration required to compute the boundary crossing probabilities of the previous section, and also to identify group sequential boundaries satisfying desired characteristics. The key to these calculations is the independent increments property in equation (1) above, which allows building recursive identities that enable simple, efficient numerical integration.

We define

\[g_1(z;\theta) = \frac{d}{dz}P(Z_1\le z) = \phi\left(z - \sqrt{\mathcal{I}_1}\theta(t_1)\right)\tag{2}\]

and for \(k=2,3,\ldots,K\) we recursively define the subdensity function

\[\begin{align} g_k(z; \theta) &= \frac{d}{dz}P_\theta(\{Z_k\le z\}\cap_{j=1}^{k-1}\{a_j\le Z_j<b_j\}) \\ &=\int_{a_{k-1}}^{b_{k-1}}\frac{d}{dz}P_\theta(Z_k\le z \mid Z_{k-1}=z_{k-1})g_{k-1}(z_{k-1}; \theta)dz_{k-1}\\ &=\int_{a_{k-1}}^{b_{k-1}}f_k(z_{k-1},z;\theta)g_{k-1}(z_{k-1}; \theta)dz_{k-1}. \end{align} \] The bottom line notation here is the same as on p. 347 of Jennison and Turnbull (2000). However, \(f_k()\) here takes a slightly different form:

\[\begin{align} f_k(z_{k-1},z;\theta) &=\frac{d}{dz}P_\theta(Z_k\le z \mid Z_{k-1}=z_{k-1})\\ &=\frac{d}{dz}P_\theta(B_k - B_{k-1} \le z\sqrt{t_k}-z_{k-1}\sqrt{t_{k-1}})\\ &=\frac{d}{dz}\Phi\left(\frac{z\sqrt{t_k}-z_{k-1}\sqrt{t_{k-1}}-\sqrt{\mathcal{I}_K}(t_k\theta(t_k)- t_{k-1}\theta(t_{k-1}))}{\sqrt{t_k-t_{k-1}}}\right)\\ &=\frac{\sqrt{t_k}}{\sqrt{t_k-t_{k-1}}}\phi\left(\frac{z\sqrt{t_k}-z_{k-1}\sqrt{t_{k-1}}-\sqrt{\mathcal{I}_K}(t_k\theta(t_k)- t_{k-1}\theta(t_{k-1}))}{\sqrt{t_k-t_{k-1}}}\right)\\ &=\frac{\sqrt{\mathcal{I}_k}}{\sqrt{\mathcal{I}_k-\mathcal{I}_{k-1}}}\phi\left(\frac{z\sqrt{\mathcal{I}_k}-z_{k-1}\sqrt{\mathcal{I}_{k-1}}-(\mathcal{I}_k\theta(t_k)- \mathcal{I}_{k-1}\theta(t_{k-1}))}{\sqrt{\mathcal{I}_k-\mathcal{I}_{k-1}}}\right).\tag{3} \end{align}\]

We have worked towards this last line due to its comparability to equation (19.4) on p. 347 of Jennison and Turnbull (2000) which assumes \(\theta(t_k)=\theta\) for some constant \(\theta\); we re-write that equation slightly here as:

\[f_k(z_{k-1},z;\theta) = \frac{\sqrt{\mathcal{I}_k}}{\sqrt{\mathcal{I}_k-\mathcal{I}_{k-1}}}\phi\left(\frac{z\sqrt{\mathcal{I}_k}-z_{k-1}\sqrt{\mathcal{I}_{k-1}}-\theta(\mathcal{I}_k- \mathcal{I}_{k-1})}{\sqrt{\mathcal{I}_k-\mathcal{I}_{k-1}}}\right).\tag{4}\] This is really the only difference in the computational algorithm for boundary crossing probabilities relative to the Jennison and Turnbull (2000) algorithm.
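
As a quick sanity check with hypothetical argument values, the sketch below verifies numerically that equation (3) reduces to equation (4) when \(\theta(t)\) is constant; the function names are ours:

```r
# f_k from equation (3) (time-varying theta) and equation (4) (constant theta).
f_npe <- function(z_prev, z, i_prev, i_k, th_prev, th_k) {
  sqrt(i_k) / sqrt(i_k - i_prev) *
    dnorm((z * sqrt(i_k) - z_prev * sqrt(i_prev) -
             (i_k * th_k - i_prev * th_prev)) / sqrt(i_k - i_prev))
}
f_const <- function(z_prev, z, i_prev, i_k, th) {
  sqrt(i_k) / sqrt(i_k - i_prev) *
    dnorm((z * sqrt(i_k) - z_prev * sqrt(i_prev) -
             th * (i_k - i_prev)) / sqrt(i_k - i_prev))
}
# With th_prev = th_k the two agree exactly:
all.equal(f_npe(1, 1.5, 40, 80, 0.2, 0.2), f_const(1, 1.5, 40, 80, 0.2))
```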

Using the above recursive approach, we can compute for \(k=1,2,\ldots,K\) \[\alpha_{k}(\theta)=\int_{b_k}^\infty g_k(z;\theta)dz\tag{5}\] and \[\beta_{k}(\theta)=\int_{-\infty}^{a_k} g_k(z;\theta)dz.\tag{6}\]
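
A self-contained sketch of this recursion is below. It uses a simple Riemann-sum grid rather than the quadrature of Section 19.3 of Jennison and Turnbull (2000), and the function name `cross_prob`, the grid range, and the grid size are our own illustrative choices:

```r
# Boundary crossing probabilities via the recursion in equations (2), (3),
# (5) and (6), using a simple Riemann-sum grid (illustrative only).
cross_prob <- function(a, b, info, theta_fun, grid_n = 2001L) {
  K <- length(info)
  t_k <- info / info[K]                   # information fractions
  alpha <- beta <- numeric(K)
  z <- seq(-10, 10, length.out = grid_n)  # grid spanning plausible Z values
  h <- z[2] - z[1]
  # Analysis 1: g_1 is a shifted standard normal density, equation (2)
  g <- dnorm(z - sqrt(info[1]) * theta_fun(t_k[1]))
  alpha[1] <- sum(g[z >= b[1]]) * h
  beta[1]  <- sum(g[z < a[1]]) * h
  for (k in seq_len(K)[-1]) {
    keep <- z >= a[k - 1] & z < b[k - 1]  # continuation region at analysis k - 1
    zc <- z[keep]
    gc <- g[keep]
    drift <- info[k] * theta_fun(t_k[k]) - info[k - 1] * theta_fun(t_k[k - 1])
    s <- sqrt(info[k] - info[k - 1])
    # g_k(z) = integral of f_k(z_{k-1}, z) g_{k-1}(z_{k-1}), equation (3)
    g <- vapply(z, function(zz) {
      f <- sqrt(info[k]) / s *
        dnorm((zz * sqrt(info[k]) - zc * sqrt(info[k - 1]) - drift) / s)
      sum(f * gc) * h
    }, numeric(1))
    alpha[k] <- sum(g[z >= b[k]]) * h     # equation (5)
    beta[k]  <- sum(g[z < a[k]]) * h      # equation (6)
  }
  list(alpha = alpha, beta = beta)
}

# Example with hypothetical symmetric bounds, two analyses, theta(t) = 0:
cross_prob(a = c(-2.80, -1.98), b = c(2.80, 1.98),
           info = c(50, 100), theta_fun = function(t) 0)
```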

Deriving spending boundaries

We can now derive boundaries satisfying given boundary crossing probabilities using equations (2-6) above. Suppose we have specified \(b_1,\ldots,b_{k-1}\) and \(a_1,\ldots,a_{k-1}\) and now wish to derive \(a_k\) and \(b_k\) such that equations (5) and (6) hold. We express the upper boundary crossing probability at analysis \(k\) as a function of the bound we wish to derive.

\[\pi_k(b;\theta) = \int_b^\infty g_k(z;\theta)dz\] \[\pi_k^\prime(b;\theta) =\frac{d}{db}\pi_k(b;\theta)= -g_k(b; \theta).\tag{7}\] Given a value \(\pi_k(b^{(i)};\theta)\), we can use a first-order Taylor series expansion to approximate

\[\pi_k(b;\theta)\approx \pi_k(b^{(i)};\theta)+(b-b^{(i)})\pi_k^\prime(b^{(i)};\theta).\] We set \(b^{(i+1)}\) such that

\[\alpha_k(\theta)=\pi_k(b^{(i)}; \theta) + (b^{(i+1)}-b^{(i)})\pi_k^\prime(b^{(i)};\theta).\]

Solving for \(b^{(i+1)}\) we have

\[b^{(i+1)} = b^{(i)} + \frac{\alpha_k(\theta) - \pi_k(b^{(i)};\theta)}{\pi_k^\prime(b^{(i)}; \theta)}= b^{(i)} - \frac{\alpha_k(\theta) - \pi_k(b^{(i)};\theta)}{g_k(b^{(i)}; \theta)}\tag{8}\] and iterate until \(|b^{(i+1)}-b^{(i)}|<\epsilon\) for some tolerance level \(\epsilon>0\) and \(\pi_k(b^{(i+1)};\theta)-\alpha_k(\theta)\) is suitably small. A simple starting value for any \(k\) is

\[b^{(0)} = \Phi^{-1}(1- \alpha_k(\theta)) + \sqrt{\mathcal{I}_k}\theta(t_k).\tag{9}\] Normally, \(b_k\) will be calculated with \(\theta(t_k)=0\) for \(k=1,2,\ldots,K\) which simplifies the above. However, \(a_k\) computed analogously will often use a non-zero \(\theta\) to enable so-called \(\beta\)-spending.
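
A sketch of this iteration is below; it assumes a grid representation `(z, g)` of the subdensity \(g_k\), such as the one produced inside `cross_prob` above, and the function name and interpolation shortcut are ours:

```r
# Newton iteration of equation (8) with starting value (9) for the upper
# bound b_k given a target crossing probability; (z, g) is a grid
# representation of the subdensity g_k (hypothetical interface).
upper_bound <- function(target, z, g, info_k, theta_k, eps = 1e-8) {
  h <- z[2] - z[1]
  pi_fun <- function(b) sum(g[z >= b]) * h        # pi_k(b; theta), equation (5)
  g_fun  <- function(b) approx(z, g, xout = b)$y  # g_k(b; theta) by interpolation
  b <- qnorm(1 - target) + sqrt(info_k) * theta_k # starting value, equation (9)
  repeat {
    b_new <- b - (target - pi_fun(b)) / g_fun(b)  # update, equation (8)
    if (abs(b_new - b) < eps) return(b_new)
    b <- b_new
  }
}

# For k = 1, g_1 is the shifted normal density of equation (2) and the
# starting value (9) is already essentially exact:
z <- seq(-10, 10, length.out = 2001)
upper_bound(target = 0.025, z = z, g = dnorm(z), info_k = 1, theta_k = 0)
# ~ qnorm(0.975) = 1.96, up to grid discretization error
```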

Numerical integration

The numerical integration required to compute boundary crossing probabilities and derive boundaries is the same as that defined in Section 19.3 of Jennison and Turnbull (2000). The single change is that equation (3) above, which incorporates a non-proportional effect size, replaces its constant effect size counterpart, equation (4), used by Jennison and Turnbull (2000).

References

Jennison, Christopher, and Bruce W. Turnbull. 2000. Group Sequential Methods with Applications to Clinical Trials. Boca Raton, FL: Chapman & Hall/CRC.

Proschan, Michael A., K. K. Gordon Lan, and Janet Turk Wittes. 2006. Statistical Monitoring of Clinical Trials: A Unified Approach. New York, NY: Springer.

Tsiatis, Anastasios A. 1982. “Repeated Significance Testing for a General Class of Statistics Used in Censored Survival Analysis.” Journal of the American Statistical Association 77: 855–61.