December 6, 2021

Welcome

Outline

  • 3:30: Introduction and background theory (30 minutes)

  • 4:00: Proportional hazards applications with Shiny app (25 minutes)

  • 4:25: Intro to non-proportional hazards (NPH; 5 minutes)

  • 4:30: Software and piecewise model (15 min)

  • 4:45: Average hazard ratio (AHR; 20 minutes)

  • 5:05: Break (10 minutes)

  • 5:15: NPH design with logrank test (25 minutes)

  • 5:40: Weighted logrank and combination tests (40 minutes)

  • 6:20: Summary and questions (10 minutes)

Disclaimer

  • All opinions expressed are those of the presenters and not Merck Sharp & Dohme Corp., a subsidiary of Merck & Co., Inc., Kenilworth, NJ, USA.

  • Some slides need to be scrolled down to see the full content.

Software

Course resources

  • Book at https://keaven.github.io/gsd-deming/
    • Instructions there to install software, download repository at https://github.com/keaven/gsd-deming
    • Directories there we will use:
      • data/: contains design files for examples; also simulation results
      • vignettes/: reports produced by Shiny app to summarize designs
      • simulation/: R code and simulation data for the last part of course

Background

Group sequential design

  • Analyze a trial repeatedly at planned intervals
  • Group of data added at each analysis
  • Group sequential design derives boundaries and sample size to
    • Control Type I error
    • Ensure power
    • Stop early for futility or efficacy finding
  • Takes advantage of the correlation among the group sequential tests

Independent increments process - group sequential design

  • Asymptotic normal assumption works well for most trials
  • Scharfstein et al (1997) demonstrated \(Z = (Z_1, \dots, Z_K)\) is asymptotically normal with independent increments.
  • We extend the canonical distribution notation of Jennison and Turnbull (2000):
    • \(Z_k\) is the test statistic for treatment effect at analysis \(k=1,\ldots,K\)
    • \((Z_1,\ldots,Z_K)\) is multivariate normal
    • \(E(Z_k) = \theta_k \sqrt{I_k}, k=1,\ldots,K\)
    • \(\hbox{Cov}(Z_{k_1}, Z_{k_2})=\sqrt{I_{k_1}/I_{k_2}},\) \(1\le k_1\le k_2\le K\)
  • \(I_k\) is the Fisher information for \(\theta_k, k=1,\ldots, K\).
  • For most of this training \(\theta_k=-E(\log(HR_k))\)
  • Simulation can be used to examine the accuracy of the normal approximation; a small code sketch of this canonical form follows
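The sketch below builds the mean vector and correlation matrix of \((Z_1,\ldots,Z_K)\) under the canonical form above; the information levels and effect size are hypothetical values chosen purely for illustration.

# Canonical form sketch: hypothetical information levels at K = 3 analyses
info <- c(25, 50, 75) # assumed Fisher information I_1, I_2, I_3
theta <- 0.35         # assumed effect size, e.g., -log(HR)

# E(Z_k) = theta * sqrt(I_k)
meanZ <- theta * sqrt(info)
# Corr(Z_j, Z_k) = sqrt(I_j / I_k) for j <= k
corrZ <- outer(info, info, function(i, j) sqrt(pmin(i, j) / pmax(i, j)))
round(meanZ, 2)
round(corrZ, 3)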

Assumptions for time-to-event endpoints

  • Lachin and Foulkes (1986) piecewise model used to approximate arbitrary enrollment, survival, dropout patterns
    • Fixed enrollment and follow-up period
      • increase enrollment rates to obtain power
    • Proportional hazards: \(\theta_k=\theta\) (constant)
    • \(I_k\) is approximately proportional to the event count (Schoenfeld (1981))
  • We generalize to average hazard ratio (AHR; Mukhopadhyay et al (2020)) for non-proportional hazards (NPH) tested with logrank (\(\theta_k\) varies with \(k\))
  • Generalized to cover weighted logrank as in Yung and Liu (2020) in final section today
    • Fixed design only for RMST
    • Combination tests with multiple weighted logrank tests have more complex correlation structure (Karrison and others (2016))

Testing bounds

  • Bounds \(-\infty \le a_k \le b_k \le \infty\) for \(k=1,\dots,K\)
  • Null hypothesis \(H_0:\) \(\theta_k=0, k=1,\ldots,K\)
  • Alternate hypothesis \(H_1:\) \(\theta_k > 0\) for some \(k=1,\ldots,K\)
  • Actions at analysis \(k=1,\ldots,K\):
    • Reject \(H_0\) at analysis \(k\) if \(Z_k\ge b_k\)
    • Do not reject \(H_0\) and consider stopping if \(Z_k<a_k\), \(k<K\)
    • Continue trial if \(a_k\le Z_k< b_k\), \(k<K\)
  • Bounds are generally considered advisory for stopping trial, not binding

Boundary crossing probabilities

  • Upper boundary crossing probabilities
    • \(u_k(\theta) = \text{Pr}_\theta(\{Z_k \ge b_k\} \cap_{j=1}^{k-1} \{a_j \le Z_j < b_j\})\)
  • Lower boundary crossing probabilities
    • \(l_k(\theta) = \text{Pr}_\theta (\{Z_k < a_k\} \cap_{j=1}^{k-1} \{a_j \le Z_j < b_j\})\)
  • Null hypothesis: 1-sided Type I error
    • \(a_k = -\infty\) for all \(k\) generally used for Type I error
      • Non-binding lower bound
    • \(\alpha = \sum_{k=1}^{K} u_k(0) = \sum_{k=1}^{K} \text{Pr}(\{Z_k \ge b_k\} \cap_{j=1}^{k-1} \{a_j \le Z_j < b_j\} \mid H_0)\)

Boundary crossing probabilities (cont.)

  • Alternate hypothesis: Type II error \(\beta= 1 - \hbox{power}\)

    \[-\infty\le a_k<b_k,\quad k=1,\ldots,K-1, \qquad a_K\le b_K\]

    If \(a_K = b_K\), then the total Type II error is \[\beta = \sum_{k=1}^{K} l_k = \sum_{k=1}^{K} \text{Pr}(\{Z_k < a_k\} \cap_{j=1}^{k-1} \{a_j \le Z_j < b_j\}\mid H_1)\]
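A numeric sketch of these boundary crossing probabilities using gsDesign::gsProbability(); the bounds, information levels, and \(\theta\) values below are hypothetical and chosen for illustration only.

library(gsDesign)

# Hypothetical 3-analysis design: assumed efficacy bounds b_k, futility bounds a_k
# with a_K = b_K, and assumed statistical information n.I at each analysis
b <- c(3.47, 2.45, 2.00)
a <- c(0, 1, 2.00)
n.I <- c(30, 60, 90)

x <- gsProbability(k = 3, theta = c(0, 0.35), n.I = n.I, a = a, b = b)
# u_k(theta): upper boundary crossing probabilities (rows = analyses, columns = theta)
x$upper$prob
# l_k(theta): lower boundary crossing probabilities
x$lower$prob
# Since a_K = b_K here, Type II error = sum of l_k under the alternative (2nd column)
sum(x$lower$prob[, 2])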

Symmetric bounds

Test each treatment for superiority vs the other

Usually not of interest in pharmaceutical industry

Futility bounds

Give up if the experimental arm is not trending favorably versus control?

  • Ethics of continuing trial if unlikely to show superiority?
  • Excess risk of crossing lower bound too soon?

Asymmetric 2-sided testing

Give up if experimental arm trending worse than control

Sample size

  • Given boundary computation, sample size solves for enrollment rate
  • Fixing relative enrollment rates, dropout rates, and trial duration enables the Lachin and Foulkes (1986) and our AHR methods
  • Under proportional hazards, you can also fix max enrollment rate and solve for trial duration (Kim and Tsiatis 1990)
    • You can easily create scenarios here for which there is no solution
    • Error message: “An error has occurred. Check your logs or contact the app author for clarification.”
    • Need to adjust parameters until a solution is found or use Lachin and Foulkes (1986)

Boundary types

  • Approaches to calculate decision boundary:

    • The error spending approach: specify boundary crossing probabilities at each analysis. This is most commonly done with the error spending function approach (Lan and DeMets 1983).

    • The boundary family approach: specify how big boundary values should be relative to each other and adjust these relative values by a constant multiple to control overall error rates. The commonly applied boundary family include:

      • Haybittle-Peto boundary (Haybittle (1971), Peto et al (1977))
      • Wang-Tsiatis boundary (Wang and Tsiatis 1987)
        • Pocock boundary (Pocock 1977)
        • O’Brien and Fleming boundary (O’Brien and Fleming 1979)
      • Slud and Wei (1982) and Fleming et al (1984) set fixed IA boundary crossing probabilities
        • Hybrid of boundary family/spending approach

Boundary families

Boundary family - Haybittle-Peto boundary

  • Main idea:

    • Interim Z-score boundary: 3
    • Final Z-score boundary: 1.96 (slight \(\alpha\) inflation)

Boundary family - Haybittle-Peto (cont’d)

  • Modified Haybittle-Peto procedure 1:

    Bonferroni adjustment:

    • For the first \(K-1\) analyses, the significance level is set at 0.001;
    • For the final analysis, the significance level is set at \(0.05 - 0.001 \times (K-1)\);
      • More generous final bound if you adjust for test correlations
  • Advantages:

    • Avoid type I error inflation
    • Does not require equally spaced analyses

Boundary family - Wang-Tsiatis bounds

  • Definition:

    For 2-sided testing, Wang and Tsiatis (1987) defined the boundary function for the \(k\)-th look as \[ \Gamma(\alpha, K, \Delta) k^{\Delta - 0.5}, \] where \(\Gamma(\alpha, K, \Delta)\) is a constant chosen so that the level of significance is equal to \(\alpha\).

  • Two special cases:

    • \(\Delta = 0.5\): Pocock bounds
    • \(\Delta = 0\): O’Brien-Fleming bounds.
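A hedged sketch of these special cases; the gsDesign package accepts sfu = "Pocock", "OF", or "WT" (with sfupar giving \(\Delta\)) for the classical boundary families, and the \(\alpha\), \(\beta\), and number of analyses below are chosen for illustration.

library(gsDesign)

# Classical boundary families for K = 4 equally spaced looks,
# 2-sided symmetric testing with 0.025 per side (2-sided alpha = 0.05)
pocock <- gsDesign(k = 4, test.type = 2, alpha = 0.025, beta = 0.1, sfu = "Pocock")
obf <- gsDesign(k = 4, test.type = 2, alpha = 0.025, beta = 0.1, sfu = "OF")
wt <- gsDesign(k = 4, test.type = 2, alpha = 0.025, beta = 0.1, sfu = "WT", sfupar = 0.25)

round(pocock$upper$bound, 3) # constant Z bounds (Delta = 0.5), about 2.36
round(obf$upper$bound, 3)    # steeply decreasing Z bounds (Delta = 0)
round(wt$upper$bound, 3)     # intermediate bounds for Delta = 0.25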

Wang-Tsiatis example - Pocock boundary

For 2-sided testing, the Pocock procedure rejects at the \(k\)-th of \(K\) equally spaced looks if \[|Z_k| > c_P(K),\] where \(c_P(K)\) is fixed given \(K\) such that \(\text{Pr}(\cup_{k=1}^{K} |Z_k| > c_P(K)) = \alpha\).

Wang-Tsiatis example - Pocock boundary (cont’d)

  • Example:
Total number of looks (K) \(\alpha = 0.01\) \(\alpha = 0.05\) \(\alpha = 0.1\)
1 2.576 1.960 1.645
2 2.772 2.178 1.875
4 2.939 2.361 2.067
8 3.078 2.512 2.225
\(\infty\) \(\infty\) \(\infty\) \(\infty\)

We will reject \(H_0\) if \(|Z(k/4)| > 2.361\) for \(k = 1,2,3,4\) (final analysis).

  • Weakness:

    • Overly aggressive interim bounds

    • High price for the end of the trial.

      • With \(\alpha = 0.05\) and \(K=4\) analyses, the absolute value of the z-score must exceed 2.361 to be declared significant at any analysis, including the final one (compared with 1.96 for a fixed design).
      • Z=2.361 translates to a two-tailed “nominal p-value”: \(2(1 − \Phi(2.361))\) = 0.018.
    • \(c_P(K) \to +\infty\) as \(K \to + \infty\).

    • Requires equally spaced looks.

Wang-Tsiatis example - O’Brien-Fleming boundary

  • Early stage: very conservative \(\Rightarrow\) large boundary at the beginning;
  • Final stage: nominal value close to the overall value of the design \(\Rightarrow \approx 1.96\).
  • Regulators generally like these bounds

Wang-Tsiatis example - O’Brien-Fleming boundary (cont’d)

Total number of looks (K) \(\alpha = 0.01\) \(\alpha = 0.05\) \(\alpha = 0.1\)
1 2.576 1.960 1.645
2 2.580 1.977 1.678
4 2.609 2.024 1.733
8 2.648 2.072 1.786
16 2.684 2.114 1.830
\(\infty\) 2.807 2.241 1.960

Example:

  • The tabled value 2.024 is the flat B-value boundary.
  • The flat B-value boundary is easily transformed into a decreasing Z-score boundary via \(Z(t) = B(t)/\sqrt{t}\):
    • \(2.024/\sqrt{1/4} = 4.05\)
    • \(2.024/\sqrt{2/4} = 2.86\)
    • \(2.024/\sqrt{3/4} = 2.34\)
    • \(2.024/\sqrt{4/4} = 2.02\)
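A quick check of this conversion (simple arithmetic added here for convenience):

t <- (1:4) / 4
round(2.024 / sqrt(t), 2) # 4.05 2.86 2.34 2.02, matching the values above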

Boundary families - summary

Boundary families - summary (cont’d)

  • Haybittle-Peto
    • Boundary: Z = 3 at the first \(K-1\) (interim) analyses and 1.96 at the final analysis
    • Advantages: simple to implement
    • Disadvantages: slight \(\alpha\) inflation unless adjusted
  • Pocock
    • Boundary: a constant decision boundary for the Z-score
    • Disadvantages: (1) requires the same level of evidence for early and late looks at the data, so it pays a larger price at the final analysis; (2) requires equally spaced looks
  • O’Brien-Fleming
    • Boundary: constant B-value boundaries, i.e., a steep decrease in Z-score boundaries
    • Advantages: pays a smaller price at the final analysis
    • Disadvantages: too conservative in the early stages?

Spending function boundaries

Spending function

Lan-DeMets spending functions to approximate boundary families

Hwang-Shih-DeCani (gamma) spending functions
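As a minimal sketch (the information fractions are chosen for illustration), the cumulative \(\alpha\) spent by these two spending function families can be evaluated directly with gsDesign:

library(gsDesign)

t <- c(0.25, 0.5, 0.75, 1) # assumed information fractions
# Lan-DeMets spending function approximating O'Brien-Fleming bounds
sfLDOF(alpha = 0.025, t = t)$spend
# Hwang-Shih-DeCani (gamma) spending function; gamma = -4 is O'Brien-Fleming-like
sfHSD(alpha = 0.025, t = t, param = -4)$spend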

What is spending time?

  • Information fraction, Lan and DeMets (1983)
    • Time-to-event: fraction of planned final events in analysis
    • Normal or binomial: fraction of planned final sample size in analysis
    • Usually expected by regulators
  • Calendar fraction, Lan and DeMets (1989)
    • Fraction of trial planned calendar duration at analysis
  • Minimum of planned and actual information fraction
    • Probably not advised yet

General methods and proportional hazards

Proportional hazards approach

Lachin and Foulkes (1986) method

  • Sample size and power derivation
  • Time-to-event endpoint
  • 2-arm trial
  • Logrank (or Cox model coefficient) to test treatment effect
  • Constant treatment effect over time (proportional hazards or PH)
  • Number of events drives power, regardless of study duration

Shiny app for proportional hazards

Metastatic oncology example

  • KEYNOTE 189 trial (Gandhi et al (2018))
    • Endpoints: progression free survival (PFS) and overall survival (OS) in patients
    • Indication: previously untreated metastatic non-small cell lung cancer (NSCLC)
    • Treatments: chemotherapy +/- pembrolizumab
    • Randomized 2:1 to an add-on of pembrolizumab or placebo
    • Type I error (1-sided): 0.025 familywise error rate (FWER) split between PFS (\(\alpha=0.0095\)) and OS (\(\alpha=0.0155\))
    • Graphical method for \(\alpha\)-control in group sequential design (Maurer and Bretz (2013)) used

Key aspects of the design as documented in the protocol accompanying Gandhi et al (2018).

Metastatic oncology: OS design approximation (continued)

  • \(\alpha=0.0155\)
  • Control group survival: exponential median=13 months
  • Exponential dropout rate of 0.133% per month
  • 90% power to detect a hazard ratio (HR) of 0.70025
  • 2:1 randomization, experimental:control
  • Enrollment over 1 year
  • While not specified in the protocol, we have further assumed:
    • Trial duration: 35 months.
    • Observed deaths of 240, 332 and 416 at the 3 planned analyses
    • A one-sided bound using the Lan and DeMets (1983) spending function approximating an O’Brien-Fleming bound; a sketch of this design approximation follows.
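As noted in the last bullet, a sketch of this OS design approximation follows. It mirrors the gsSurv() call used later in this training; the object name KN189os is ours, and since the protocol's exact software inputs are not available, treat the call as illustrative only.

library(gsDesign)

KN189os <- gsSurv(
  k = 3,                      # 3 planned analyses
  test.type = 1,              # 1-sided efficacy testing
  alpha = 0.0155,             # alpha allocated to OS
  beta = 0.1,                 # 90% power
  timing = c(240, 332) / 416, # interim timing from planned death counts
  sfu = sfLDOF,               # Lan-DeMets O'Brien-Fleming spending approximation
  lambdaC = log(2) / 13,      # control: exponential with 13-month median
  hr = 0.70025,               # hazard ratio to detect
  eta = 0.00133,              # 0.133% dropout per month
  gamma = 1,                  # relative enrollment rate (scaled to achieve power)
  R = 12,                     # enrollment over 1 year
  T = 35,                     # 35-month trial duration
  minfup = 23,                # minimum follow-up = 35 - 12 months
  ratio = 2                   # 2:1 experimental:control randomization
)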

Cardiovascular outcomes reduction

  • AFCAPS/TEXCAPS trial: use of lovastatin to reduce cardiovascular outcomes
  • Design described in Downs et al (1997)
  • Results reported in Downs et al (1998)
  • Reproduction here not exact mainly due to choice of Lachin and Foulkes (1986)
    • Little difference between methods
    • Efficacy bounds will be exactly as proposed

Cardiovascular outcomes: key parameters

  • 5 years minimum follow-up of all patients enrolled
  • Interim analyses after 0.375 and 0.75 of final planned event count has accrued
  • 2-sided bound using the Hwang et al (1990) spending function with parameter \(\gamma = -4\) to approximate an O’Brien-Fleming bound
  • We arbitrarily set the following parameters to match design:
    • Power of 90% for a hazard ratio of 0.6921846; this is slightly different than the 0.7 hazard ratio suggested in Downs et al (1997)
    • Enrollment duration of 1/2 year with constant enrollment.
    • An exponential failure rate of 0.01131 per year which is nearly identical to the annual failure rate of 0.01125.
    • An exponential dropout rate of 0.004 per year which is nearly identical to the annual dropout rate of 0.00399.

Cardiovascular outcomes non-inferiority: EXAMINE trial

  • Indication: treatment of diabetes
  • Treatments: DPP4 inhibitor alogliptin compared to placebo
  • Primary endpoint: major cardiovascular outcomes (MACE)
  • Objective: establish non-inferiority
  • Results in White et al (2013)
  • Design in White et al (2011)
  • We approximate the design and primary analysis evaluation here.
  • Software and design assumptions not completely clear; not exact design reproduction

EXAMINE trial: Key assumptions

  • Primary analysis: stratified Cox model for MACE
    • 1-sided repeated confidence interval for HR at each analysis.
    • Analysis to rule out HR > 1.3, but also tests superiority.
    • Analyses planned after 550, 600, 650 MACE events.
    • O’Brien-Fleming-like spending function Lan and DeMets (1983).
    • 2.5% Type I error.
    • Approximately 91% power.
    • 3.5% annual MACE event rate.
    • Uniform enrollment over 2 years.
    • 4.75 years trial duration.
    • 1% annual loss-to-follow-up rate.
    • Software: EAST 5 (Cytel).

Cure model

Poisson mixture cure model we consider:

\[S(t)= \exp(-\theta (1 - \exp(-\lambda t))).\]

Note that:

  • \(1-\exp(-\lambda t)\) is the CDF for an exponential distribution
    • can be replaced by arbitrary continuous CDF.
  • As \(t\rightarrow \infty\), \(S(t)\rightarrow\exp(-\theta)\) (cure rate).
  • Model useful when historical data suggests plateau in survival.
  • PH model: experimental survival \(S_E(t)=S_C(t)^{HR}\).
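A small sketch of this cure model; the values of \(\theta\), \(\lambda\), and the hazard ratio below are assumptions for illustration.

# Poisson mixture cure model survival; plateaus at exp(-theta) as t grows
cureSurv <- function(t, theta, lambda) exp(-theta * (1 - exp(-lambda * t)))

t <- 0:60                                            # months
sC <- cureSurv(t, theta = 1.2, lambda = log(2) / 12) # control arm (assumed values)
sE <- sC^0.7                                         # experimental arm under PH, HR = 0.7

exp(-1.2)   # cure rate implied by theta = 1.2
tail(sC, 1) # control survival at 60 months, approaching the cure rate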

Survival model assumptions

More details in book.

Cure model: Expected event accumulation over time

  • Event accumulation over time can be very sensitive to many trial design assumptions.
  • Generally, we are trying to mimic a slowing of event accumulation over time.
  • Assume 18 month enrollment with 6-month ramp-up.

Expected event accrual over time

Potential advantages/disadvantages of calendar spending

  • Quite possible that event rate design assumptions incorrect
  • Good to ensure duration of follow-up is adequate to evaluate tail behavior/plateau in survival
  • Ensures adequate follow-up to estimate relevant parts of survival curve
  • Limits trial to relevant clinical and practical duration
  • May be underpowered if events accrue slowly
  • Probably less useful for high-risk endpoints (e.g., metastatic cancer)
  • Regulatory resistance?

Exercise

Average hazard ratio approach

Delayed effect

  • Recurrent head and neck squamous cell carcinoma
  • Pembrolizumab vs Standard of Care (SOC)
  • 1:1 randomization, N=495
  • Primary endpoint: OS
  • 90% power for HR=0.7, 1-sided \(\alpha=0.025\)
  • Design (Am 11; protocol) and results
    • Proceeded to final analysis (388 deaths; 340 planned)
      • 1-sided nominal p-value for OS: 0.0161
  • 2 interim analyses planned (144 and 216 deaths; IF = 0.42, 0.64)
    • IF = information fraction
    • Efficacy: Hwang-Shih-DeCani (HSD) spending, \(\gamma=-4\)
    • Futility: non-binding, \(\beta\)-spending, HSD, \(\gamma = -16\)
KEYNOTE 040: Overall Survival

Cohen et al (2019)

Delayed effect not just in oncology

Cholesterol lowering and mortality Miettinen et al (1997)

Scandinavian Simvastatin Survival Study

Proportional hazards

  • First-line metastatic non-small cell lung cancer (NSCLC)
  • Chemo + Pembrolizumab vs Chemo
  • 2:1 randomization, N=616
  • Primary endpoints
    • Progression free survival (PFS)
    • Overall survival (OS)
  • Potential for delayed OS effect not realized
  • Median follow-up: 10.5 months
  • Design (Am 7; protocol)
  • Stopped at first IA (235 deaths; IF = 0.56)
  • Plan for final analysis: 416 deaths
  • Adjusted 1-sided p-value for OS: 0.00559
  • O’Brien-Fleming spending
KEYNOTE 189

Gandhi et al (2018)

Crossing survival curves

  • Recurrent advanced gastric or gastro-oesophageal junction cancer
  • Pembrolizumab vs Paclitaxel
  • 1:1 randomization, N=360 planned (N=395 actual) in CPS \(\ge\) 1 population
  • Primary endpoints:
    • OS in PD-L1 CPS \(\ge\) 1
      • 91% power for HR=0.67, 1-sided \(\alpha=0.0215\)
    • PFS in PD-L1 CPS \(\ge\) 1
  • Design (Am 11; protocol) and results in CPS \(\ge\) 1
    • Proceeded to final analysis (326 deaths; 290 planned)
    • 1 IA planned (240 deaths; IF = 0.83)
      • Efficacy: Hwang-Shih-DeCani (HSD) spending, \(\gamma=-4\)
KEYNOTE 061: Overall Survival in CPS \(\ge\) 1

Shitara et al (2018)

  • No futility bound
  • 1-sided nominal p-value for OS: 0.0421 (threshold: p=0.0135)
  • Post hoc FH(\(\rho=1,\gamma=1\)): p=0.0009

Summary of issues

  • What is the impact of (potentially) delayed treatment effect on trial design and analysis?
  • Test statistics used (logrank for this section; next section suggests alternatives)
  • Sample size vs duration of follow-up
  • Timing of analyses
  • Futility bounds
  • Updating bounds
  • Multiplicity adjustments

Software installation

simtrial overview

  • Time-to-event trial simulation
  • Piecewise model
  • Logrank and weighted logrank analysis
  • Simulating fixed and group sequential design
    • Also potential to simulate adaptive design
  • Reverse engineered datasets from NPH Working Group
  • Validation near completion
    • Thanks to AP colleagues and Amin Shirazi

simtrial functions

  • Generating simulated datasets
    • simfix(): Simulation of fixed sample size design for time-to-event endpoint
    • simfix2simPWSurv(): Conversion of enrollment and failure rates from simfix() to simPWSurv() format
    • simPWSurv(): Simulate a stratified time-to-event outcome randomized trial
  • Cutting data for analysis
    • cutData(): Cut a dataset for analysis at a specified date
    • cutDataAtCount(): Cut a dataset for analysis at a specified event count
    • getCutDateForCount(): Get date at which an event count is reached
  • Analysis functions
    • tenFH(): Fleming-Harrington weighted logrank tests
    • tenFHcorr(): Fleming-Harrington weighted logrank tests plus correlations
    • tensurv(): Process survival data into counting process format
    • pMaxCombo(): MaxCombo p-value
    • pwexpfit(): Piecewise exponential survival estimation
    • wMB(): Magirr and Burman modestly weighted logrank tests
  • Lower level functions
    • fixedBlockRand(): Permuted fixed block randomization
    • rpwenroll(): Generate piecewise exponential enrollment
    • rpwexp(): The piecewise exponential distribution

simtrial: reverse engineered datasets

From NPH Working Group

  • Ex1delayedEffect: Time-to-event data example 1 for non-proportional hazards working group
  • Ex2delayedEffect: Time-to-event data example 2 for non-proportional hazards working group
  • Ex3curewithph: Time-to-event data example 3 for non-proportional hazards working group
  • Ex4belly: Time-to-event data example 4 for non-proportional hazards working group
  • Ex5widening: Time-to-event data example 5 for non-proportional hazards working group
  • Ex6crossing: Time-to-event data example 6 for non-proportional hazards working group
  • MBdelayed: Simulated survival dataset with delayed treatment effect

gsDesign2

  • AHR(): Average hazard ratio under non-proportional hazards (test version)
  • eAccrual(): Piecewise constant expected accrual
  • eEvents_df(): Expected events observed under piecewise exponential model
  • ppwe(): Estimate piecewise exponential cumulative distribution function
  • s2pwe(): Approximate survival distribution with piecewise exponential distribution
  • tEvents(): Predict time at which a targeted event count is achieved

We will focus on AHR(), ppwe(), and s2pwe() in this training.

gsdmvn

Power and design functions extending the Jennison and Turnbull (2000) computational model to non-constant treatment effects. Partial list of functions:

  • Design and power under average hazard ratio model
    • gs_power_ahr(): Power computation
    • gs_design_ahr(): Design computations
  • Bound support
    • gs_b(): direct input of bounds
    • gs_spending_bound(): spending function bounds
  • Other tests
    • Design and power with weighted logrank and MaxCombo will be discussed in next section.

The piecewise model

Introducing the piecewise model

  • Simple model to approximate arbitrary patterns of
    • Enrollment: piecewise constant enrollment rates
    • Failure rates: piecewise exponential
    • Dropout rates: piecewise exponential
  • Combined tools for designing and evaluating designs
    • Asymptotic approach using average hazard ratio (AHR)
    • Simulation tools to confirm asymptotic approximations
    • No requirement for proportional hazards
    • Stick with logrank for this section

Piecewise constant enrollment

Set up piecewise constant enrollment rates

enrollRates <- tibble::tribble(
  ~duration, ~rate,
  # 5/month for 6 months
  6, 5,
  # 20/month until enrollment complete
  6, 20
)

Get enrollment times for 150 observations

set.seed(123)
Month <- simtrial::rpwenroll(
  n = 150,
  enrollRates = enrollRates
)

Question

  • Should we assume all enrollment
    • by targeted time?
    • random, expected enrollment matches target at targeted time?
  • simtrial package assumes the latter
    • Also used in AHR approach here
  • npsurvSS package assumes the former
    • Used for weighted logrank and combination tests in next section

Failure rates

Note: under exponential distribution, median (\(m\)) and failure rate (\(\lambda\)) related:

\[ \begin{align} m &= \log(2)/\lambda \\ \lambda &= \log(2)/m \end{align} \]

Specify failure rates and dropout rates in same table

# Control: exponential with 15 month median
# HR: 1 for 4 months, 0.6 thereafter
failRates <- tibble::tribble(
  ~Stratum, ~duration, ~failRate, ~hr, ~dropoutRate,
  "All", 4, log(2) / 15, 1, .001,
  "All", 100, log(2) / 15, 0.6, .001
)

Piecewise exponential approximation

library(dplyr)
library(gt)

# Log-logistic survival function, S(x) = 1 / (1 + (x / alpha)^beta)
dloglogis <- function(x, alpha = 1, beta = 4) {
  1 / (1 + (x / alpha)^beta)
}
times10 <- c(seq(1 / 3, 1, 1 / 3), 2, 3)
# Use s2pwe() to generate piecewise approximation
gsDesign2::s2pwe(
  times = times10,
  survival = dloglogis(times10, alpha = .5, beta = 4)
) %>%
  gt() %>%
  fmt_number(columns = 1:2, decimals = 3)
duration rate
0.333 0.541
0.333 3.736
0.333 4.223
1.000 2.716
1.000 1.619
  • Approximating log-logistic distribution plotted above using piecewise model here
  • Can approximate any survival distribution

Approximating using piecewise model

Break (10 minutes)

Average hazard ratio

Overview

  • Definition of average hazard ratio
  • Literature review
  • Asymptotic theory for group sequential design
  • Verification by simulation

Average hazard ratio (AHR)

  • Geometric mean hazard ratio (Mukhopadhyay et al (2020))
  • Exponentiate: average \(\log(\hbox{HR})\) weighted by expected events per interval
Interval HR -ln(HR) Expected Events
0-4 1.0 0.00 d1
>4 0.6 0.51 d2

\[\hbox{AHR} = \exp\left( \frac{d_1 \log(1) + d_2 \log(0.6)}{d_1 + d_2}\right)\]
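A small numeric sketch of this formula; the expected event counts d1 and d2 are hypothetical.

d <- c(d1 = 50, d2 = 150) # assumed expected events in months 0-4 and > 4
hr <- c(1, 0.6)
AHR <- exp(sum(d * log(hr)) / sum(d))
AHR # about 0.68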

Approach to asymptotic theory

  • Assume the hazard rate \(\lambda_m\) is constant on the interval \((t_{m-1},t_m]\)
  • Assume total time (all observations) in interval is \(T_m\) (total time on test)
  • Assume event count in interval is \(D_m\)
  • Hazard rate estimate in this interval: \[\hat{\lambda}_m= D_m/T_m\]
  • Asymptotic approximation (delta method; see Lu Tian slides) \[\log(\hat\lambda_m)\sim \hbox{Normal}(\log(\lambda_m),\sigma^2=1/D_m).\]

Log hazard ratio estimate

  • Now consider hazard ratio for two treatment arms, \(j=0,1\) (0=control, 1=experimental)
  • Same exponential model
  • Log hazard ratio in interval \[\beta_m=\log(\lambda_{1m}/\lambda_{0m})=\log(\lambda_{1m}) - \log(\lambda_{0m})\]
  • Asymptotic approximation \[\hat{\beta}_m\sim \hbox{Normal}(\beta_m,1/D_{0m}+ 1/D_{1m})\]
  • Now we replace \(D_{0m}, D_{1m}\) with their expected values \(E(D_{0m}), E(D_{1m})\)

Inverse variance weighted \(\beta\)

  • Inverse variance weighting for interval \(m=1,2,\ldots,M\) \[ w_m=\frac{(1/E(D_{0m})+ 1/E(D_{1m}))^{-1}}{\sum_{j=1}^M(1/E(D_{0j})+ 1/E(D_{1j}))^{-1}} \] \[ \beta=\sum_{m=1}^M w_m\beta_m \]
  • Asymptotic approximation

\[ \hat{\beta}\sim \hbox{Normal}(\beta,\mathcal{I}^{-1}) \] where \[ \mathcal{I}=\sum_{m=1}^M (1/E(D_{0m}) + 1/E(D_{1m}))^{-1} \]

  • Under the null hypothesis, this is like the Schoenfeld (1981) approximation \[ \mathcal{I}=\xi\sum_{m=1}^M E(D_{0m}) \] where \(\xi=1/2\) for 1:1 randomization
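A small numeric sketch of the inverse variance weighting above; the expected event counts per interval and the interval hazard ratios are hypothetical.

ED0 <- c(30, 70)         # assumed expected control events by interval
ED1 <- c(30, 55)         # assumed expected experimental events by interval
beta_m <- log(c(1, 0.6)) # interval log hazard ratios

v_m <- 1 / ED0 + 1 / ED1        # variance of each interval estimate
w_m <- (1 / v_m) / sum(1 / v_m) # inverse variance weights
beta <- sum(w_m * beta_m)       # weighted log(AHR)
info <- sum(1 / v_m)            # statistical information
c(AHR = exp(beta), info = info)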

AHR over time

  • Constant enrollment rate, 12 month targeted enrollment
  • Exponential dropout, 0.001 per month
  • Control: exponential, median = 15 months
  • HR
    • 1 in months 0-4
    • 0.6 thereafter

Power by AHR

  • Schoenfeld (1981) approximation works well plugging in average hazard ratio
  • Use gsDesign::nEvents()
Events <- 332
# if beta is NULL and n= # of events,
# power is computed instead of events required
Power <- gsDesign::nEvents(n = Events, beta = NULL, hr = c(.6, .7, .8))

Power by AHR

Assume 332 events

AHR as estimand

  • Some argue this is a bad idea
    • e.g., hazards of hazard ratios (Hernán (2010))
  • Pro’s
    • Estimated by Cox regression
    • AHR concept makes more clear what this is
    • Logrank is widely-accepted corresponding test
    • Both asymptotic approximations and simulation supported
      • This includes group sequential design
    • Easy to approximate arbitrary enrollment, failure and dropout patterns
  • Cautions
    • No single estimand sufficiently describes NPH differences
    • Early interim analysis (futility, efficacy) should anticipate possible reduced effect

Expected accrual of endpoints

AHR asymptotic approximation

Used the following code to get AHR and information at specified times:

library(tibble)

analysisTimes <- c(12, 20, 28, 36)
sampleSize <- 500
enrollRates <- tibble(Stratum = "All", duration = 12, rate = sampleSize / 12)
failRates <- tibble(
  Stratum = "All",
  duration = c(4, 100),
  failRate = log(2) / 15,
  hr = c(1, .6),
  dropoutRate = 0.001
)
ahr <- gsDesign2::AHR(
  enrollRates = enrollRates,
  failRates = failRates,
  totalDuration = analysisTimes,
  ratio = 1
)

Simulation code considerations

  • Many options for cutting data!
    • Time-based
    • Event-based
    • Max of time- and event-based
    • Limit final trial duration
  • Note that
    • Code here is detailed, but (hopefully) not complicated
    • AHR changes if event-based
    • Number of events variable if time-based

Simulation code

library(simtrial)
library(survival)
library(dplyr)

# Transform to simPWSurv() format
x <- simfix2simPWSurv(failRates)
nsim <- 10000 # number of simulations
# Set up matrix for simulation results
results <- matrix(0, nrow = nsim * 4, ncol = 6)
colnames(results) <- c("Sim", "Analysis", "Events", "beta", "var", "logrank")
ii <- 0 # index for results row
for (sim in 1:nsim) {
  # Simulate a trial
  ds <- simPWSurv(
    n = sampleSize,
    enrollRates = enrollRates,
    failRates = x$failRates,
    dropoutRates = x$dropoutRates
  )
  for (j in seq_along(analysisTimes)) {
    # Cut data at specified analysis times
    # Use cutDataAtCount to cut at event count
    dsc <- ds %>% cutData(analysisTimes[j])
    ii <- ii + 1
    results[ii, 1] <- sim
    results[ii, 2] <- j
    results[ii, 3] <- sum(dsc$event)
    cox <- coxph(Surv(tte, event) ~ Treatment, data = dsc)
    results[ii, 4] <- as.numeric(cox$coefficients)
    results[ii, 5] <- as.numeric(cox$var)
    # Logrank test
    Z <- dsc %>%
      tensurv(txval = "Experimental") %>%
      tenFH(rg = tibble::tibble(rho = 0, gamma = 0))
    results[ii, 6] <- as.numeric(Z$Z)
  }
}

Simulation summary

Simulation summary based on 10k replications

results <- tibble::as_tibble(results)
simsum <- results %>%
  group_by(Analysis) %>%
  summarize(
    # Geometric mean HR (exponentiated mean Cox coefficient) and mean event count
    AHR = exp(mean(beta)), Events = mean(Events),
    # info: inverse of the empirical variance of the Cox coefficient across simulations;
    # info0: null hypothesis approximation, events / 4 (1:1 randomization)
    info = 1 / var(beta), info0 = Events / 4
  )

Comparison of asymptotics vs simulation

Asymptotic Approximation

Time AHR Events info info0
12.00 0.84 107.39 26.37 26.85
20.00 0.74 207.90 50.67 51.97
28.00 0.70 279.10 68.23 69.78
36.00 0.68 331.29 81.38 82.82

10k simulations

AHR Events info info0
0.84 107.16 25.48 26.79
0.74 207.77 49.74 51.94
0.70 278.93 67.29 69.73
0.68 331.13 80.11 82.78
  • Simulations represent the truth (up to Monte Carlo error) since they sample from the actual distribution
  • Note that asymptotic info0 seems a better approximation of simulation than info
  • Using both info0 and info in design will make sample size a little more conservative
    • Not as conservative as simulation info

Distribution of Cox coefficient

library(ggplot2)

ggplot(results, aes(x = factor(Analysis), y = beta)) +
  geom_violin() +
  ggtitle("Distribution of Cox Coefficient by Analysis") +
  xlab("Analysis") +
  ylab("Cox coefficient")

Variability of results

9 simulations

Question: Do you really want to adapt sample size based on an early interim estimate of treatment effect?

Asymptotic approximation

  • Use of Tsiatis (1982) (also extends to weighted logrank)
  • Statistical information proportional to expected event counts as in Schoenfeld (1981)
  • Natural parameter: \(\log(\hbox{AHR})\)
  • Statistical information still proportional to number of events
  • Correlation still computed based on statistical information
  • Extension of Jennison and Turnbull (2000) calculations to non-constant effect size over time

Asymptotic distribution simplified

Statistical information at analysis: \(\mathcal{I}_k\), \(1\le k\le K\)

Proportion of final information at analysis \(k\): \(t_k =\mathcal{I}_k / \mathcal{I}_K\)

\[Z_k\sim \hbox{Normal}(\sqrt{\mathcal{I}_k} \theta(t_k),1)\] Multivariate normal with correlations for \(1\le j\le k\le K\):

\[\hbox{Corr}(Z_j,Z_k)=\sqrt{t_j/t_k}\]

General theory to AHR translation

  • Previously, the Cox coefficient \(\hat\beta\) was shown to approximate a weighted average of interval log hazard ratios, i.e., \(\log(\hbox{AHR})\)
  • How do we translate to extended canonical form with \(\theta(t_k)\), \(\mathcal{I_k}\), \(1\le k\le K\)?
    • Assume piecewise model
    • Fix some \(k\) in \(1,2,\ldots,K\)
      • Fix calendar time of analysis \(k\) relative to opening of enrollment; can solve for this if event-based spending fraction desired
      • Compute \(\mathcal{I_k}\) as before based on expected events
      • Compute \(t_k=\mathcal{I}_k/\mathcal{I}_K\)
      • Compute expected Cox coefficient \(\beta\) (\(=\log(AHR)\)) as before and set \(\theta(t_k)=\beta\)
  • We now have canonical form for the piecewise model

Group sequential design with spending bounds

Recalling assumptions

analysisTimes <- c(12, 20, 28, 36)
sampleSize <- 500

enrollRates

Stratum duration rate
All 12 41.7

failRates

Stratum duration failRate hr dropoutRate
All 4 0.046 1.0 0.001
All 100 0.046 0.6 0.001

Interim and final timing, effect size, information

ahr

Time AHR Events info info0
12 0.84 107.4 26.4 26.8
20 0.74 207.9 50.7 52.0
28 0.70 279.1 68.2 69.8
36 0.68 331.3 81.4 82.8

Information fraction for interim analyses

## [1] 0.3241690 0.6275343 0.8424726
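These information fractions appear to be based on expected event counts from the AHR() call above (an assumption on our part); they can be reproduced and stored for use as timing in the gsSurv() call below:

timing <- ahr$Events[1:3] / ahr$Events[4]
timing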

One-sided design

One-sided design with proportional hazards (gsDesign)

PH1sided <- gsDesign::gsSurv( # Derive Group Sequential Design
  k = 4, # Number of analyses (interim + final)
  test.type = 1, # use this for 1-sided testing
  alpha = 0.025, # 1-sided Type I error
  beta = 0.1, # Type II error (1 - power)
  timing = timing, # Information fraction for interims
  sfu = sfLDOF, # O'Brien-Fleming spending approximation
  lambdaC = failRates$failRate, # Piecewise control failure rates
  hr = ahr$AHR[4], # Used final analysis AHR
  eta = failRates$dropoutRate, # Piecewise exponential dropout rates
  gamma = enrollRates$rate, # Relative enrollment
  R = enrollRates$duration, # Duration of piecewise enrollment rates
  S = failRates$duration[1], # Duration of piecewise failure rates (K-1)
  T = max(analysisTimes), # Study duration
  minfup = max(analysisTimes) - sum(enrollRates$duration), # Minimum follow-up
  ratio = 1 # Experimental:Control randomization ratio
)

One-sided design with gsSurv()

Analysis Value Efficacy
IA 1: 32% Z 3.7670
N: 444 p (1-sided) 0.0001
Events: 97 ~HR at bound 0.4636
Month: 13 P(Cross) if HR=1 0.0001
P(Cross) if HR=0.68 0.0289
IA 2: 63% Z 2.6020
N: 444 p (1-sided) 0.0046
Events: 186 ~HR at bound 0.6828
Month: 21 P(Cross) if HR=1 0.0047
P(Cross) if HR=0.68 0.4999
IA 3: 84% Z 2.2209
N: 444 p (1-sided) 0.0132
Events: 250 ~HR at bound 0.7549
Month: 28 P(Cross) if HR=1 0.0146
P(Cross) if HR=0.68 0.7916
Final Z 2.0453
N: 444 p (1-sided) 0.0204
Events: 297 ~HR at bound 0.7885
Month: 36 P(Cross) if HR=1 0.0250
P(Cross) if HR=0.68 0.9000

One-sided design with gsSurv()

cat(summary(PH1sided))

One-sided group sequential design with 4 analyses, time-to-event outcome with sample size 444 and 297 events required, 90 percent power, 2.5 percent (1-sided) Type I error to detect a hazard ratio of 0.68. Enrollment and total study durations are assumed to be 12 and 36 months, respectively. Efficacy bounds derived using a Lan-DeMets O’Brien-Fleming approximation spending function with none = 1.

One-sided design with gs_design_ahr()

library(gsdmvn)
# Spending function setup
upar <- list(sf = gsDesign::sfLDOF, total_spend = 0.025)
NPH1sided <- gs_design_ahr(
  enrollRates = enrollRates,
  failRates = failRates,
  ratio = 1, alpha = .025, beta = 0.1,
  # Information fraction not required (but available!)
  analysisTimes = analysisTimes,
  # Function to enable spending bound
  upper = gs_spending_bound,
  # Spending function and parameters used
  upar = upar,
  # Lower bound fixed at -infinity
  lower = gs_b, # allows input of fixed bound
  # With gs_b, just enter values for bounds
  lpar = rep(-Inf, 4)
)

One-sided design with gs_design_ahr()

Analysis Bound Time N Events Z Probability AHR theta info info0
1 Upper 12 464.3 99.7 3.7670 0.0019 0.840 0.175 24.5 24.9
2 Upper 20 464.3 193.0 2.6020 0.3024 0.738 0.304 47.0 48.3
3 Upper 28 464.3 259.2 2.2209 0.7329 0.700 0.357 63.3 64.8
4 Upper 36 464.3 307.6 2.0453 0.9000 0.683 0.381 75.6 76.9
  • You will want to round up events and sample size!
  • Interim boundary crossing probability much lower than with PH bounds
    • IA 2 crossing probability 50% under PH
  • Sample size larger than for PH (N=444, 297 events)

Symmetric bounds

Symmetric bounds

  • Moving on to lower bounds
  • Common practice is to use binding upper and lower bounds
  • Here we use two one-sided tests for \(\alpha-\)spending
  • Upper bound \(b_k(\alpha)\) and lower bound \(a_k(\alpha)= - b_k(\alpha)\) defined so that \[ \begin{align} f(s_k,\alpha)-f(s_{k-1},\alpha) &= P_0\left(\{Z_{k}\geq b_{k}(\alpha)\}\cap_{j=1}^{k-1}\{-b_{j}(\alpha)< Z_{j}< b_{j}(\alpha)\}\right)\\ &= P_0\left(\{Z_{k}\le -b_{k}(\alpha)\}\cap_{j=1}^{k-1}\{-b_{j}(\alpha)< Z_{j}< b_{j}(\alpha)\}\right) \end{align} \]

Symmetric design with gsSurv()

Analysis Value Efficacy Futility
IA 1: 32% Z 3.7670 -3.7670
N: 444 p (1-sided) 0.0001 0.0001
Events: 97 ~HR at bound 0.4636 2.1569
Month: 13 P(Cross) if HR=1 0.0001 0.0001
P(Cross) if HR=0.68 0.0289 0.0000
IA 2: 63% Z 2.6020 -2.6020
N: 444 p (1-sided) 0.0046 0.0046
Events: 186 ~HR at bound 0.6828 1.4646
Month: 21 P(Cross) if HR=1 0.0047 0.0047
P(Cross) if HR=0.68 0.4999 0.0000
IA 3: 84% Z 2.2209 -2.2209
N: 444 p (1-sided) 0.0132 0.0132
Events: 250 ~HR at bound 0.7549 1.3246
Month: 28 P(Cross) if HR=1 0.0146 0.0146
P(Cross) if HR=0.68 0.7916 0.0000
Final Z 2.0453 -2.0453
N: 444 p (1-sided) 0.0204 0.0204
Events: 297 ~HR at bound 0.7885 1.2682
Month: 36 P(Cross) if HR=1 0.0250 0.0250
P(Cross) if HR=0.68 0.9000 0.0000

Symmetric design with gs_design_ahr()

library(gsdmvn)

# Spending function and parameters for both bounds
par <- list(sf = gsDesign::sfLDOF, total_spend = 0.025)
NPHsymmetric <- gs_design_ahr(
  enrollRates = enrollRates,
  failRates = failRates,
  ratio = 1, alpha = .025, beta = 0.1,
  # Information fraction not required (but available!)
  analysisTimes = analysisTimes,
  # Function to enable spending bound
  upper = gs_spending_bound,
  lower = gs_spending_bound,
  # Spending function and parameters used
  upar = par,
  lpar = par,
  binding = TRUE, # set lower bound to binding
  h1_spending = FALSE
)

Symmetric design with gs_design_ahr()

Analysis Bound Time N Events Z Probability AHR theta info info0
1 Upper 12 464.3 99.7 3.7670 0.0019 0.840 0.175 24.5 24.9
2 Upper 20 464.3 193.0 2.6020 0.3024 0.738 0.304 47.0 48.3
3 Upper 28 464.3 259.2 2.2209 0.7329 0.700 0.357 63.3 64.8
4 Upper 36 464.3 307.6 2.0453 0.9000 0.683 0.381 75.6 76.9
1 Lower 12 464.3 99.7 −3.7670 0.0000 0.840 0.175 24.5 24.9
2 Lower 20 464.3 193.0 −2.6020 0.0000 0.738 0.304 47.0 48.3
3 Lower 28 464.3 259.2 −2.2209 0.0000 0.700 0.357 63.3 64.8
4 Lower 36 464.3 307.6 −2.0453 0.0000 0.683 0.381 75.6 76.9

Asymmetric bounds

Asymmetric bounds

  • Use non-binding upper bound with spending function \(f_1(s,\alpha)\)
  • Set lower boundary crossing probabilities under the null hypothesis; may assume a binding lower bound with spending function \(f_2(s,\gamma)\) for some chosen \(0 < \gamma \le 1-\alpha\)
  • Set bounds to satisfy \[ \begin{align} f_1(s_k,\alpha)-f_1(s_{k-1},\alpha) &= P_0\left(\{Z_{k}\geq b_{k}(\alpha)\}\cap_{j=1}^{k-1}\{Z_{j}< b_{j}(\alpha)\}\right)\\ f_2(s_k,\gamma)-f_2(s_{k-1},\gamma) &= P_\theta\left(\{Z_{k}< a_{k}(\gamma)\}\cap_{j=1}^{k-1}\{a_{j}(\gamma)\le Z_{j}< b_{j}(\alpha)\}\right) \end{align} \]

\(\beta\)-spending bounds

  • Use non-binding upper bound with spending function \(f_1(s,\alpha)\)
  • Set lower boundary crossing probabilities under the alternate hypothesis \(\theta(t_k)\) (denoted \(\theta\) below)
  • Assume spending function \(f_2(s,\gamma)\) for \(\gamma\) set to Type II error; power \(=1-\gamma\)
  • Set bounds to satisfy

\[ \begin{align} f_1(s_k,\alpha)-f_1(s_{k-1},\alpha) &= P_0\left(\{Z_{k}\geq b_{k}(\alpha)\}\cap_{j=1}^{k-1}\{Z_{j}< b_{j}(\alpha)\}\right)\\ f_2(s_k,\gamma)-f_2(s_{k-1},\gamma) &= P_\theta\left(\{Z_{k}< a_{k}(\gamma)\}\cap_{j=1}^{k-1}\{a_{j}(\gamma)\le Z_{j}< b_{j}(\alpha)\}\right) \end{align} \]

  • Generally, sample size set so that \(a_K=b_K\)

\(\beta\)-spending design with gsSurv()

Analysis Value Efficacy Futility
IA 1: 32% Z 3.7670 -0.2503
N: 476 p (1-sided) 0.0001 0.5988
Events: 104 ~HR at bound 0.4767 1.0505
Month: 13 P(Cross) if HR=1 0.0001 0.4012
P(Cross) if HR=0.68 0.0338 0.0143
IA 2: 63% Z 2.6020 0.8440
N: 476 p (1-sided) 0.0046 0.1993
Events: 201 ~HR at bound 0.6922 0.8875
Month: 21 P(Cross) if HR=1 0.0047 0.8103
P(Cross) if HR=0.68 0.5385 0.0393
IA 3: 84% Z 2.2209 1.5151
N: 476 p (1-sided) 0.0132 0.0649
Events: 269 ~HR at bound 0.7626 0.8312
Month: 28 P(Cross) if HR=1 0.0144 0.9414
P(Cross) if HR=0.68 0.8185 0.0687
Final Z 2.0453 2.0453
N: 476 p (1-sided) 0.0204 0.0204
Events: 319 ~HR at bound 0.7953 0.7953
Month: 36 P(Cross) if HR=1 0.0225 0.9775
P(Cross) if HR=0.68 0.9000 0.1000

\(\beta\)-spending design with gs_design_ahr()

library(gsdmvn)
# Spending function setup
upar <- list(sf = gsDesign::sfLDOF, total_spend = 0.025)
lpar <- list(sf = gsDesign::sfHSD, total_spend = .1, param = -2)
NPHasymmetric <- gs_design_ahr(
  enrollRates = enrollRates,
  failRates = failRates,
  ratio = 1, alpha = .025, beta = 0.1,
  # Information fraction not required (but available!)
  analysisTimes = analysisTimes,
  # Function to enable spending bound
  upper = gs_spending_bound,
  lower = gs_spending_bound,
  # Spending function and parameters used
  upar = upar, lpar = lpar
)

\(\beta\)-spending design with gs_design_ahr()

Analysis Bound Time N Events Z Probability AHR theta info info0
1 Upper 12 501.8 107.8 3.7670 0.0021 0.840 0.175 26.5 26.9
2 Upper 20 501.8 208.6 2.6020 0.3318 0.738 0.304 50.9 52.2
3 Upper 28 501.8 280.1 2.2209 0.7660 0.700 0.357 68.5 70.0
4 Upper 36 501.8 332.5 2.0453 0.9000 0.683 0.381 81.7 83.1
1 Lower 12 501.8 107.8 −1.2899 0.0143 0.840 0.175 26.5 26.9
2 Lower 20 501.8 208.6 0.3054 0.0387 0.738 0.304 50.9 52.2
3 Lower 28 501.8 280.1 1.3340 0.0681 0.700 0.357 68.5 70.0
4 Lower 36 501.8 332.5 2.0453 0.1000 0.683 0.381 81.7 83.1

Design with interims at specified times

  • Not easily done with gsDesign::gsSurv() or the gsDesign package more generally
  • Futility only at interim 1
    • Look for p=0.05 in the wrong direction
  • Efficacy only AFTER interim 1
  • This is a variation on asymmetric design
  • We will use information fraction instead of calendar times of analysis

Design with interims at specified times

# Spending function setup
upar <- list(sf = gsDesign::sfLDOF, total_spend = 0.025)
lpar <- c(qnorm(.05), rep(-Inf, 3))
NPHskip <- gs_design_ahr(
  enrollRates = enrollRates,
  failRates = failRates,
  ratio = 1, alpha = .025, beta = 0.1,
  # Information fraction not required (but available!)
  analysisTimes = analysisTimes,
  # Upper spending bound
  upper = gs_spending_bound, upar = upar,
  # Skip first efficacy analysis
  test_upper = c(FALSE, TRUE, TRUE, TRUE),
  # Spending function and parameters used
  lower = gs_b, lpar = lpar
)

Design with interims at specified times

Analysis Bound Time N Events Z Probability AHR theta info info0
1 Lower 12 467.6 100.4 −1.6449 0.0060 0.840 0.175 24.7 25.1
2 Upper 20 467.6 194.4 2.5999 0.3057 0.738 0.304 47.4 48.6
3 Upper 28 467.6 261.0 2.2207 0.7359 0.700 0.357 63.8 65.3
4 Upper 36 467.6 309.8 2.0452 0.9000 0.683 0.381 76.1 77.5

Review of multiplicity issues

Type I error

All probabilities are under null hypothesis

  • Nominal p-value: probability of rejecting for a specific test conditioning on nothing else
  • Repeated p-value: \(\alpha\)-level at which a group sequential test would be rejected at a given analysis for a specific hypothesis
  • Sequential p-value: \(\alpha\)-level at which a group sequential test would be rejected for all analyses performed
    • Both sequential and repeated p-values can be computed from interim data
  • FWER-adjusted p-value: given a set of hypotheses with fixed and/or group sequential designs, the FWER at which an individual hypothesis would be rejected at one of the analyses performed

Other tests

Outline

We consider alternative tests for group sequential design.

  • Review fixed design
    • Weighted logrank test
    • MaxCombo test
    • Illustration using npsurvSS
  • Group sequential design with weighted logrank test
    • Under a given boundary
    • Boundary calculation
    • Illustration using gsdmvn
  • Group sequential design with MaxCombo test
    • Under a given boundary
    • Boundary calculation
    • Illustration using gsdmvn

Fixed design

For simplicity, we made a few key assumptions.

  • Balanced design (1:1 randomization ratio).
  • 1-sided test.
  • Local alternative: variance under null and alternative are approximately equal.
  • Accrual distribution: Piecewise uniform.
  • Survival distribution: piecewise exponential.
  • Loss to follow-up: exponential.
  • No stratification.
  • No cure fraction.

The fixed design part largely follows the concept described in Yung and Liu (2019).

Notation

  • \(\alpha\): Type I error
  • \(\beta\): Type II error; power is \(1 - \beta\)
  • \(z_\alpha\): upper \(\alpha\) percentile of standard normal distribution
  • \(z_\beta\): upper \(\beta\) percentile of standard normal distribution

We considered a 1-sided test with type I error at \(\alpha=0.025\) and \(1-\beta=80\%\) power.

z_alpha <- abs(qnorm(0.025))
z_alpha
## [1] 1.959964
z_beta <- abs(qnorm(0.2))
z_beta
## [1] 0.8416212

Sample size calculation

  • \(\theta\): effect size
  • \(n\): total sample size
  • \(Z\): test statistic, asymptotically normal
    • Under null hypothesis: \(Z \sim \mathcal{N}(0, \sigma_0^2)\)
    • Under alternative hypothesis: \(Z \sim \mathcal{N}(\sqrt{n}\theta, \sigma_1^2)\)

By assuming a local alternative, we have

\[\sigma_0^2 \approx \sigma_1^2 = \sigma^2\]

In this simplified case, the sample size can be calculated as

\[ n = \frac{4 (z_{\alpha}+z_{\beta})^{2}}{\theta^2} \]
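A quick check of this formula, interpreting \(\theta\) as \(-\log(\hbox{HR})\) as in the earlier sections (the constant 4 reflects 1:1 randomization); with a hazard ratio of 0.7, 1-sided \(\alpha=0.025\), and 80% power this gives roughly 247, matching the familiar Schoenfeld-type required event count for a time-to-event endpoint.

z_alpha <- qnorm(1 - 0.025)
z_beta <- qnorm(1 - 0.2)
theta <- -log(0.7) # theta interpreted as -log(HR)
4 * (z_alpha + z_beta)^2 / theta^2 # about 247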

Examples

Sample size calculation under non-proportional hazards

Fixed design with weighted logrank test

Fixed design with MaxCombo test

Group sequential design with weighted logrank test

Similar to the fixed design, we can define the test statistics for weighted logrank test using counting process formula

\[ Z_k=\sqrt{\frac{n_{0}+n_{1}}{n_{0}n_{1}}}\int_{0}^{t_k}w(t)\frac{\overline{Y}_{0}(t)\overline{Y}_{1}(t)}{\overline{Y}_{0}(t)+\overline{Y}_{1}(t)}\left\{ \frac{d\overline{N}_{1}(t)}{\overline{Y}_{1}(t)}-\frac{d\overline{N}_{0}(t)}{\overline{Y}_{0}(t)}\right\} \]

Note that the only difference from the fixed design is that the test statistic uses data accrued up to time \(t_k\) at the \(k\)-th interim analysis. A small weighted logrank sketch using simtrial follows.
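As flagged above, a hedged sketch: a single simulated trial analyzed with Fleming-Harrington weighted logrank tests, reusing enrollRates, failRates, and x <- simfix2simPWSurv(failRates) from the simulation code earlier in this training; the 300-event cut and the (rho, gamma) choices are illustrative only.

library(simtrial)
library(dplyr)

set.seed(2021)
oneTrial <- simPWSurv(
  n = 500,
  enrollRates = enrollRates,
  failRates = x$failRates,
  dropoutRates = x$dropoutRates
) %>%
  cutDataAtCount(300) # cut for analysis at 300 events

# Standard logrank FH(0, 0) and FH(0, 0.5), which down-weights early events
oneTrial %>%
  tensurv(txval = "Experimental") %>%
  tenFH(rg = tibble::tibble(rho = c(0, 0), gamma = c(0, 0.5)))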

Group sequential design with weighted logrank test

Group sequential design with MaxCombo test

Summary

Where to start for NPH?

  • Understand your control group
    • Get simple assumptions to approximate published results?
  • Understand AHR
    • Consider treatment effect that is clinically meaningful, but conservative
    • Consider a treatment effect delay that is intermediate
    • Understand likely enrollment pattern and dropout rate
    • Plot AHR, underlying survival differences and expected accrual of patients and events

Where to start for NPH?

  • Follow-up (trial duration)
    • Get to the good part of the AHR curve (it will plateau under the above assumptions)
    • Ensure follow-up is long enough to characterize the tail of the survival curve
    • Require both minimum follow-up and sufficient events for final analysis

Simplest design approach

  • Start with AHR from above evaluation
  • Design under proportional hazards assuming constant AHR
  • Ensure early futility bounds are not too aggressive
    • No futility (risk that data monitoring committee will overrule)
    • \(\beta\)-spending: be conservative for early bounds
    • Asymmetric 2-sided test; e.g., Pocock-like bound with moderate total boundary crossing probability
  • Efficacy bounds: O’Brien-Fleming always acceptable to regulators
  • This is easy to do with gsDesign Shiny app!

More advanced needs: gsdmvn

  • For primary analysis, may wish to stick with logrank test
    • Well-accepted by regulatory agencies
  • Futility bounds and power take into account changing effect over time
    • NPH will also impact expected timing of analyses
  • More easily allows interim timing based on calendar time
  • More options for setting boundaries
    • Fixed futility bounds
    • Haybittle-Peto bounds
    • Futility or efficacy bounds can be eliminated from selected analyses

Less conventional tests

  • RMST has been heavily promoted
    • For delayed effect, power is generally no better than logrank
    • However, some advocate these for primary analysis
  • Sensitivity analyses with better power assuming delayed effect
    • Weighted logrank with FH(0, 0.5) or Magirr-Burman test
    • MaxCombo test at final analysis only
  • There can be substantial power gains

Thank you!

  • Feedback and questions are welcome!
  • You may wish to submit issues at the GitHub repositories, but e-mail is also OK.


References

[1] Scharfstein, D. O., Tsiatis, A. A. and Robins, J. M. (1997). Semiparametric efficiency and its implication on the design and analysis of group-sequential studies. Journal of the American Statistical Association 92 1342–50.

[2] Jennison, C. and Turnbull, B. W. (2000). Group sequential methods with applications to clinical trials. Chapman; Hall/CRC, Boca Raton, FL.

[3] Lachin, J. M. and Foulkes, M. A. (1986). Evaluation of sample size and power for analyses of survival with allowance for nonuniform patient entry, losses to follow-up, noncompliance, and stratification. Biometrics 42 507–19.

[4] Schoenfeld, D. (1981). The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika 68 316–9.

[5] Mukhopadhyay, P., Huang, W., Metcalfe, P., Öhrn, F., Jenner, M. and Stone, A. (2020). Statistical and practical considerations in designing of immuno-oncology trials. Journal of Biopharmaceutical Statistics 1–7.

[6] Yung, G. and Liu, Y. (2020). Sample size and power for the weighted log-rank test and Kaplan-Meier based tests with allowance for nonproportional hazards. Biometrics 76 939–50.

[7] Karrison, T. G. and others. (2016). Versatile tests for comparing survival curves based on weighted log-rank statistics. Stata Journal 16 678–90.

[8] Kim, K. and Tsiatis, A. A. (1990). Study duration for clinical trials with survival response and early stopping rule. Biometrics 81–92.

[9] Lan, K. K. G. and DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika 70 659–63.

[10] Haybittle, J. (1971). Repeated assessment of results in clinical trials of cancer treatment. The British Journal of Radiology 44 793–7.

[11] Peto, R., Pike, M. C., Armitage, P., Breslow, N. E., Cox, D., Howard, S., Mantel, N., McPherson, K., Peto, J. and Smith, P. (1977). Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples. British Journal of Cancer 35 1–39.

[12] Wang, S. K. and Tsiatis, A. A. (1987). Approximately optimal one-parameter boundaries for group sequential trials. Biometrics 193–9.

[13] Pocock, S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika 64 191–9.

[14] O’Brien, P. C. and Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics 549–56.

[15] Slud, E. and Wei, L. (1982). Two-sample repeated significance tests based on the modified wilcoxon statistic. Journal of the American Statistical Association 77 862–8.

[16] Fleming, T. R., Harrington, D. P. and O’Brien, P. C. (1984). Designs for group sequential tests. Controlled Clinical Trials 5 348–61.

[17] Lan, K. K. G. and DeMets, D. L. (1989). Group sequential procedures: Calendar versus information time. Statistics in Medicine 8 1191–8.

[18] Gandhi, L., Rodrı́guez-Abreu, D., Gadgeel, S., Esteban, E., Felip, E., De Angelis, F., Domine, M., Clingan, P., Hochmair, M. J., Powell, S. F. and others. (2018). Pembrolizumab plus chemotherapy in metastatic non–small-cell lung cancer. New England Journal of Medicine 378 2078–92.

[19] Maurer, W. and Bretz, F. (2013). Multiple testing in group sequential trials using graphical approaches. Statistics in Biopharmaceutical Research 5 311–20.

[20] Downs, J. R., Beere, P. A., Whitney, E., Clearfield, M., Weis, S., Rochen, J., Stein, E. A., Shapiro, D. R., Langendorfer, A. and Gotto Jr, A. M. (1997). Design & rationale of the Air Force/Texas Coronary Atherosclerosis Prevention Study (AFCAPS/TexCAPS). The American Journal of Cardiology 80 287–93.

[21] Downs, J. R., Clearfield, M., Weis, S., Whitney, E., Shapiro, D. R., Beere, P. A., Langendorfer, A., Stein, E. A., Kruyer, W., Gotto Jr, A. M. and others. (1998). Primary prevention of acute coronary events with lovastatin in men and women with average cholesterol levels: Results of AFCAPS/TexCAPS. Journal of the American Medical Association 279 1615–22.

[22] Hwang, I. K., Shih, W. J. and De Cani, J. S. (1990). Group sequential designs using a family of Type I error probability spending functions. Statistics in Medicine 9 1439–45.

[23] White, W. B., Cannon, C. P., Heller, S. R., Nissen, S. E., Bergenstal, R. M., Bakris, G. L., Perez, A. T., Fleck, P. R., Mehta, C. R., Kupfer, S. and others. (2013). Alogliptin after acute coronary syndrome in patients with type 2 diabetes. New England Journal of Medicine 369 1327–35.

[24] White, W. B., Bakris, G. L., Bergenstal, R. M., Cannon, C. P., Cushman, W. C., Fleck, P., Heller, S., Mehta, C., Nissen, S. E., Perez, A. and others. (2011). EXamination of cArdiovascular outcoMes with alogliptIN versus standard of carE in patients with type 2 diabetes mellitus and acute coronary syndrome (EXAMINE): A cardiovascular safety study of the dipeptidyl peptidase 4 inhibitor alogliptin in patients with type 2 diabetes with acute coronary syndrome. American Heart Journal 162 620–6.

[25] Cohen, E. E., Soulières, D., Le Tourneau, C., Dinis, J., Licitra, L., Ahn, M.-J., Soria, A., Machiels, J.-P., Mach, N., Mehra, R. and others. (2019). Pembrolizumab versus methotrexate, docetaxel, or cetuximab for recurrent or metastatic head-and-neck squamous cell carcinoma (KEYNOTE-040): A randomised, open-label, phase 3 study. The Lancet 393 156–67.

[26] Miettinen, T. A., Pyörälä, K., Olsson, A. G., Musliner, T. A., Cook, T. J., Faergeman, O., Berg, K., Pedersen, T., Kjekshus, J. and the Scandinavian Simvastatin Survival Study Group. (1997). Cholesterol-lowering therapy in women and elderly patients with myocardial infarction or angina pectoris: Findings from the Scandinavian Simvastatin Survival Study (4S). Circulation 96 4211–8.

[27] Shitara, K., Özgüroğlu, M., Bang, Y.-J., Di Bartolomeo, M., Mandalà, M., Ryu, M.-H., Fornaro, L., Olesiński, T., Caglevic, C., Chung, H. C. and others. (2018). Pembrolizumab versus paclitaxel for previously treated, advanced gastric or gastro-oesophageal junction cancer (KEYNOTE-061): A randomised, open-label, controlled, phase 3 trial. The Lancet 392 123–33.

[28] Hernán, M. A. (2010). The hazards of hazard ratios. Epidemiology 21 13.

[29] Tsiatis, A. A. (1982). Repeated significance testing for a general class of statistics used in censored survival analysis. Journal of the American Statistical Association 77 855–61.

[30] Yung, G. and Liu, Y. (2019). Sample size and power for the weighted log-rank test and Kaplan-Meier based tests with allowance for nonproportional hazards. Biometrics.