2 Your first design

In this chapter, you will quickly create a group sequential design and learn the basics of the web interface.

2.1 Overview

In this chapter you will quickly learn how to initiate the group sequential design web interface, edit input parameters specifying a group sequential design and view tables, text and plots summarizing the design. We also quickly demonstrate how to view, save and re-use code for your design. Finally, we show how to access background on the gsDesign web interface as well as demonstration videos.

2.1.1 Starting the web interface

The gsDesign web interface is currently hosted at https://gsdesign.shinyapps.io/prod/ on a host machine provided by Posit. If you go to this page using a web browser that is HTML5 compatible (most browsers if you are using the latest version), you are ready to go! You should be able to run this application on your computer, your tablet, or even your phone.

The interface will adapt to the device you use, as needed. One thing to note: if you leave your computer for a ‘long time’, your web page will partially gray-out. This means you need to reinitialize the web page; when this happens, parameters you have changed will be re-set to defaults. Thus, it is useful to work continuously on a design until you have finished and/or save your work.

Full-screen view of the gsDesign web interface. The gsDesign web interface is divided into input and output panels. Version information and controls for the input interface appear in the banner.

Figure 2.1: Full-screen view of the gsDesign web interface. The gsDesign web interface is divided into input and output panels. Version information and controls for the input interface appear in the banner.

2.1.2 Overall layout

The overall layout of the opening page of the web interface on a computer browser is seen in Image 2.1. Note that in the interface there is a “?” next to each control; if you hover over this, you get a brief description of the control.

There are five regions on this interface:

  1. In the top-left you see the branding gsDesign Explorer and four labels (Design, Update, Tutorial, About) that will take you to different interface pages.
  2. In the top-right, there is a control Display Settings that controls output format as well as controls to save and restore designs.
  3. The main left panel is for inputs.
  4. The main right panel is for displaying outputs.
  5. In the page footer you can see the application name and the version number.

The input panel on the left with tabs labelled “Sample size,” “Timing,” and “Boundaries” is used to control design parameters. The output panel has tabs labelled “Plot,” “Tabular,” “Text,” and “Report.” The output tabs allow you to view characteristics of a design in different ways. In the Report tab you can view and copy code that can be saved or used in R to reproduce a design; you can produce an html file with the report or save an R markdown file locally so you can edit and customize the report.

2.2 The input panel

  1. Sample size tab
  2. Timing tab
  3. Boundaries tab

2.2.1 Sample size tab

The sample size tab can be used to set design parameters for a fixed sample size design that are extended to a group sequential design in the timing and boundaries tabs. Design characteristics specified here are Type I error, which is always one-sided, design power and the endpoint type.

For your first design, leave the default values of 0.025 for 2.5% one-sided Type I error and 0.9 for 90% power.

Depending on the endpoint type selected different information is available for you to specify. For this first design, we begin by selecting the endpoint type “User”. You will see the following in your input panel:

Inputing user-specified fixed design sample size. Default input for user-defined endpoint type.

Figure 2.2: Inputing user-specified fixed design sample size. Default input for user-defined endpoint type.

Below “Endpoint type” you see three input controls. “Design type” allows you to select between “Superiority” and “Noninferiority” designs. We will come back to non-inferiority designs in . “Treatment effect” is a value of a parameter that quantifies the treatment benefit that you wish to power the trial for; i.e., a difference in blood pressure response to a control drug versus an experimental drug. Using this example, let’s say that we wish to make the sample size large enough so that if the experimental drug improves blood pressure, on average, by 5 mm more than the control drug we have a “strong chance” that a statistical test will demonstrate this. “Strong chance”, in this case, is set at 0.9 (or 90%) in the “Power” control. We also want to have a small chance if the experimental drug provides no better blood pressure lowering than control to have the same statistical test show a benefit; this is set to 0.025, (or 2.5%) in the “alpha (1-sided)” control. “alpha” is also known as Type I error, which you can see by pressing on the “?” to the right of “alpha (1-sided)”:

Most of the time you may have the interface compute a sample size for a fixed design for you. In this case, we assume you will use other software to find the sample size for a fixed design with no interim analyses. Assume this computation produces a sample size of 100. Enter this in the “N for fixed design control”. Your input panel should now appear as follows:

2.2.2 Timing tab

Now we are ready to add interim analyses by selecting the “Timing” tab. You can choose spacing between analyses that is “Equal” or “Unequal” in the “Interval spacing” control:

The default is equal spacing with two interim analyses. Selecting “Unequal” spacing. Lets opt for interim analyses after 35% and 70% of the data are available as shown in the next column. Since it often takes a while to collect data, more patients may enroll while you are preparing for an interim analysis. Thus, it is likely important to set timing so that analyses are performed before all patients have been enrolled.

2.2.3 Boundaries tab

Finally, we select the “Boundaries” tab and set both “Upper spending” and “Lower-spending” to “1-parameter” to obtain the panel displayed in the next column. There is a wide variety of options available in the “Boundaries” tab. Using the “Testing” control, you can select between bounds that are:

For now, we will stick with the default of “Asymmetric, nonbinding”, a good option for many trials where a new drug is compared to standard therapy or placebo control arm. For other options, see Chapter 6.

The “Upper spending” and “Lower spending” controls allow a wide variety of bounds to be selected for a design. The defaults shown here are reasonably conservative and a good starting point.

2.3 The output panel

  1. Initial view of the trial design
  2. Tabular summary of the design
  3. Text summary of the design
  4. Power plot
  5. Treatment effect plot
  6. Expected sample size plot

In this section we show selected output options that may be most useful to you.

2.3.1 Initial view of the trial design

After selecting the “Plot” output tab, your output panel should now show the plot above. This plot describes the timing and boundaries of the group sequential design you have just created. On the x-axis you can see that analyses are performed with sample sizes of 35, 75 and 107, respectively. Recall that the fixed design sample size was 100, so converting this to a group sequential design increased the sample size, in this case, by 7%. If you change the fixed design sample size to 1, you will see the group sequential sample size is 1.061. Whatever fixed design sample size you choose, the corresponding group sequential design with the given timing and boundaries will be 1.061 times the input sample size rounded up to the next integer.

On the y-axis you see “Normal critical value.” The group sequential designs created using the web interface all use the normal distribution for computing design properties. This has been shown to be a good approximation for many cases. Another term for a normal critical value is a \(Z\)-statistic, a term which we will use interchangeably with normal critical value in this book. If you are familiar with \(Z\)-statistics, you may recall that a value of 1.96 corresponds to a \(p\)-value of 0.05 (5%), 2-sided or 0.025 (2.5%), 1-sided. At the final analysis we see a \(Z\)-statistic labeled as 1.97, just slightly larger than this. Thus, if the trial proceeds to the final analysis, you will conclude that the experimental treatment is better than control if your final test statistic including all 107 patients enrolled is 1.97 or greater. In technical terms, you reject the null hypothesis of no treatment benefit of experimental treatment compared to control. If \(Z < 1.97\), you fail to reject the null hypothesis any treatment difference between the treatment groups. You may reach a conclusion earlier than this final analysis, which is a major advantage of a group sequential design. For instance, at the analysis of 35 patients, if \(Z \geq 3.65\) you can conclude that the new treatment is better than control at the first analysis. At this same analysis, if \(Z < -0.17\) you can stop the trial for futility, meaning that it is considered unlikely you would conclude experimental treatment better than control if you were to continue the trial.

2.3.2 Tabular summary of the design

The next figure shows a tabular summary of the design obtained by selecting the “Tabular” tab in the output panel. For each of the 2 interim analyses as well as the final analysis, timing information in the left-hand column (percent and number of patients to be included in the analysis). The 5 rows for each analysis show different characteristics of the efficacy and futility boundaries in the other columns:

  1. The \(Z\)-values (normal test statistic) required to cross the efficacy and futility bounds.
  2. Nominal, one-sided \(p\)-values; greater \(Z\) implies smaller nominal \(p\)-values for both the efficacy and futility bounds.
  3. “delta at bound” is an approximation of the observed treatment effect corresponding to the \(Z\)-value at each bound. This is also based on the “Treatment effect” you entered in the input panel; if you change 5 there to 50, you will see the treatment effect displayed here increase by a factor of 10. There is sometimes an expectation that in order to continue a trial past an interim analysis that there is a estimated treatment effect that is ‘promising.’ Also, expectations may include that stopping early for efficacy would correspond to a clinically compelling estimated treatment effect. In this case, crossing an early efficacy bound would require nearly twice the treatment effect that the trial is powered for (9.25 vs. 5) at interim 1 and more than the powered treatment effect at interim 2 (5.1 vs 5).
  4. The probability of crossing each bound under the null hypothesis (delta=0). These numbers are cumulative; i.e., the 0.0229 at the final analysis is the probability of crossing a futility bound at either of the interim analyses for the final analysis. The reason this is less than 0.025 (the study Type I error) is that the computation assumes the trial stops if the trial crosses a lower bound and cannot later cross an efficacy bound. Regulators generally prefer that you assume the trial will not stop if a futility bound is crossed and therefore an efficacy bound could later be crossed. This provides the sponsor of a trial the option of not obeying futility stopping rules and still controlling Type I error. If you change the “Testing” control in the “Boundaries” tab to “Asymmetric, binding,” this value will change to 0.025. You can see that there is a low probability of crossing an early efficacy bound at the interim analyses and most of the Type I error accumulates at the final analysis. The most important thing here is that there is often the expectation that there is a ‘reasonably high’ probability of crossing a futility bound when delta=0 at an interim analysis.
  5. The probability of crossing each bound under the alternate hypothesis. The most important information here is that there is a reasonably low probability of crossing the futility bound under the alternate hypothesis (\(\delta = 5\), in this case).

Note the textual summary underneath the table that describes the study design. This describes the maximum sample size, Type I error and power as well as the types of bounds and spending functions used.

2.3.3 Text summary of the design

The “Text” provides much of the same information that is in the “Tabular” tab. However, there is additional information that we explain in more detail in Chapter 7 when we discuss spending functions.

2.3.4 The power plot

If you select “Power” as the plot type in the output panel, you get the following plot:

The x-axis shows the underlying treatment effect. The solid black line shows study power by the underlying treatment effect. Thus, for \(\delta = 5\) we have 90% power, as specified; this is indicated by y-axis value of 0.9 for cumulative boundary crossing probability. This power includes the probability of crossing at either an interim or the final analysis (cumulative probability) and excludes the probability of crossing the futility bound at one analysis followed by later crossing the efficacy bound. The dotted black line shows the probability of crossing the efficacy bound at the first interim analysis, while the dashed black line shows the probability of crossing an efficacy bound at either the first or second interim. The red lines correspond to 1 minus the probability of crossing the futility bound at interim 1 (dotted line) or at or before the second interim (dashed line). Thus, the lines at any given value of \(\delta\) divide the possible ultimate outcomes of the trial into six mutually exclusive probabilities according to which boundary is crossed first: cross the futility bound at 1) interim 1, 2) interim 2 , 3) the final analysis, or cross the efficacy bound at the 4) interim 1, 5) interim 2, or 6) the final analysis. As \(\delta\) increases, we go from a relatively high probability of crossing a futility bound at an interim analysis to a high probability of crossing an efficacy bound either at an interim or at the final analysis.

2.3.5 The treatment effect plot

Selecting “Treatment effect” plot yields the following display:

As noted in the discussion of the tabular summary, these treatment effects are approximations based on the Z-value at each bound and the treatment effect you specified. This is a nice way to review the ‘clinical significance’ of crossing or not crossing any bound. Note that crossing the final efficacy bound corresponds approximately to an observed treatment effect of 2.94 as opposed the treatment effect of 5 that we powered the trial for. Checking this value may be a consideration in choosing the \(\delta\)-value you want to power the trial for.

2.3.6 The expected sample size plot

The “Expected sample size” plot shows the average number of patients analyzed when the study crosses a bound at either an interim or the final analysis.

This expected sample size value is largest when the underlying treatment effect is between 2.5 and 5, and lower for more extreme values where it is more likely to cross a study bound at an interim analysis. Note that the expected value does not include patients enrolled but not included at the time an interim analysis is performed.

2.3.7 Reproducible reports

The Report tab allows you to produce a report summarizing your design in one of two ways. Hit the Report tab and scroll to the bottom of the window to see the following buttons:

The first button will download an R markdown file with a mixture of text and code that can be used from the Posit desktop interface to produce a report. This source file enables you to add text and further code, as needed, to fully explain the design to an intended audience or just for documenting for your future self what you were thinking about when you created the design. If you are running R on a server, you may have the gsDesign package needed to publish the saved report already installed. If not, you can install the R package gsDesign from CRAN. The next figure shows the code generated for the current design. The R code chunk beginning with x <- generates the design and saves it in a variable named x. The “plot” command generates whichever plot you have selected from the “Plot” tab, while gsBoundSummary() and cat(summary(x)) provide the tabular and textual summaries you have seen.

The second button creates an html file by running the R markdown report and downloading for you. Pushing either of these buttons enables you review and copy R code so you can use in R as you please.

2.4 Saving and restoring your design

Sometimes, we want to continue our previous work in the web interface without setting up all the design parameters again. Fortunately, there is a feature for saving and restoring your design. The following tabs enable you to save, restore and report your work.

You can save the design by pressing the Save Design button in the upper-right corner. Give the saved design file a name and download the file. This step will capture the current state of the entire web interface (including all inputs) and store it in an .rds file. You can then restore the saved design from the .rds file by uploading it using the Restore Design button. The R serialized data format (.rds) files offer an efficient way for the persistent storage, transfer, and exchange of design checkpoints, enabling a reproducible and productive workflow when using the web interface.

2.5 The Update tab

The update tab allows you to update study bounds at the time of interim analysis based on differences from planned versus actual events for studies with a time-to-event endpoint or differences in planned versus actual sample size or statistical information for other endpoints. We continue with the first design example to demonstrate. If you do not have the first design example still available on your screen, you can download this .rds file, press the Restore Design button in the upper-right corner of the interface to restore the design. Then, switch to the Update tab in middle of the top row of the interface and you should see the following screen. If you scroll down, you will see the following

You will note in the lower-left-hand corner of the input screen labelled Bound Update Parameters that rather than integer sample sizes you have fractional numbers as this is what has been derived in the design. For the 3 analyses, enter 35, 70, and 110 in the boxes labelled N at analysis 1, 2, 3. Now switch to the Report tab and, at the bottom of the screen, press the Download Report (HTML) button. If you open the resulting HTML file in your browser and scroll down, you will come to the following.

If you come to the end of the trial and your data cutoff left you with less than the final planned sample size, the first bullet indicates that you would not spend the full \(\alpha\) for testing; to avoid this, you can change the Final analysis alpha spending strategy strategy to Full alpha. If you have more than the final planned sample size at the final analysis, it does not matter which option you choose for this. The second bullet under Update bounds at time of analysis says that you will use the default of the information fraction for spending at the interim analysis. The other option will be discussed later when we present further examples.

Finally, we look at the updated bounds displayed on the screen. For the original design, it appeared that we had integer sample sizes while we just saw that this was not the case when sample size was shown with more accuracy on the update design screen. Now, we have specified integer sample sizes for the updated design and bounds have changed slightly from the original bounds based on the input spending function and following the methods of Gordon Lan and DeMets6.

2.6 The Tutorial and About tabs

As of this writing, the “Tutorial” tab contains videos summarizing how to use this web interface. The first one is a useful 10-minute summary for getting started with the interface.

2.7 Summary

The output summaries shown in this chapter are likely the ones that you will find most useful. We will demonstrate other options in later chapters where additional issues are addressed. You now have a basic understanding of many of the input and output controls. We proceed to working with different endpoint types in the next two chapters.