On Biostatistics and Clinical Trials: Commonly Used Procedure for Multiplicity Adjustment: Fixed Sequence Procedure, Holm Step-down Procedure, Hochberg Step-up Procedure

In clinical trials, a multiplicity (multiple testing) issue arises when more than one hypothesis test is built into the same study and we want to claim trial success if at least one of the tests is significant. For example, in an osteoporosis/breast cancer trial, there may be two endpoints:

Endpoint 1: Incidence of vertebral fractures
Endpoint 2: Incidence of breast cancer

We would like to claim success if at least one endpoint is significant. Similarly, in a trial with a low dose group, a high dose group, and a placebo control, we may want to claim success if either the low dose versus placebo or the high dose versus placebo comparison is statistically significant. In both of these situations, adjustment for multiplicity must be employed.

On the other hand, not every study with more than one hypothesis test needs an adjustment for multiplicity. Taking Alzheimer’s disease trials as an example, FDA guidance requires two endpoints

Endpoint 1: Cognition endpoint (ADAS-Cog)
Endpoint 2: Clinical global scale (CIBIC plus)

and requires that both endpoints be significant in order to claim success. In this case, both hypotheses are tested at a significance level of 0.05 and no adjustment for multiplicity is needed.

In late phase clinical trials, if a multiplicity issue exists, the adjustment for multiplicity must be built into the statistical analysis plan to avoid inflating the family-wise type 1 error rate (usually controlled at 0.05, or 5%).
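To make the inflation concrete, here is a minimal sketch in Python. It assumes the tests are independent and that every null hypothesis is true, which is a simplification since real endpoints are usually correlated. The function names are hypothetical and only for illustration.

```python
def fwer_at_least_one(alpha: float, m: int) -> float:
    """Chance of at least one false positive among m independent
    tests, each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def error_rate_all_required(alpha: float, m: int) -> float:
    """Chance that all m independent tests are falsely significant
    when every null hypothesis is true (success requires all of them)."""
    return alpha ** m


# Compare the two success criteria for a few family sizes (assumed values).
for m in (1, 2, 3, 5):
    print(m, round(fwer_at_least_one(0.05, m), 4), error_rate_all_required(0.05, m))
```

With two independent tests at 0.05, claiming success on either test gives a false positive rate of about 0.0975 instead of 0.05. When both tests must be significant, a false claim requires the test of every true null hypothesis to be significant, so the error rate cannot exceed 0.05, which is why no adjustment is needed in that setting.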

Many different approaches have been proposed for handling the multiplicity issue. In a recent article by Wang et al (2015) “Overview of multiple testing methodology and recent development in clinical trials”, the following procedures were reviewed.

Multiple testing procedures for non-hierarchical hypotheses	Non-parametric or semi-parametric procedures	Bonferroni procedure
		Simes procedure
		Holm step-down procedure
		Hochberg step-up procedure
		Hommel procedure
	Parametric procedures	Dunnett procedure

Multiple testing procedures on hierarchical hypotheses	Simple procedures for hierarchical hypotheses	Fixed-sequence procedure
		Fallback procedure
	Gatekeeping procedures	Serial gatekeeping procedures
		Parallel gatekeeping procedure
		Other extensions of gatekeeping procedures

Graphical approaches

In a presentation by Bretz and Xun, “Introduction to multiplicity in clinical trials”, at the IMPACT meeting, the multiple testing procedures for non-hierarchical hypotheses were organized by whether the test is single-step or stepwise and by whether or not correlations are taken into account.

They also made the following remarks:

· Single step methods are less powerful than stepwise methods and not often used in practice

· Accounting for correlations leads to more powerful procedures, but correlations are not always known

· Simes-based methods are more powerful than Bonferroni-based methods, but control the FWER only under certain dependence structures

· In practice, we select the procedure that is not only powerful from a statistical perspective, but also appropriate from a clinical perspective
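The difference between single-step and stepwise methods can be sketched in a few lines of Python. The sketch below is illustrative rather than a validated implementation: the function names are hypothetical, the comparisons use the conventional "less than or equal to" form, and the example p-values are made up. Bonferroni is the single-step method, Holm is the step-down method built on it, and Hochberg is the Simes-based step-up method, which controls the FWER only under certain dependence structures.

```python
from typing import List, Sequence


def bonferroni(pvals: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Single-step: every p-value is compared with alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm(pvals: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Step-down: start from the smallest p-value and stop at the
    first hypothesis that cannot be rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # smallest first
    reject = [False] * m
    for step, i in enumerate(order):
        if pvals[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break
    return reject


def hochberg(pvals: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Step-up: start from the largest p-value; at the first rejection,
    also reject every hypothesis with a smaller p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # largest first
    reject = [False] * m
    for step, i in enumerate(order):
        # The k-th largest p-value is compared with alpha / k.
        if pvals[i] <= alpha / (step + 1):
            for j in order[step:]:
                reject[j] = True
            break
    return reject


# Hypothetical p-values for three hypotheses.
example = [0.01, 0.02, 0.20]
print(bonferroni(example), holm(example), hochberg(example))
```

For the hypothetical p-values 0.01, 0.02 and 0.20, Bonferroni rejects only the first hypothesis, while Holm and Hochberg reject the first two. For two p-values of 0.04 and 0.03, only Hochberg rejects anything, because the larger p-value is already below 0.05.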

For a specific clinical trial with a multiplicity issue, the choice of adjustment procedure depends on the study design, on whether the hypothesis tests can be ordered by clinical importance, and sometimes on whether there is prior evidence that one hypothesis test is more likely to be significant. For example, for a dose-response study, the Dunnett procedure or the step-down Dunnett procedure may be preferred. If a trial has multiple sources of multiplicity (for example, multiple endpoints combined with different types of tests, such as superiority and non-inferiority), then a gatekeeping procedure may be preferred.
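When the hypotheses can be ordered in advance, for instance by clinical importance, the fixed-sequence procedure tests them in that order, each at the full significance level, and stops at the first non-significant result. The following Python sketch illustrates the idea; the function name and the p-values are hypothetical.

```python
from typing import List, Sequence


def fixed_sequence(ordered_pvals: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Test hypotheses in their pre-specified order at the full alpha.
    Once one test fails, no later hypothesis is tested."""
    reject = []
    stopped = False
    for p in ordered_pvals:
        if not stopped and p <= alpha:
            reject.append(True)
        else:
            stopped = True
            reject.append(False)
    return reject


# Hypothetical p-values, listed in the pre-specified testing order.
print(fixed_sequence([0.01, 0.20, 0.001]))
```

In this made-up example the third hypothesis is not rejected even though its p-value is very small, because the second test failed and testing stopped there. This is the price of spending no alpha on the ordering: the procedure is powerful only when the early hypotheses are the ones most likely to be significant.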

In industry clinical trials, some procedures are used more often than others because they are more powerful, that is, more likely to declare statistical significance. Typically, the trial sponsor (the pharmaceutical or biotech company) would like to choose a more powerful procedure (such as the Hochberg procedure), while the regulator (such as FDA) would prefer a more conservative procedure (such as the Bonferroni or Holm procedure).

We are still waiting for FDA to issue its formal guidance on multiplicity issues. In the meantime, some procedures for handling multiplicity are mentioned in therapeutic-area-specific guidance or in presentations by FDA statisticians. For example, CDRH’s guidance “Clinical Investigations of Devices Indicated for the Treatment of Urinary Incontinence” includes the following passage on handling multiplicity when performing statistical tests for multiple secondary endpoints.

The primary statistical challenge in supporting the indication for use or device performance in the labeling is in making multiple assessments of the secondary endpoint data without increasing the type 1 error above an acceptable level (typically 5%). There are many valid multiplicity adjustment strategies available for use to maintain the type 1 error rate at or below the specified level, three of which are listed below:

Bonferroni procedure

Hierarchical closed test procedure

Holm’s step-down procedure

Because each of these multiplicity adjustment strategies involves balancing different potential advantages and disadvantages, we recommend you prospectively state the strategy that you intend to use. We recommend your protocol prospectively state a statistical hypothesis for each secondary endpoint related to the indication for use or device performance.

EMA has a guideline, “Points to consider on multiplicity issues in clinical trials”. The document was issued in 2002 and may be due for revision. It focuses mainly on when adjustment for multiplicity is and is not needed, and does not mention the procedures that could be used for the adjustment.

A recent paper by Sakamaki et al (2016), “Current practice on multiplicity adjustment and sample size calculation in multi-arm clinical trials: an industry survey in Japan”, revealed that the fixed sequence procedure, the gatekeeping procedure, and the Hochberg procedure are the most commonly used and that the Holm procedure is rarely used.

Suppose there are two hypothesis tests, with the p-values shown in the left column of the table below. Whether statistical significance can be claimed depends on which procedure is used for multiplicity adjustment. In these scenarios, the Hochberg step-up procedure claims significance whenever the Bonferroni or Holm procedure does, and also in the first scenario, where they do not. The outcome of the fixed-sequence procedure depends on the pre-specified testing order (here p₁ is tested first).

Observed p-values	Without any adjustment for multiplicity	Bonferroni correction	Fixed-sequence (hierarchical) procedure	Holm step-down procedure	Hochberg step-up procedure
Decision rule	Compare p₁ with 0.05; compare p₂ with 0.05	Significant if p₁ < 0.025 or p₂ < 0.025	If p₁ < 0.05, compare p₂ with 0.05; if p₁ > 0.05, p₂ is not tested	If min(p₁, p₂) < 0.025, then test whether max(p₁, p₂) < 0.05	If max(p₁, p₂) < 0.05, claim both tests are significant; if max(p₁, p₂) > 0.05, test whether min(p₁, p₂) < 0.025
p₁ = 0.04, p₂ = 0.03	✓	x	✓	x	✓
p₁ > 0.05, p₂ = 0.03	✓	x	x	x	x
p₁ > 0.05, p₂ = 0.02	✓	✓	x	✓	✓
p₁ = 0.04, p₂ > 0.05	✓	x	✓	x	x
p₁ = 0.02, p₂ = 0.02	✓	✓	✓	✓	✓
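The two-hypothesis decision rules in the table can be written out directly. The Python sketch below is an illustration only: the function name is hypothetical, it uses strict "less than" comparisons to mirror the table, and it assumes p₁ belongs to the hypothesis tested first under the fixed-sequence procedure. Each entry is a pair (H₁ rejected, H₂ rejected).

```python
from typing import Dict, Tuple


def two_hypothesis_decisions(p1: float, p2: float,
                             alpha: float = 0.05) -> Dict[str, Tuple[bool, bool]]:
    """Apply each procedure to two p-values and report which of the
    two hypotheses is declared significant."""
    half = alpha / 2
    lo, hi = min(p1, p2), max(p1, p2)

    # Fixed sequence: H1 at full alpha; H2 is tested only if H1 is significant.
    fixed = (p1 < alpha, p1 < alpha and p2 < alpha)

    # Holm: smaller p-value at alpha/2; only then the larger one at alpha.
    holm = (p1 < alpha, p2 < alpha) if lo < half else (False, False)

    # Hochberg: larger p-value at alpha rejects both; otherwise the
    # smaller p-value is tested at alpha/2.
    hochberg = (True, True) if hi < alpha else (p1 < half, p2 < half)

    return {
        "unadjusted": (p1 < alpha, p2 < alpha),
        "bonferroni": (p1 < half, p2 < half),
        "fixed_sequence": fixed,
        "holm": holm,
        "hochberg": hochberg,
    }


# First scenario of the table: p1 = 0.04, p2 = 0.03.
for name, decision in two_hypothesis_decisions(0.04, 0.03).items():
    print(name, decision)
```

For p₁ = 0.04 and p₂ = 0.03, the Bonferroni and Holm procedures reject neither hypothesis, while the fixed-sequence and Hochberg procedures reject both. Rows of the table that only state a p-value as greater than 0.05 can be explored by plugging in any assumed value above 0.05.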

References:

Wang et al (2015) “Overview of multiple testing methodology and recent development in clinical trials”
Bretz and Xun (2014) “Introduction to multiplicity in clinical trials”
Alex Dmitrienko (2013) “Multiple Testing Procedures in Clinical Trials”
Huque and Röhmel (2010) “Multiplicity Problems in Clinical Trials: A Regulatory Perspective”
Alex Dmitrienko and Brian Millen (2011) “Multiple testing methodology in the context of subgroup analysis”
Thomas Permutt (2013) “Multiplicity in Regulatory Statistical Review”
Mohammad Huque et al (2013) “Multiplicity Issues in Clinical Trials With Multiple Objectives”

Sunday, December 11, 2016
