Thursday, December 22, 2016

Control for Type I Error (or Adjustment for Multiplicity) for Secondary Endpoints

In clinical trial protocols, we usually specify one or more primary efficacy endpoints, then a list of secondary efficacy endpoints, and then tertiary or exploratory endpoints. It is standard that the primary efficacy endpoints are the ones on which the hypothesis testing (inferential statistics) and the sample size estimation are based. If we have more than one primary efficacy endpoint, we need to adjust for multiple tests, or multiplicity. It is also clear that the tertiary or exploratory endpoints are for hypothesis generation or, more bluntly, for fishing expeditions.

Questions we are asked very often are:
  • For secondary efficacy endpoints, do we need to do formal hypothesis testing?
  • If so, do we need to adjust for multiple tests and multiplicity?
  • What is the purpose of controlling the type I error for secondary efficacy endpoints?  

While controlling the family-wise error rate (FWER) for primary efficacy endpoints is necessary whenever a multiple-test situation exists, the need to control the FWER for secondary efficacy endpoints is often questioned.

Controlling the FWER for secondary efficacy endpoints is valuable, and in fact necessary, if we plan to include the secondary efficacy endpoints in the product label once the product is approved. In other words, if we would like to claim benefits based on the secondary efficacy endpoints, it is advisable to perform formal hypothesis testing with control of the FWER.

In FDA’s guidance for industry “Clinical Studies Section of Labeling for Human Prescription Drug and Biological Products — Content and Format”, there is a statement about the primary and secondary endpoints:
§   Primary and Secondary Endpoints: The terms primary endpoint and secondary endpoint are used so variably that they are rarely helpful. The appropriate inquiry is whether there is a well-documented, statistically and clinically meaningful effect on a prospectively defined endpoint, not whether the endpoint was identified as primary or secondary.
FDA does not care whether a study endpoint is called primary or secondary. However, if the information from these endpoints is used as supporting evidence and for the product label, the endpoints need to be predefined, and the tests for these endpoints need to be controlled for the overall alpha, that is, the overall type I error.

Even though FDA does not care about the terminology of primary/secondary endpoints, it is still very common in practice for clinical trials (especially industry-sponsored clinical trials) to specify primary, secondary, and exploratory endpoints.

In a presentation by Kathleen Fritsch from FDA, “Multiplicity Issues in FDA-Reviewed Clinical Trials”, the differences between primary, secondary, and exploratory efficacy endpoints are clearly specified, and it is stated that secondary efficacy endpoints may be included in the product label if the multiplicity issue is addressed.



Controlling the FWER for exploratory efficacy endpoints is rarely, if ever, needed.

In EMA’s guidance “Points to Consider on Multiplicity Issues in Clinical Trials”, adjustment for multiplicity is explicitly required if the secondary variables are used for additional claims.





A slide presentation regarding the EMA guidance further explains the use of secondary efficacy endpoints for claims.

In FDA’s guidance on “Clinical Investigations of Devices Indicated for the Treatment of Urinary Incontinence”, the secondary endpoints were called out:

                Secondary Endpoints
      FDA believes secondary endpoint measures, by themselves, are not sufficient to characterize fully the treatment benefit. However, these measures may provide additional characterization of the treatment effect. Specifically, secondary endpoints can:

  • supply background and understanding of the primary endpoints, in terms of overall direction and strength of the treatment effect;
  • be the individual components of a composite primary endpoint, if used;
  • include variables for which the study is underpowered to definitively assess;
  • aid in the understanding of the treatment’s mechanism of action;
  • be associated with relevant sub-hypotheses (separate from the major objective of the treatment); or
  • be used to perform exploratory analyses.

       Assuming that the primary safety and effectiveness endpoints of the study are successfully met, we recommend you analyze the secondary endpoints to provide supportive evidence concerning the safety and effectiveness of the device, as well as to support descriptions of device performance in the labeling. To minimize bias, your protocol should prospectively identify all secondary endpoints, indicating how the data will be analyzed and what success criteria will be applied.
 Secondary Endpoint Analyses
We recommend your protocol prospectively define the statistical plan for performing secondary endpoint analyses in the event that the primary endpoint analysis has been successfully met. If the secondary endpoint analyses are intended purely as exploratory analyses, or are not intended to support the indication for use or device performance, we recommend you submit only simple descriptions of the analyses. If, on the other hand, any of the secondary endpoint analyses are intended to support the indication for use or the performance of your device in the labeling (e.g., comparing treatment and control groups using p-values or confidence intervals), we recommend you pre-specify this intention in your study protocol and describe in detail the statistical methods you plan to follow.
In summary, if we do not perform hypothesis testing for the secondary endpoints, or if we do not have appropriate control of the overall type I error, we lose the chance to claim benefit on the secondary endpoints and to list them in the product label. Therefore, it is always a wise decision to perform the hypothesis testing for secondary endpoints with appropriate control of the type I error.
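
As an illustration of what "appropriate control" can look like, here is a minimal Python sketch of one possible strategy: the secondary endpoints are tested only if the primary endpoint is significant, and the family of secondary endpoints is then tested with Holm's step-down procedure at the full alpha. The function names, the endpoint names, and the choice of Holm's procedure are assumptions made for illustration; they are not taken from any of the guidance documents quoted above.

```python
def holm_reject(pvals, alpha=0.05):
    """Holm step-down: return one rejection decision per p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # smallest p first
    rejected = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank).
        if pvals[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break  # stop at the first non-significant test
    return rejected


def gatekept_secondary_tests(primary_p, secondary_pvals, alpha=0.05):
    """Test secondary endpoints only if the primary endpoint is significant.

    secondary_pvals is a dict {endpoint name: p-value} (hypothetical names).
    Returns (primary_significant, {endpoint name: significant}).
    """
    names = list(secondary_pvals)
    if primary_p > alpha:
        # The primary endpoint acts as a gatekeeper: no secondary claims.
        return False, {name: False for name in names}
    decisions = holm_reject([secondary_pvals[n] for n in names], alpha)
    return True, dict(zip(names, decisions))


if __name__ == "__main__":
    secondary = {"secondary_A": 0.02, "secondary_B": 0.04}
    print(gatekept_secondary_tests(0.01, secondary))
    print(gatekept_secondary_tests(0.20, secondary))
```

Whatever strategy is chosen, the key point from the guidance is that it is stated prospectively in the protocol.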

Tuesday, December 20, 2016

The 21st Century Cures Act and Innovations in Clinical Trials

After the bill passed the House and the Senate, President Obama signed it into law on December 13, 2016. The 21st Century Cures Act is now officially law.

If you have trouble finding the final version of the Cures Act, here is the one signed into law by President Obama.

H.R.34 - 21st Century Cures Act

NPR has a good summary of "who wins and who loses with the 21st Century Cures Act".
The 21st Century Cures Act will greatly benefit the NIH and NCI. Here is a recent article published in the New England Journal of Medicine regarding the NIH's perspective on the Act.

The 21st Century Cures Act — A View from the NIH


Some sections of this act are very relevant to the design and analysis of clinical trials. Section 3021 calls out adaptive designs and other novel clinical trial designs.

SEC. 3021. Novel clinical trial designs.
(a) Proposals for use of novel clinical trial designs for drugs and biological products.—For purposes of assisting sponsors in incorporating complex adaptive and other novel trial designs into proposed clinical protocols and applications for new drugs under section 505 of the Federal Food, Drug, and Cosmetic Act (21 U.S.C. 355) and biological products under section 351 of the Public Health Service Act (42 U.S.C. 262), the Secretary of Health and Human Services (referred to in this section as the “Secretary”) shall conduct a public meeting and issue guidance in accordance with subsection (b).
(b) Guidance addressing use of novel clinical trial designs.—
(1) IN GENERAL.—The Secretary, acting through the Commissioner of Food and Drugs, shall update or issue guidance addressing the use of complex adaptive and other novel trial design in the development and regulatory review and approval or licensure for drugs and biological products.
(2) CONTENTS.—The guidance under paragraph (1) shall address—
(A) the use of complex adaptive and other novel trial designs, including how such clinical trials proposed or submitted help to satisfy the substantial evidence standard under section 505(d) of the Federal Food, Drug, and Cosmetic Act (21 U.S.C. 355(d));
(B) how sponsors may obtain feedback from the Secretary on technical issues related to modeling and simulations prior to—
(i) completion of such modeling or simulations; or
(ii) the submission of resulting information to the Secretary;
(C) the types of quantitative and qualitative information that should be submitted for review; and
(D) recommended analysis methodologies.
(3) PUBLIC MEETING.—Prior to updating or issuing the guidance required by paragraph (1), the Secretary shall consult with stakeholders, including representatives of regulated industry, academia, patient advocacy organizations, consumer groups, and disease research foundations, through a public meeting to be held not later than 18 months after the date of enactment of this Act.
(4) TIMING.—The Secretary shall update or issue a draft version of the guidance required by paragraph (1) not later than 18 months after the date of the public meeting required by paragraph (3) and finalize such guidance not later than 1 year after the date on which the public comment period for the draft guidance closes.




















The act specifies that real world evidence can be used to support drug approval, where real world evidence means data regarding the usage, or the potential benefits or risks, of a drug derived from sources other than randomized clinical trials.

The act creates a new pathway for breakthrough medical devices, similar to the existing pathway for breakthrough drugs.

The act specifies the importance of biomarkers in the drug approval process, where the term biomarker “(A) means a characteristic (such as a physiologic, pathologic, or anatomic characteristic or measurement) that is objectively measured and evaluated as an indicator of normal biologic processes, pathologic processes, or biological responses to a therapeutic intervention; and
“(B) includes a surrogate endpoint.”

The act contains a section on "targeted drugs for rare diseases":

SEC. 3012. Targeted drugs for rare diseases.
Subchapter B of chapter V of the Federal Food, Drug, and Cosmetic Act (21 U.S.C. 360aa et seq.) is amended by inserting after section 529 the following:
“SEC. 529A. Targeted drugs for rare diseases.
“(a) Purpose.—The purpose of this section, through the approach provided for in subsection (b), is to—
“(1) facilitate the development, review, and approval of genetically targeted drugs and variant protein targeted drugs to address an unmet medical need in one or more patient subgroups, including subgroups of patients with different mutations of a gene, with respect to rare diseases or conditions that are serious or life-threatening; and
“(2) maximize the use of scientific tools or methods, including surrogate endpoints and other biomarkers, for such purposes.
“(b) Leveraging of data from previously approved drug application or applications.—The Secretary may, consistent with applicable standards for approval under this Act or section 351(a) of the Public Health Service Act, allow the sponsor of an application under section 505(b)(1) of this Act or section 351(a) of the Public Health Service Act for a genetically targeted drug or a variant protein targeted drug to rely upon data and information—
“(1) previously developed by the same sponsor (or another sponsor that has provided the sponsor with a contractual right of reference to such data and information); and
“(2) submitted by a sponsor described in paragraph (1) in support of one or more previously approved applications that were submitted under section 505(b)(1) of this Act or section 351(a) of the Public Health Service Act,
for a drug that incorporates or utilizes the same or similar genetically targeted technology as the drug or drugs that are the subject of an application or applications described in paragraph (2) or for a variant protein targeted drug that is the same or incorporates or utilizes the same variant protein targeted drug, as the drug or drugs that are the subject of an application or applications described in paragraph (2).
“(c) Definitions.—For purposes of this section—
“(1) the term ‘genetically targeted drug’ means a drug that—
“(A) is the subject of an application under section 505(b)(1) of this Act or section 351(a) of the Public Health Service Act for the treatment of a rare disease or condition (as such term is defined in section 526) that is serious or life-threatening;
“(B) may result in the modulation (including suppression, up-regulation, or activation) of the function of a gene or its associated gene product; and
“(C) incorporates or utilizes a genetically targeted technology;
“(2) the term ‘genetically targeted technology’ means a technology comprising non-replicating nucleic acid or analogous compounds with a common or similar chemistry that is intended to treat one or more patient subgroups, including subgroups of patients with different mutations of a gene, with the same disease or condition, including a disease or condition due to other variants in the same gene; and
“(3) the term ‘variant protein targeted drug’ means a drug that—
“(A) is the subject of an application under section 505(b)(1) of this Act or section 351(a) of the Public Health Service Act for the treatment of a rare disease or condition (as such term is defined in section 526) that is serious or life-threatening;
“(B) modulates the function of a product of a mutated gene where such mutation is responsible in whole or in part for a given disease or condition; and
“(C) is intended to treat one or more patient subgroups, including subgroups of patients with different mutations of a gene, with the same disease or condition.
“(d) Rule of construction.—Nothing in this section shall be construed to—
“(1) alter the authority of the Secretary to approve drugs pursuant to this Act or section 351 of the Public Health Service Act (as authorized prior to the date of enactment of the 21st Century Cures Act), including the standards of evidence, and applicable conditions, for approval under such applicable Act; or
“(2) confer any new rights, beyond those authorized under this Act or the Public Health Service Act prior to enactment of this section, with respect to the permissibility of a sponsor referencing information contained in another application submitted under section 505(b)(1) of this Act or section 351(a) of the Public Health Service Act.”.

























Sunday, December 11, 2016

Commonly Used Procedure for Multiplicity Adjustment: Fixed Sequence Procedure, Holm Step-down Procedure, Hochberg Step-up Procedure

In clinical trials, we often face a multiple testing, or multiplicity, issue when more than one hypothesis test is built into the same study and we want to claim trial success if at least one of the hypothesis tests is significant. For example, in an osteoporosis/breast cancer trial, there may be two endpoints:
  • Endpoint 1: Incidence of vertebral fractures
  • Endpoint 2: Incidence of breast cancer

We would like to claim success if at least one endpoint is significant. Similarly, in a trial with a low dose group, a high dose group, and a placebo control, we may want to claim success if either the low dose versus placebo comparison or the high dose versus placebo comparison is statistically significant. In both of these situations, adjustment for multiplicity must be employed.

On the other hand, not all studies with more than one hypothesis test need adjustment for multiplicity. Taking Alzheimer’s disease trials as an example, FDA guidance requires two endpoints:
  • Endpoint 1: Cognition endpoint (ADAS-Cog)
  • Endpoint 2: Clinical global scale (CIBIC plus)

and requires that both endpoints be significant in order to claim success. In this case, both hypotheses are tested at a significance level of 0.05 and no adjustment for multiplicity is needed.

In late phase clinical trials, if a multiplicity issue exists, adjustment for multiplicity must be built into the statistical analysis plan to avoid inflation of the family-wise type I error rate (usually controlled at 0.05 or 5%).
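
To make the inflation concrete, here is a small Python sketch. It assumes the tests are independent and that every null hypothesis is true, which is a simplification (endpoints in a real trial are usually correlated); the function names are hypothetical. With two unadjusted tests at 0.05 and a "win if at least one is significant" rule, the chance of a false claim is 1 - 0.95^2 = 0.0975 rather than 0.05.

```python
def fwer_at_least_one(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def error_rate_all_required(alpha: float, n_tests: int) -> float:
    """Chance that all n independent tests are falsely significant
    when every null hypothesis is true (the 'both must win' rule)."""
    return alpha ** n_tests


if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        print(
            f"{k} test(s): at least one significant = "
            f"{fwer_at_least_one(0.05, k):.4f}, "
            f"all significant = {error_rate_all_required(0.05, k):.6f}"
        )
```

When all endpoints must be significant, a false claim requires every test to pass at 0.05, so the chance of a false claim cannot exceed 0.05. This is why the Alzheimer's disease example needs no adjustment.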

Many different approaches have been proposed for handling the multiplicity issue. In a recent article by Wang et al (2015) “Overview of multiple testing methodology and recent development in clinical trials”, the following procedures were reviewed.


Multiple testing procedures for non-hierarchical hypotheses
  • Non-parametric or semi-parametric procedures
      • Bonferroni procedure
      • Simes procedure
      • Holm step-down procedure
      • Hochberg step-up procedure
      • Hommel procedure
  • Parametric procedures
      • Dunnett procedure

Multiple testing procedures on hierarchical hypotheses
  • Simple procedures for hierarchical hypotheses
      • Fixed-sequence procedure
      • Fallback procedure
  • Gatekeeping procedures
      • Serial gatekeeping procedures
      • Parallel gatekeeping procedure
      • Other extensions of gatekeeping procedures

Graphical approaches


In a presentation by Bretz and Xun, “Introduction to multiplicity in clinical trials”, at the IMPACT meeting, the multiple testing procedures for non-hierarchical hypotheses were organized by whether the procedure is single-step or stepwise and by whether or not correlations are taken into account.
  

 
They also made the following remarks:
  • Single-step methods are less powerful than stepwise methods and are not often used in practice
  • Accounting for correlations leads to more powerful procedures, but correlations are not always known
  • Simes-based methods are more powerful than Bonferroni-based methods, but control the FWER only under certain dependence structures
  • In practice, we select a procedure that is not only powerful from a statistical perspective, but also appropriate from a clinical perspective

For a specific clinical trial with a multiplicity issue, the choice of the procedure for multiplicity adjustment depends on the study design, on whether there is an order of clinical importance among the multiple hypothesis tests, and sometimes on whether there is prior evidence that one hypothesis test is more likely to be significant. For example, for a dose-response study, the Dunnett procedure or the step-down Dunnett procedure may be preferred. If a clinical trial has multiple sources of multiplicity (for example, multiple endpoints combined with different types of tests, such as superiority and non-inferiority), then a gatekeeping procedure may be preferred.


In industry clinical trials, some procedures are more commonly used than others because they are more powerful, that is, more likely to declare statistical significance. Typically, the clinical trial sponsor (the pharmaceutical or biotech company) would like to choose a more powerful procedure (such as the Hochberg procedure), while the regulatory side (such as FDA) would prefer a more conservative procedure (such as the Bonferroni or Holm procedure).

We are still waiting for FDA to issue its formal guidance on multiplicity issues. In the meantime, some procedures for handling the multiplicity issue are mentioned in therapeutic area specific guidance or in presentations by FDA statisticians. For example, CDRH’s guidance “Clinical Investigations of Devices Indicated for the Treatment of Urinary Incontinence” contains the following paragraph on dealing with the multiplicity issue when performing statistical tests for multiple secondary endpoints.

The primary statistical challenge in supporting the indication for use or device performance in the labeling is in making multiple assessments of the secondary endpoint data without increasing the type 1 error above an acceptable level (typically 5%). There are many valid multiplicity adjustment strategies available for use to maintain the type 1 error rate at or below the specified level, three of which are listed below:
  • Bonferroni procedure
  • Hierarchical closed test procedure
  • Holm’s step-down procedure
Because each of these multiplicity adjustment strategies involves balancing different potential advantages and disadvantages, we recommend you prospectively state the strategy that you intend to use. We recommend your protocol prospectively state a statistical hypothesis for each secondary endpoint related to the indication for use or device performance.


EMA has a guideline, “Points to consider on multiplicity issues in clinical trials”. The document was issued in 2002 and may be due for revision. It focuses mainly on when adjustment for multiplicity is needed and when it is not. It does not mention the procedures that could be used for multiplicity adjustment.

A recent paper by Sakamaki et al (2016), “Current practice on multiplicity adjustment and sample size calculation in multi-arm clinical trials: an industry survey in Japan”, revealed that the fixed sequence procedure, the gatekeeping procedure, and the Hochberg procedure are the most commonly used, and that the Holm procedure is rarely used.
 



Assume that there are two hypothesis tests, with the p-values for the two tests listed in the left column. Whether statistical significance can be claimed depends on which procedure is used for multiplicity adjustment. In this specific case, the Hochberg step-up procedure is more powerful than the other multiplicity adjustment procedures.




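Decision rules for two hypotheses with p-values p1 and p2 (overall alpha = 0.05):
  • Without any adjustment for multiplicity: compare p1 with 0.05; compare p2 with 0.05.
  • Bonferroni correction: significant if p1 < 0.025 or if p2 < 0.025.
  • Fixed sequence hierarchical procedure: if p1 < 0.05, compare p2 with 0.05; if p1 > 0.05, p2 will not be tested.
  • Holm step-down procedure: if min(p1, p2) < 0.025, then test whether max(p1, p2) < 0.05.
  • Hochberg step-up procedure: if max(p1, p2) < 0.05, then claim both groups are successful; if max(p1, p2) > 0.05, then test whether min(p1, p2) < 0.025.

Results (✓ = significance can be claimed for at least one hypothesis; x = no significance can be claimed):

p-values              | No adjustment | Bonferroni | Fixed sequence | Holm | Hochberg
p1 = 0.04, p2 = 0.03  | ✓             | x          | ✓              | x    | ✓
p1 > 0.05, p2 = 0.03  | ✓             | x          | x              | x    | x
p1 > 0.05, p2 = 0.02  | ✓             | ✓          | x              | ✓    | ✓
p1 = 0.04, p2 > 0.05  | ✓             | x          | ✓              | x    | x
p1 = 0.02, p2 = 0.02  | ✓             | ✓          | ✓              | ✓    | ✓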
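
The decision rules compared above can be sketched in Python for any number of hypotheses. This is a minimal illustration rather than a validated implementation: the function names are hypothetical, the comparisons use "less than or equal to" (the usual textbook convention), and the value 0.06 stands in for the cases where the example only says that a p-value is greater than 0.05.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def fixed_sequence(pvals, alpha=0.05):
    """Test in the prespecified order at full alpha; stop at the first failure."""
    rejected, still_testing = [], True
    for p in pvals:
        still_testing = still_testing and p <= alpha
        rejected.append(still_testing)
    return rejected


def holm(pvals, alpha=0.05):
    """Step-down: start from the smallest p-value, stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break
    return rejected


def hochberg(pvals, alpha=0.05):
    """Step-up: start from the largest p-value; at the first success,
    reject that hypothesis and all hypotheses with smaller p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    rejected = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (rank + 1):
            for j in order[rank:]:
                rejected[j] = True
            break
    return rejected


if __name__ == "__main__":
    # (p1, p2) pairs; 0.06 is an assumed stand-in for "p > 0.05".
    examples = [(0.04, 0.03), (0.06, 0.03), (0.06, 0.02), (0.04, 0.06), (0.02, 0.02)]
    for pvals in examples:
        print(pvals,
              "Bonferroni:", bonferroni(pvals),
              "Fixed sequence:", fixed_sequence(pvals),
              "Holm:", holm(pvals),
              "Hochberg:", hochberg(pvals))
```

For the pair (0.04, 0.03), Hochberg rejects both hypotheses while Bonferroni and Holm reject neither, which is the case where the step-up procedure gains power.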

  
References: