Saturday, December 06, 2014

Clinical trial design for treatment of Ebola virus disease versus Ebola vaccine

The Ebola outbreak in West Africa has brought a lot of attention to this deadly virus. The world was unprepared for Ebola treatment and prevention. Now that developed countries, including the US, are starting to develop drugs for treating and preventing Ebola virus infection, discussions about Ebola drug trials have taken center stage.

First of all, there is a big distinction between developing drugs for treating Ebola virus disease and developing an Ebola vaccine. Many people discuss Ebola trials without clearly distinguishing treatment from prevention. In a Forbes article “FDA: Some Ebola Patients Need To Get Placebo”, the author clearly misinterpreted the original FDA paper and blurred the distinction between drugs for treating Ebola patients and the Ebola vaccine.

The ethical dilemma and the debate about randomized, controlled trials concern drug trials for treating Ebola patients, not Ebola vaccine trials. FDA’s article in the New England Journal of Medicine, “Evaluating Ebola Therapies — The Case for RCTs”, discussed clinical trial design issues for Ebola therapies (treating Ebola infected patients), not for Ebola vaccines (preventing Ebola infection).

So far, the majority of the clinical trials focus on the Ebola virus vaccine, not on Ebola treatment. There are 21 clinical trials registered. Except for 2 trials conducted in Ebola patients or Ebola virus infected patients, all of them are Ebola vaccine trials conducted in healthy volunteers.

The immediate need is to find effective therapies for treating Ebola virus infected patients. Down the road, developing effective vaccines to prevent Ebola virus infection (or at least to prevent similar outbreaks) is more important. Without any incentives, drug companies may be more interested in developing an Ebola vaccine because of its much greater market potential.

A Comparison of Ebola Therapy and Ebola Vaccine

|  | Ebola Therapy | Ebola Vaccine |
| --- | --- | --- |
|  | Treatment of Ebola virus infected patients | Prevention of Ebola virus infection |
| Target population | Ebola virus infected patients | General public or population at risk for Ebola infection (such as health care workers) |
| Study population | Ebola virus infected patients | Healthy volunteers |
| Efficacy measure | Survival rate (proportion of patients who survive at two weeks) | Immunogenicity (i.e., the occurrence or titer of anti-Ebola virus antibodies) |
| Control group | A placebo-controlled study is not feasible, but the best supportive care as the control group is feasible. | Placebo control is feasible. |
| Study endpoint | Survival is a hard endpoint. | Titer of antibodies is a surrogate, soft endpoint. The overall effectiveness is difficult to measure. |
| Tolerance for safety | Compared with the vaccine, there may be more tolerance in terms of safety. | The vaccine must be extremely safe since it will be used in healthy people. |

Specifically for clinical trials to find effective therapies for Ebola virus infected patients, the clinical trial design is at the center of the debate. Experts in European countries prefer clinical trials using a historical control (i.e., without a concurrent control group), while the US (FDA and NIH) prefers traditional randomized controlled trials with the best supportive care as the control group.

With a very high mortality rate, it is understandable to think that a randomized study controlled by supportive care or placebo is ridiculous. If there is a new experimental therapy with even a slim hope, people will jump on it, as discussed in the article “The Ethical Issues In Using An Experimental Ebola Drug”:
“the World Health Organization said in a statement today that it is ethical to offer unproven drugs to treat or prevent the spread of the Ebola virus
Under American law, the Food and Drug Administration can permit a drug manufacturer to provide an unapproved drug to patients if they don't have any alternatives and the consequences are severe. It's called "compassionate use" and most of these exceptions are granted when the drug is in a clinical trial testing its safety, proper dose and efficacy.
The most profound example of this comes from the 1980s, in the early days of the AIDS epidemic. There was no approved drug that had any effect, and people were dying. Dr. Anthony Fauci [director of the National Institute of Allergy and Infectious Diseases] was key to changing this approach, and expanding access to AZT outside of clinical trials. But this is different in that the drugs for the Ebola virus have not yet entered clinical trials in humans.”

Considering the high mortality rate and the lack of any proven therapy for Ebola virus disease, using a historical control seems to be an easy choice. With this design, all patients are given the experimental drug(s). If the survival rate in patients treated with the experimental drug(s) is higher than a fixed number (the historical control), the experimental treatment is considered effective. However, it all depends on how reliable the historical control is and whether the best supportive care has changed over time. A study without a supportive care arm can still be randomized and controlled: the concurrent control group is simply another experimental drug.
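As a rough illustration of the historical-control comparison described above, the observed number of survivors can be tested against a fixed historical survival rate with an exact one-sided binomial test, treating a higher survival rate as evidence of benefit. This is only a sketch: the historical rate, the number of treated patients, and the number of survivors below are hypothetical and do not come from any Ebola trial.

```python
from math import comb


def upper_tail_p_value(survivors: int, n: int, p0: float) -> float:
    """Exact one-sided binomial p-value: P(X >= survivors) when the true
    survival probability equals the fixed historical rate p0."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(survivors, n + 1)
    )


# Hypothetical numbers for illustration only (not from any real trial).
HISTORICAL_SURVIVAL = 0.30  # assumed fixed historical control rate
N_TREATED = 40              # assumed number of treated patients
N_SURVIVED = 20             # assumed number surviving to two weeks

p_value = upper_tail_p_value(N_SURVIVED, N_TREATED, HISTORICAL_SURVIVAL)
print(f"Observed survival rate: {N_SURVIVED / N_TREATED:.0%}")
print(f"One-sided exact p-value vs. historical rate: {p_value:.4f}")
```

The test is only as good as the fixed number it compares against. If supportive care has improved since the historical data were collected, a small p-value may reflect better care rather than an effective drug.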

It is generally agreed that clinical trials for Ebola virus disease treatment should have more than one arm and should be randomized and controlled. The European countries seem to prefer a study design in which multiple experimental therapies are compared with each other. The US (FDA and NIH) seems to prefer a study design in which the experimental therapy is compared with a concurrent control of the best supportive care. This can essentially be viewed as an add-on study design in which one group receives the best supportive care only and another group receives the best supportive care plus the experimental therapy. For the purpose of demonstrating the efficacy of the experimental therapy, this seems to be the most reasonable approach.
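To give a feel for what the add-on design would require, here is a minimal sample size sketch for comparing two survival proportions (best supportive care alone versus best supportive care plus the experimental therapy). It uses the standard normal-approximation formula with 1:1 allocation. The survival rates, alpha, and power are assumptions chosen for illustration, not values from any planned trial.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm for a two-sided test of two proportions
    (normal approximation, equal allocation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to the power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (p_control - p_treated) ** 2)


# Hypothetical two-week survival: 50% on supportive care alone,
# 70% on supportive care plus the experimental therapy.
print(n_per_arm(0.50, 0.70))
```

Under these assumptions, a smaller expected improvement over supportive care quickly drives the required number of patients up, which is part of why the choice of control group is debated so heavily.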

In the end, action is always better than debate. Let’s put aside the debate and start the clinical trials for Ebola treatment. The clinical trials for Ebola virus disease treatment (new therapies) have begun.
 “US scientists have not yet announced which treatments will be tested in clinical trials that they plan to run in the United States and, possibly, in Liberia. Doctors and researchers organizing the trials met at the US National Institutes of Health in Bethesda, Maryland, on 11 November.
"We had good discussions,” says Clifford Lane, deputy director for clinical research and special projects at the US National Institute of Allergy and Infectious Diseases in Bethesda. “We are working on refining our adaptive-design protocol with specific arms based upon those discussions.”
MSF says that the trials at its sites will test whether the interventions boost the proportion of patients who survive for two weeks. It hopes to report initial results from the trials as early as February 2015.
MSF said previously that none of the trials run at its sites will assign patients to receive standard of care treatment rather than an experimental intervention. Whether or not to use a standard of care control group in these trials is a thorny and hotly debated question. The US trials plan to use a control group, but have not made final decisions about the trial design.”
It is reported that the first Ebola treatment trial started in January 2015. The study, led by Dr Jake Dunning, has no concurrent control group and no randomization. Ebola virus positive patients are asked if they are willing to participate in the trial to receive the experimental treatment. If they decline, they receive the standard supportive care. Hopefully, the investigators will at least track the mortality rate in those who decline to participate.

The Ebola vaccine trials have also begun, and some of the studies have reported success (the vaccines were safe and generated an antibody response in healthy volunteers).

Monday, December 01, 2014

FDA's Priority Review Voucher Programs

A voucher is a bond of the redeemable transaction type which is worth a certain monetary value and which may be spent only for specific reasons or on specific goods. Examples include (but are not limited to) housing, travel, and food vouchers.

You may find it surprising that FDA has used vouchers as a tool to encourage drug development in certain areas.

In 2008, FDA issued its first voucher guidance, titled “Tropical Disease Priority Review Vouchers”. Last month, FDA published its second voucher-related guidance, a draft guidance called “Rare Pediatric Disease Priority Review Vouchers, Guidance for Industry”.

How does it work?
  • Sponsors must first have an NDA/BLA approved for an indication in a designated tropical disease area or a qualified rare pediatric disease area.
  • The sponsor then submits an application for a priority review voucher.
  • The sponsor may need to pay an additional application fee for the voucher.
  • Once the priority review voucher is approved, it can be sold and transferred to other sponsors.
  • The voucher can be redeemed for priority review of any NDA/BLA submission.
Both voucher programs are designed to provide incentives for drug developers to invest in neglected disease areas or in disease areas where the return on investment (ROI) is very low.
According to FDA's MAPP "Review Designation Policy: Priority (P) and Standard (S)", applications or supplements submitted with a priority review voucher will automatically receive a priority review designation.

The tropical disease priority review voucher guidance was issued in 2008, and the voucher has not been used often by sponsors. However, the tropical disease priority review voucher may find new popularity thanks to the global fight against Ebola. Ironically, at the time the Tropical Disease Priority Review Voucher guidance was issued, the deadly Ebola disease was not on the list of tropical disease areas.
Product applications for the prevention or treatment of the following tropical diseases may qualify:

• Tuberculosis
• Malaria
• Blinding trachoma
• Buruli Ulcer
• Cholera
• Dengue/Dengue haemorrhagic fever
• Dracunculiasis (guinea-worm disease)
• Fascioliasis
• Human African trypanosomiasis
• Leishmaniasis
• Leprosy
• Lymphatic filariasis
• Onchocerciasis
• Schistosomiasis
• Soil transmitted helminthiasis
• Yaws
• Any other infectious disease for which there is no significant market in developed nations and that disproportionately affects poor and marginalized populations, designated by regulation by the Secretary (section 524(a)(3))

One may argue that Ebola can be included under the last item, “any other infectious disease for which there is no significant market in developed nations and that disproportionately affects poor and marginalized populations, designated by regulation by the Secretary”.

To ensure that Ebola is included in the Tropical Disease Priority Review Voucher program and to remove the obstacles that keep the voucher program from becoming popular, a Senate bill has been proposed. The bill passed the Senate in November.

The newly issued guidance on the Rare Pediatric Disease Priority Review Voucher program seems to be better designed, and hopefully it will gain more popularity than the Tropical Disease Priority Review Voucher program. To avoid incentivizing sponsors to exclude adults affected by the rare pediatric disease from clinical trials, FDA expects adult patients to play a prominent role in the process. Sponsors remain eligible for a voucher if they include adult patients in clinical trials or seek an adult indication in addition to the primary pediatric indication. A qualified rare pediatric disease will most likely also qualify as an orphan disease. A sponsor may obtain Orphan Drug Designation to avoid paying the application fee for the voucher application.

How much is a priority review voucher worth?

The value of a priority review voucher is not entirely clear. There are very few transactions of a priority review voucher sold from one sponsor to another.
  • On 30 July 2014, BioMarin announced that it had sold its voucher to Sanofi and Regeneron for $67.5 million.
  • The Canadian pharmaceutical company Knight Therapeutics has reportedly sold its Neglected Tropical Disease Priority Review Voucher to Gilead Sciences for $125 million
  • On May 27, 2015, Retrophin sold their priority review voucher to Sanofi for $245 million
         Everything you need to know about priority review voucher

Monday, November 24, 2014

FDA's Position on Use of SI Units for Lab Tests

Previously, I wrote an article discussing SI units versus US conventional units. FDA has actually issued a position statement about these two unit systems:

CDER and CBER are evaluating an approach to transition to general acceptance of laboratory data in clinical trials that are measured and reported in Système International (SI) units instead of U.S. Conventional units. The objective is to establish an agency-wide policy on the acceptance of SI units in product submissions.
CDER and CBER recognize that SI units are the worldwide standard and international trials regularly measure and report lab tests using SI units. The Centers also acknowledge that the majority of U.S. healthcare providers are trained using U.S. conventional units. Lab results reported using U.S. conventional units often convey the most clinical meaning to U.S. healthcare providers, including CDER and CBER reviewers. In the absence of a holistic transition within the U.S. healthcare community to SI units, conversion of certain lab test results to U.S. conventional units may be a necessary interim step toward a transition to full SI unit reporting.
CDER and CBER are currently evaluating common and therapeutic area-specific lab tests to determine which pose significant interpretation risks during the review of new drug applications.  While this evaluation is underway, sponsors are strongly encouraged to solicit input from review divisions as early in the development cycle as possible to minimize the potential for conversion needs during NDA/BLA review. CDER and CBER encourage sponsors to discuss this issue with FDA before the start of Phase 3 trials.  In some cases the issue may warrant discussion with FDA at the End-of-Phase 2 meeting.
If conversion requests are received, sponsors are advised to discuss the conversion request as early as possible with the review division and if needed, provide a proposal for what can be reasonably accomplished to meet the review division’s needs without undue burden in time or costs.
October 25, 2013

For a specific clinical trial, it is prudent to ask the central laboratory to report the results in both units. It is not a big deal for the central laboratory to include results in both units in the data sets. For US sites, the lab reports may be in US conventional units, and for non-US sites, the lab reports may be in SI units. For data presentations (tables, listings, and figures), SI units may be used for international studies and US conventional units may be used in US-only studies.
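Carrying both units in a data set is mostly a matter of applying a fixed conversion factor per lab test. Below is a minimal sketch for three common tests. The dictionary layout and function names are my own illustration and not a CDISC or central laboratory convention. The factors shown are the commonly used ones (glucose from its molar mass of about 180.16 g/mol, creatinine 88.4), and any real study should use the central laboratory's validated conversion table.

```python
# Factor that multiplies a US conventional value to give the SI value.
# Layout and names are illustrative assumptions, not a standard.
CONVENTIONAL_TO_SI = {
    # test:       (conventional unit, SI unit, multiplication factor)
    "glucose":    ("mg/dL", "mmol/L", 1 / 18.016),
    "creatinine": ("mg/dL", "umol/L", 88.4),
    "hemoglobin": ("g/dL",  "g/L",    10.0),
}


def to_si(test: str, conventional_value: float) -> float:
    """Convert a US conventional result to SI units."""
    _, _, factor = CONVENTIONAL_TO_SI[test]
    return conventional_value * factor


def to_conventional(test: str, si_value: float) -> float:
    """Convert an SI result back to US conventional units."""
    _, _, factor = CONVENTIONAL_TO_SI[test]
    return si_value / factor


def report_both(test: str, conventional_value: float) -> str:
    """Format one result in both unit systems for a listing."""
    conv_unit, si_unit, _ = CONVENTIONAL_TO_SI[test]
    si_value = to_si(test, conventional_value)
    return (f"{test}: {conventional_value:g} {conv_unit} "
            f"= {si_value:.2f} {si_unit}")


print(report_both("glucose", 100))
print(report_both("creatinine", 1.0))
```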

Sunday, November 16, 2014

VALOR Trial - A Successful and Failed Phase III Study with Adaptive Sample Size Re-estimation for Promising Zone

Motivated by the search for innovative clinical trial methodologies that increase trial success and minimize trial cost, various adaptive design methods have been proposed. Initially, adaptive designs were used mostly in early phase (phase I or II) clinical trials. For phase III confirmatory clinical trials, traditional methods still dominate. Many publications about using adaptive designs in late stage trials are based on retrospective assessment or simulation: had the original studies been done with an adaptive design, how much cost would have been saved, or might a failed trial have been rescued? After many years of education and advocacy, adaptive designs with innovative methods have actually been implemented in phase III studies, and some of the trial results are starting to surface. One example is a trial called VALOR, a phase III, placebo-controlled, randomized, double-blind study in relapsed/refractory Acute Myeloid Leukemia (AML). The study adopted one of the key adaptive design features: Sample Size Re-estimation (SSR).

The rationale behind Sample Size Re-estimation is that the assumptions needed to design the confirmatory trial are either not entirely available or are available but with a high degree of uncertainty. This uncertainty could result in incorrect or inaccurate estimates of the sample size at the design stage. With Sample Size Re-estimation, an interim analysis can be performed in the middle of the study to re-check these assumptions. Depending on the findings from the interim analysis, a decision about the next step can be made.

In the VALOR study, the Sample Size Re-estimation was based on a Promising Zone approach. SSR based on the Promising Zone was proposed by Mehta and Pocock and described in their paper “Adaptive increase in sample size when interim results are promising: A practical guide with examples”. The general idea is to start a phase III trial based on optimistic assumptions (the best-case or a better-case scenario). The optimistic assumptions require a smaller sample size to start with and consequently less commitment of resources and finances in the beginning. During the study, an interim analysis is performed to check the reality and to plan the next step, with the following choices.

  • Stop early if overwhelming evidence of efficacy
  • Stop early for futility if low conditional power
  • Increase the sample size if results are promising

This can be illustrated in the diagram below. Notice that with this method, the sample size can only be increased, not decreased. The increase is one-time, preferably by a pre-specified fixed number.
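The interim decision among the choices listed above is usually driven by conditional power. The sketch below computes conditional power under the current trend for a fixed design, using the standard B-value formula, and then maps it to a zone. The zone cut-offs are hypothetical placeholders that I picked for illustration. They are not the thresholds from the Mehta and Pocock paper or from the VALOR protocol, and the sketch ignores the efficacy stopping boundary.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim: float, info_fraction: float,
                      alpha_one_sided: float = 0.025) -> float:
    """Conditional power under the current trend, assuming the final
    analysis is a fixed-sample test at the one-sided alpha level."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha_one_sided)
    # Estimated drift if the interim trend continues: z / sqrt(t).
    drift = z_interim / sqrt(info_fraction)
    return 1 - _STD_NORMAL.cdf((z_crit - drift) / sqrt(1 - info_fraction))


def classify_zone(cp: float, futility_below: float = 0.10,
                  promising_from: float = 0.30,
                  favorable_from: float = 0.80) -> str:
    """Map conditional power to a zone. All cut-offs are hypothetical."""
    if cp < futility_below:
        return "futility"      # consider stopping early
    if cp < promising_from:
        return "unfavorable"   # continue without changing the sample size
    if cp < favorable_from:
        return "promising"     # one-time sample size increase
    return "favorable"         # continue as planned


# Hypothetical interim look at half of the planned information.
cp = conditional_power(z_interim=1.5, info_fraction=0.5)
print(f"Conditional power: {cp:.2f}, zone: {classify_zone(cp)}")
```

The sample size is increased only in the middle band, which is why the adjustment can go up but never down.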

Since the VALOR study was initiated in December 2010, the SSR method with the Promising Zone approach has been widely followed in the statistical community and has been a topic in many adaptive design discussions. See the presentation by Zoran Antonijevic, "Harvard Catalyst Adaptive Clinical Trials Case Study - The VALOR Trial for AML". There is also a YouTube video titled "The Phase 3 VALOR Trial: Adaptive Sample Size Re-estimation".

Cytel Inc. has built the SSR with Promising Zone approach into its EAST software for study design. Cytel advocates that adaptive sample size re-estimation in EAST reduces risk and enhances clinical trial success. With the Promising Zone SSR method, an adaptive design can:
  • DE-RISK INVESTMENT – Avoid expensive up-front commitments of sample size
  • ENHANCE SUCCESS – Boost power when initial assumptions fail
  • PROMISING ZONE™ – Increase sample size conditional on interim data
  • ALPHA CONTROL – Guarantee strong type I error control required by regulators

Had the VALOR study achieved statistical significance on the primary efficacy endpoint, it would have been a wonderful story of how the Promising Zone SSR method had de-risked the investment and enhanced success.

Unfortunately, after all of these extra efforts (adaptive design, DSMB, interim analysis, sample size re-estimation), the study failed to reach statistical significance for the primary efficacy endpoint. The p-value just missed the magic number of 0.05. Here is the announcement from the VALOR study sponsor, Sunesis Pharmaceuticals:
Sunesis Announces Results From Pivotal Phase 3 VALOR Trial of Vosaroxin and Cytarabine in Patients With First Relapsed or Refractory Acute Myeloid Leukemia
“Sunesis Pharmaceuticals, Inc. (Nasdaq:SNSS) today announced results from the pivotal Phase 3 VALOR trial, a randomized, double-blind, placebo-controlled trial of vosaroxin and cytarabine in patients with first relapsed or refractory acute myeloid leukemia (AML). At more than 100 leading international sites, the trial enrolled 711 patients, who were stratified for age, geography and disease status. The trial did not meet its primary endpoint of demonstrating a statistically significant improvement in overall survival, with a median overall survival of 7.5 months for vosaroxin and cytarabine compared to 6.1 months for placebo and cytarabine (HR=0.865, p=0.06).”
Additional details about the trial design are coming to the surface. See the screen shot from the Sunesis presentation:

The study was planned based on the most optimistic assumption (i.e., HR=0.71), and the sample size re-estimation was based on the most conservative assumption at that time (i.e., HR=0.80). Unfortunately, the actual result of HR=0.865 was worse than even the most conservative assumption of HR=0.80. It would be interesting to know exactly what the HR was at the interim analysis.
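To see why the assumed hazard ratio matters so much, the number of deaths required for a log-rank test can be approximated with Schoenfeld's formula. The sketch below assumes 1:1 randomization, a two-sided alpha of 0.05, and 90% power. These are my assumptions for illustration, and the outputs are not the event counts from the VALOR protocol.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio: float, alpha_two_sided: float = 0.05,
                    power: float = 0.90) -> int:
    """Approximate number of events for a log-rank test with 1:1
    allocation (Schoenfeld's formula)."""
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha_two_sided / 2) + z(power)
    return ceil(4 * z_sum**2 / log(hazard_ratio) ** 2)


# The three hazard ratios mentioned above: planning assumption,
# re-estimation assumption, and the observed result.
for hr in (0.71, 0.80, 0.865):
    print(f"HR = {hr}: about {required_events(hr)} events")
```

The required number of events grows rapidly as the hazard ratio approaches 1. A treatment effect weaker than the re-estimation assumption leaves even the enlarged trial underpowered.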

I guess that Sunesis and Cytel are now analyzing the data to search for clues as to why the study did not meet the primary endpoint. It is quite possible that the study conduct or the patient population differed before and after the interim analysis. While the study team was strictly blinded to the details of the interim analysis results, the decision on whether or not to increase the sample size had to be announced. This announcement could have had an impact on patient characteristics or the conduct of the study. Below is a discussion from that time about the announcement of the sample size increase after the interim analysis. It is clear that the announcement had some impact on financial analysts, and it potentially also had some impact on the study team and investigators.

Sunesis Pharmaceuticals to Implement One-Time Sample Size Increase to Phase 3 VALOR Trial in AML
When last September the Data and Safety Monitoring Board (DSMB) recommended expanding the sample size of the study based on interim data that suggested a "promising" outcome, vosaroxin garnered even more investor attention.
Valor Trial Design And Alpha Spend
At the analyst meeting in October 2012, Sunesis provided an update on the adaptive design of the study that allows for a potential one-time sample size increase of the patient population. Based on its review, the DSMB recommended the Valor study increase the sample size to 675 patients for a 90% statistical power to detect a 30% overall survival difference (5 months versus 6.5 months) with an HR of 0.77. The DSMB concluded that the interim data indicated a "promising" outcome - ruling out futility and an "unfavorable" scenario, but falling short of a "favorable" result.
Based on the nuances of statistical analysis, ruling out both favorable and unfavorable scenarios for a promising outcome strongly suggests that vosaroxin was closer to non-inferiority and in need of a larger sample size in order to show a statistically significant treatment difference. It was a smart idea by management to utilize the first interim analysis of Valor as a proxy for a randomized Phase 2 study whereby it could better estimate the sample size needed to demonstrate a clinical effect. Powering the study has thus been the main factor in influencing its "promising" outcome

The VALOR study was a well-conducted study. From the standpoint of study implementation, including the sample size re-estimation, the study was a success. However, it failed to reach statistical significance for the primary efficacy endpoint.

In the end, statistics is about uncertainty. While sample size re-estimation can reduce the uncertainty to some degree, it cannot eliminate it. We will never be able to design a study that guarantees success.

Saturday, November 01, 2014

Standard of Care (SOC) as Control Group in Clinical Trials

For randomized, controlled clinical trials, the selection of the control group is one of the key issues in the study design. This is why ICH has a specific guideline (E10) for “CHOICE OF CONTROL GROUP AND RELATED ISSUES IN CLINICAL TRIALS”. The choice of the control group will decide whether or not the trial is a superiority or non-inferiority study, double-blinded/single-blinded/open label, and will decide the sample size.

It has become pretty common for the Standard of Care (SOC) to be chosen as the control group. We often run into the situation where, for a specific disease (indication), there is no regulatory-approved therapy (existing therapy) and it is not ethical to conduct a placebo-controlled study. Comparing the experimental therapy with the Standard of Care then seems to be the only choice.

What is the Definition of the SOC?

There is no standard definition of SOC in regulatory guidelines. According to Webster’s New World Medical Dictionary, SOC is defined as “the level at which the average, prudent provider in a given situation would manage the patient’s care under the same or similar circumstances.”

From National Cancer Institute: “standard of care” is defined as “treatment that experts agree is appropriate, accepted, and widely used. Also called best practice, standard medical care,  and standard therapy.”

There are more definitions, but all similar.
“A standard of care is a formal diagnostic and treatment process a doctor will follow for a patient with a certain set of symptoms or a specific illness. That standard will follow guidelines and protocols that experts would agree with as most appropriate, also called "best practice."
In legal terms, a standard of care is used as the benchmark against a doctor's actual work. For example, in a malpractice lawsuit, the doctor's lawyers would want to prove that the doctor's actions were aligned with the standard of care. The plaintiff's lawyers would want to show how a doctor violated the accepted standard of care and was therefore negligent.”
Standards of care are developed in a number of ways: sometimes they simply develop over time, and in other cases they are the result of clinical findings. In the modern era, SOCs are typically based on evidence-based medicine: the results of clinical trials, meta-analyses when there are multiple clinical trials, and Cochrane systematic reviews of the evidence. A SOC may be issued as recommendations and treatment guidelines by professional societies. There are many treatment guidelines from different professional societies and different countries. A couple of examples are listed below:

  • National Comprehensive Cancer Network guidelines
  • Evidence-based guideline: Intravenous immunoglobulin in the treatment of neuromuscular disorders

Does A Standard of Care therapy have to be approved by regulatory authority (such as FDA)?

Not necessarily. As a matter of fact, some SOCs may not be regulated by FDA at all. For example, surgery and plasma exchange are techniques and procedures that may not fall under FDA regulation.

In FDA’s guidance  “Expedited Programs for Serious Conditions – Drugs and Biologics”, SOC was discussed as part of the discussions for ‘available therapy’. The guidance states:

“For purposes of this guidance, FDA generally considers available therapy (and the terms existing treatment and existing therapy) as a therapy that:
  • Is approved or licensed in the United States for the same indication being considered for the new drug and
  • Is relevant to current U.S. standard of care (SOC) for the indication
 FDA’s available therapy determination generally focuses on treatment options that reflect the current SOC for the specific indication (including the disease stage) for which a product is being developed. In evaluating the current SOC, FDA considers recommendations by authoritative scientific bodies (e.g., National Comprehensive Cancer Network, American Academy of Neurology) based on clinical evidence and other reliable information that reflects current clinical practice. When a drug development program targets a subset of a broader disease population (e.g., a subset identified by a genetic mutation), the SOC for the broader population, if there is one, generally is considered available therapy for the subset, unless there is evidence that the SOC is less effective in the subset.
 Over the course of new drug development, it is foreseeable that the SOC for a given condition may evolve (e.g., because of approval of a new therapy or new information about available therapies). FDA will determine what constitutes available therapy at the time of the relevant regulatory decision for each expedited program a sponsor intends to use (e.g., generally early in development for fast track and breakthrough therapy designations, at time of biologics license application (BLA) or new drug application (NDA) submissions for priority review designation, during BLA or NDA review for accelerated approval). FDA encourages sponsors to discuss available therapy considerations with the Agency during interactions with FDA.
 As appropriate, FDA may consult with special Government employees or other experts when making an available therapy determination.”

The newly issued FDA Guidance on Available Therapy echoes a similar opinion:
“available therapy (and the terms existing treatments and existing therapy) should be interpreted as therapy that is specified in the approved labeling of regulated products, with only rare exceptions.
 FDA recognizes that there are cases where a safe and effective therapy for a disease or condition exists but it is not approved for that particular use by FDA. However, for purposes of the regulations and policy statements described in Section III, which are intended to permit prompt FDA approval of medically important therapies, only in exceptional cases will a treatment that is not FDA-regulated (e.g., surgery) or that is not labeled for use but is supported by compelling literature evidence (e.g., certain established oncologic treatments) be considered available therapy.”
The FDA guidance “Non-Inferiority Clinical Trials” addresses whether the active comparator for a non-inferiority study can be a product that is not labeled for the indication being studied. The active comparator could be a SOC.

“Can a drug product be used as the active comparator in a study designed to show non-inferiority if its labeling does not have the indication for the disease being studied, and could published reports in the literature be used to support a treatment effect of the active control?
 The active control does not have to be labeled for the indication being studied in the NI study, as long as there are adequate data to support the chosen NI margin. FDA does, in some cases, rely on published literature and has done so in carrying out the meta-analyses of the active control used to define NI margins. An FDA guidance for industry on Providing Clinical Evidence of Effectiveness for Human Drug and Biological Products describes the approach to considering the use of literature in providing evidence of effectiveness, and similar considerations would apply here. Among these considerations are the quality of the publications (the level of detail provided), the difficulty of assessing the endpoints used, changes in practice between the present and the time of the studies, whether FDA has reviewed some or all of the studies, and whether FDA and the sponsor have access to the original data. As noted above, the endpoint for the NI study could be different (e.g., death, heart attack, and stroke) from the primary endpoint (cardiovascular death) in the studies if the alternative endpoint is well assessed”

How Standard are the Standards of Care?

It depends on the specific disease area and the available treatment. A standard of care in one country or one hospital may not be the standard in another. Further, one doctor's standard can vary from another doctor's. In many cases, even when the same therapy is considered the standard of care, the way it is used may be quite different. For example, tPA is considered a standard of care in the US for treating leg attack (peripheral arterial occlusion). However, different medical centers and different doctors may give tPA therapy differently: the differences are reflected in the total tPA dose, bolus versus continuous infusion, the infusion rate, and the total length of tPA treatment.

The heterogeneity of the standard of care presents great challenges in conducting clinical trials that use the standard of care as the control group. This issue was extensively discussed in FDA’s guidance on Chronic Cutaneous Ulcer and Burn Wounds — Developing Products for Treatment. In a multi-national clinical trial with the standard of care as the control group, the challenges will be even greater, and the trial may not be feasible at all, because of the difficulty of defining the SOC for a specific disease. Here are the paragraphs from FDA’s guidance on using the standard of care as the control group.
“Standard care refers to generally accepted wound care procedures, other than the investigational product, that will be used in the clinical trial. Good standard care procedures in a wound-treatment product trial are a prerequisite for assessing safety and efficacy of a product. Since varying standard care procedures can confound the outcome of a clinical trial, it is generally advisable that all participating centers agree to use the same procedures and these procedures are described within the clinical protocol. If it is not practical to apply uniform standard care procedures across study centers, randomization stratified by study center should be considered. It is also important that the sample size within study centers and wound care records be adequate to assess the effect of wound care variation.
A number of standard procedures for ulcer and burn care are widely accepted. Several professional groups have initiated development of care guidelines for ulcers and burns. The Agency does not require adherence to any specific guidelines, the basic principle being that standard care regimens in wound-treatment product trials should optimize conditions for healing and be prospectively defined in the protocol. The rationale for the standard care chosen should be included in the protocol, and the study plan should be of sufficient detail for consistent and uniform application across study centers. Case report forms (CRFs) should be designed such that, at each visit, investigators describe the type of ulcer or burn care actually delivered (e.g., extent of debridement, use of concomitant medications). For outpatients, the CRF should also capture compliance with standard care measures, including wound dressing, off-loading, and appropriate supportive factors, such as dietary intake.
The value of study site consistency in standard care regimens within a trial cannot be over-emphasized because of the profound effects these procedures have on clinical outcome for burns and chronic wounds. Consistency in standard care regimens is important for minimizing variability and allowing assessment of treatment effect. It may be reasonable to evaluate a single standard care regimen in early trials to minimize this variability. If comparison of an investigational product to more than one commonly used standard care option is desired, the overall development plan should include specific assessment of the effect of these standard care options on the experimental treatment. These common options should be identified and addressed prospectively in clinical trial design including being clearly described in the clinical protocol and compliance captured via the CRFs; criteria for data poolability should be defined prospectively. Every attempt should be made to minimize deviations from the procedures described in the protocol and subject compliance recorded in CRFs. If more than one standard care regimen is used in the same clinical trial, then randomized treatment allocation within strata defined by these options in standard care is important.”
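The guidance's recommendation to randomize within strata (study center, and standard care regimen when more than one is allowed) can be sketched in a few lines. This is only an illustration under my own assumptions: the arm labels, the block size of 4, the seed, and the names `make_block` and `StratifiedRandomizer` are all hypothetical, and a real trial would use a validated randomization system.

```python
import random
from collections import defaultdict


def make_block(arms, block_size, rng):
    """Return one shuffled block in which every arm appears equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block


class StratifiedRandomizer:
    """Permuted-block randomization kept separately for each stratum."""

    def __init__(self, arms=("Investigational", "Control"), block_size=4, seed=2014):
        self.arms = arms              # hypothetical arm labels
        self.block_size = block_size  # hypothetical block size
        self.rng = random.Random(seed)
        self.pending = defaultdict(list)  # stratum -> assignments not yet used

    def assign(self, center, soc_regimen):
        """Assign the next subject in the stratum (center, standard care regimen)."""
        stratum = (center, soc_regimen)
        if not self.pending[stratum]:
            # Start a new balanced block once the previous one is used up.
            self.pending[stratum] = make_block(self.arms, self.block_size, self.rng)
        return self.pending[stratum].pop()


randomizer = StratifiedRandomizer()
allocations = [randomizer.assign("Center A", "Regimen 1") for _ in range(8)]
print(allocations)
```

Because each stratum draws from its own balanced blocks, the two arms stay balanced within every center and regimen, which is the property the guidance is asking for.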

To minimize the heterogeneity of the standard of care, cluster randomization may also be employed. As stated in FDA’s guidance “Antibacterial Therapies for Patients With Unmet Medical Need for the Treatment of Serious Bacterial Diseases”, with cluster randomization, “Patients enrolled at sites randomized to the standard-of-care arm would be treated no differently than is usual practice at that site, while patients enrolled at sites randomized to the investigational drug arm would be treated with the investigational drug.”
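A minimal sketch of cluster randomization, where the site rather than the patient is the unit of randomization, might look like the following. The site names, the seed, and the function name `cluster_randomize` are hypothetical, and the equal split of sites between arms is my assumption, not something the guidance prescribes.

```python
import random


def cluster_randomize(sites, seed=2014):
    """Randomly assign whole sites to an arm, half of the sites to each arm."""
    rng = random.Random(seed)
    shuffled = list(sites)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    assignment = {}
    for position, site in enumerate(shuffled):
        # The first half of the shuffled sites gets the investigational drug.
        if position < half:
            assignment[site] = "Investigational drug"
        else:
            assignment[site] = "Standard of care"
    return assignment


# Hypothetical list of participating sites.
sites = ["Site 01", "Site 02", "Site 03", "Site 04", "Site 05", "Site 06"]
site_arm = cluster_randomize(sites)


def arm_for_patient(site):
    """Every patient enrolled at a site receives that site's assigned arm."""
    return site_arm[site]


for site in sites:
    print(site, "->", arm_for_patient(site))
```

Since every patient at a site receives the same arm, the analysis has to account for the correlation of outcomes within a site.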

When a clinical trial uses standard of care as the control group, should the study be designed as superiority or non-inferiority?

It depends on whether the experimental treatment is a stand-alone therapy (given without the standard of care) or an add-on therapy (given on top of the standard of care).

If the experimental treatment is an add-on therapy given on top of the existing standard of care, the trial must be designed as a superiority study to demonstrate that the add-on therapy is superior to the existing standard of care alone.

If the experimental treatment is a stand-alone therapy that can be given without the standard of care, the trial can be designed as either non-inferiority or superiority, depending on the expected effect size of the experimental therapy.

In FDA’s guidance “Non-Inferiority Clinical Trials”, the ‘Add-on study’ is suggested as an alternative to the non-inferiority study design. In the guidance, ‘treatments that are already available’ can include the standard of care. The combination of the novel treatment plus the existing treatment must be shown to be superior to the existing treatment alone (standard of care alone) or to the existing treatment plus placebo.

“Add-on study
In many cases, for a pharmacologically novel treatment, the most interesting question is not whether it is effective alone but whether the new drug can add to the effectiveness of treatments that are already available. The most pertinent study would therefore be a comparison of the new agent and placebo, each added to established therapy. Thus, new treatments for heart failure have added new agents (e.g., ACE inhibitors, beta blockers, and spironolactone) to diuretics and digoxin. As each new agent became established, it became part of the background therapy to which any new agent and placebo would be added. This approach is also typical in oncology, in the treatment of seizure disorders, and, in many cases, in the treatment of AIDS. “

“In this multicenter, randomized, controlled superiority trial, 542 patients scheduled for elective, high-risk abdominal surgery will be included. Patients are allocated to standard care (control group) or early goal-directed therapy (intervention group) using a randomization procedure stratified by center and type of surgery. In the control group, standard perioperative hemodynamic monitoring is applied. In the intervention group, early goal-directed therapy is added to standard care, based on continuous monitoring of cardiac output with arterial waveform analysis.”

Saturday, October 18, 2014

The fixed margin method or the two confidence interval method for obtaining the non-inferiority margin

For non-inferiority clinical trials, the key issue is to pre-specify the non-inferiority margin, which has to be based on historical data from studies comparing the active control with Placebo. If there are multiple such historical studies, a meta-analysis will need to be performed to obtain the point estimate and the 95% confidence interval for the effect of the active control.

As indicated in FDA’s guidance "Non-Inferiority Clinical Trials", there are essentially two approaches to derive the non-inferiority margin:

“Having established a reasonable assumption for the control agent’s effect in the NI study,  there are essentially two different approaches to analysis of the NI study, one called the fixed  margin method (or the two confidence interval method) and the other called the synthesis method. Both approaches are discussed in later sections of section IV and use the same data  from the historical studies and NI study, but in different ways.”

The guidance further explained the fixed margin method as:
 “in the fixed margin method, the margin M1 is based upon estimates of the effect of the active comparator in previously conducted studies, making any needed adjustments for changes in trial circumstances. The NI margin is then pre-specified and it is usually chosen as a margin smaller than M1 (i.e., M2), because it is usually felt that for an important endpoint a reasonable fraction of the effect of the control should be preserved. The NI study is successful if the results of the NI study rule out inferiority of the test drug to the control by the NI margin or more. It is referred to as a fixed margin analysis because the past studies comparing the drug with placebo are used to derive a single fixed value for M1, even though this value is based on results of placebo-controlled trials (one or multiple trials versus placebo) that have a point estimate and confidence interval for the comparison with placebo. The value typically chosen is the lower bound of the 95% CI (although this is potentially flexible) of a placebo-controlled trial or meta-analysis of trials. This value becomes the margin M1, after any adjustments needed for concerns about constancy. The fixed margin M1, or M2 if that is chosen as the NI margin, is then used as the value to be excluded for C-T in the NI study by ensuring that the upper bound of the 95% CI for C-T is < M1 (or M2). This 95% lower bound is, in one sense, a conservative estimate of the effect size shown in the historical experience. It is recognized, however, that although we use it as a “fixed” value, it is in fact a random variable, which cannot invariably be assumed to represent the active control effect in the NI study.”

Suppose we are planning a non-inferiority study to compare a new experimental thrombolytic agent with an existing thrombolytic agent. The historical evidence for the existing agent comes from the meta-analysis "Thrombolysis for acute ischaemic stroke":

“Thrombolytic therapy, mostly administered up to six hours after ischaemic stroke, significantly reduced the proportion of patients who were dead or dependent (modified Rankin 3 to 6) at three to six months after stroke (odds ratio (OR) 0.81, 95% confidence interval (CI) 0.72 to 0.90).”

(Existing Thrombolysis Agent) / Placebo = 0.90 (0.90 is the upper bound of 95% confidence interval)

1 - 0.90 = 0.10 is the conservative estimate (M1) of the treatment effect of the Existing Thrombolysis Agent, expressed as the reduction in the odds of an unfavorable outcome

If we plan to do a trial to compare the new thrombolytic agent with Existing Thrombolysis Agent and we would like to preserve 50% of the treatment effect of Existing Thrombolysis Agent, the non-inferiority margin would be calculated as:

$$
\frac{\text{New Thrombolytic Agent} / \text{Placebo}}{\text{Existing Thrombolysis Agent} / \text{Placebo}} = \frac{0.90 + 0.10/2}{0.90} = \frac{0.95}{0.90} \approx 1.06
$$

The non-inferiority margin would be 1.06.
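The arithmetic above can be written as a small function. The generalization to other preserved fractions is my own reading of the calculation, and the function names are hypothetical. The second function is not what this post does: it shows the same idea on the log odds ratio scale, which is often used for ratio measures, and is included only for comparison.

```python
import math


def ni_margin_ratio_scale(ci_upper_bound, fraction_preserved):
    """Margin as computed in this post, for an odds ratio below 1 (bad outcome).

    ci_upper_bound: upper 95% CI bound of active control vs placebo (0.90 here).
    fraction_preserved: share of the control effect to keep (0.5 here).
    """
    m1 = 1.0 - ci_upper_bound                      # conservative control effect
    allowed_loss = m1 * (1.0 - fraction_preserved)
    return (ci_upper_bound + allowed_loss) / ci_upper_bound


def ni_margin_log_scale(ci_upper_bound, fraction_preserved):
    """Alternative (an assumption, not from this post): preserve the fraction
    of the effect on the log odds ratio scale instead."""
    log_m1 = math.log(1.0 / ci_upper_bound)
    return math.exp((1.0 - fraction_preserved) * log_m1)


margin = ni_margin_ratio_scale(0.90, 0.5)
print(round(margin, 2))                            # prints 1.06
print(round(ni_margin_log_scale(0.90, 0.5), 2))    # prints 1.05
```

The two scales give close but not identical margins, so whichever is used should be stated in the protocol.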

From the non-inferiority trial comparing the New Thrombolytic Agent with the Existing Thrombolysis Agent, we will need to calculate the 95% confidence interval for the odds ratio (New Thrombolytic Agent / Existing Thrombolysis Agent). We will then compare the upper bound of this 95% confidence interval with the non-inferiority margin of 1.06 calculated above. Non-inferiority can be declared if the upper bound is below 1.06. This is why the fixed margin method is also called the two confidence interval method: two confidence intervals are involved in the study design. The first 95% confidence interval comes from the comparison of the Active Control with Placebo in the historical data; the second comes from the comparison of the new experimental treatment with the Active Control in the new non-inferiority trial.
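To make the decision rule concrete, here is a rough sketch of the second confidence interval and the comparison with the margin. The event counts are invented purely for illustration, the Wald interval on the log odds ratio is one common choice and not necessarily what a given protocol would specify, and the function names are hypothetical.

```python
import math


def odds_ratio_ci(events_new, n_new, events_ctrl, n_ctrl, z=1.959964):
    """Odds ratio (new / control) with a Wald 95% CI on the log scale."""
    a, b = events_new, n_new - events_new      # new agent: events, non-events
    c, d = events_ctrl, n_ctrl - events_ctrl   # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


def is_non_inferior(ci_upper, margin):
    """Fixed margin rule: the upper CI bound must lie below the margin."""
    return ci_upper < margin


# Hypothetical counts of patients dead or dependent (unfavorable outcome).
or_hat, lower, upper = odds_ratio_ci(events_new=2000, n_new=5000,
                                     events_ctrl=2050, n_ctrl=5000)
print(f"OR = {or_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("Non-inferior at margin 1.06:", is_non_inferior(upper, 1.06))
```

With a margin this close to 1, the interval has to be narrow, which is why non-inferiority trials of this kind need large sample sizes.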

Several comments on the fixed margin method:

1. Depending on whether the outcome is good or bad, either the lower bound or the upper bound of the 95% confidence interval of the Active Control versus Placebo comparison should be used when deriving the non-inferiority margin. In either case it is the bound closest to no effect.

2. Depending on whether the statistic is a numeric difference (for example, a difference between two means) or a ratio (for example, an odds ratio, risk ratio, or hazard ratio), the treatment effect M1 is based on the 95% CI of the difference (its distance from 0) or of the ratio (its distance from 1).

3. While it is typical to choose an M2 (non-inferiority margin) that preserves at least 50% of the treatment effect of the active control over Placebo, the 50% figure may be adjusted depending on the disease. For thrombolytic treatment of ischemic stroke, it may be acceptable to preserve only 30-40% of the treatment effect of the active control. In other words, we are willing to accept the loss of a larger share of the active control's effect (over the historical placebo) in order to have a reasonable non-inferiority margin and a feasible sample size for the clinical trial.

4. In some therapeutic areas (for example, antibacterial and orphan diseases), there are no historical data to support a statistical justification of the non-inferiority margin, so the first 95% confidence interval needed to derive the margin cannot be calculated.
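Comments 1 to 3 can be illustrated with a short sketch. The helper `conservative_bound` is a hypothetical name, the difference-scale interval (2.0, 6.0) is an invented example, and the margin formula is the ratio-scale arithmetic from the thrombolysis example, so it only applies when the ratio is below 1.

```python
def conservative_bound(ci_lower, ci_upper, null_value):
    """Return the CI bound closest to no effect (0 for a difference,
    1 for a ratio), or None if the interval includes the no-effect value."""
    if ci_lower <= null_value <= ci_upper:
        return None  # no demonstrated effect, so no M1 can be derived
    return ci_upper if ci_upper < null_value else ci_lower


# Ratio scale, bad outcome: odds ratio 0.81 (0.72 to 0.90) from the meta-analysis.
bound = conservative_bound(0.72, 0.90, null_value=1.0)
m1 = abs(1.0 - bound)
print("Ratio scale bound:", bound, "M1:", round(m1, 2))

# Difference scale, good outcome: hypothetical mean difference CI of (2.0, 6.0).
print("Difference scale bound:", conservative_bound(2.0, 6.0, null_value=0.0))

# Effect of preserving 50%, 40%, or 30% of the control effect on the margin.
margins = {}
for preserved in (0.5, 0.4, 0.3):
    margins[preserved] = (bound + m1 * (1.0 - preserved)) / bound
    print(f"preserve {preserved:.0%}: margin = {margins[preserved]:.3f}")
```

Preserving less of the control effect widens the margin, which is what makes the sample size more feasible.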


Saturday, September 13, 2014

N of 1 Clinical Trial Design and its Use in Rare Disease Studies

In the beginning (February) of this year, I attended a workshop titled “Clinical Trial Design for Alpha-1 Deficiency: A Model for Rare Diseases”. During the meeting, the N of 1 design was mentioned as one of the study methods to address the challenges in clinical trials in rare disease areas.

This was echoed in FDA’s “Public Workshop – Complex Issues in Developing Drug and Biological Products for Rare Diseases”. Session 2: “Complex Issues for Trial Design: Study Design, Conduct and Analysis” had some extensive discussions about the N of 1 trial design and its potential use in rare disease clinical trials.

In a presentation by Dr. Temple of FDA titled “The Regulatory Pathway for Rare Diseases Lessons Learned from Examples of Clinical Study Designs for Small Populations”, the N of 1 study design was mentioned along with other methods such as randomized withdrawal, enrichment, and crossover designs.

According to Wikipedia, “an N of 1 trial is a clinical trial in which a single patient is the entire trial, a single case study. A trial in which random allocation can be used to determine the order in which an experimental and a control intervention are given to a patient is an N of 1 randomized controlled trial. The order of experimental and control interventions can also be fixed by the researcher. “

While the N of 1 design is not commonly used in clinical trials, the underlying concept of focusing on the single patient is actually quite common in the clinical trial setting. There are some similarities between aggregated N of 1 trials and the typical crossover design, especially higher-order crossover designs. For safety assessment in clinical trials, challenge–dechallenge–rechallenge (CDR) is often used to assess whether an event is indeed caused by the drug. CDR can be considered a simple N of 1 design.
“Challenge–dechallenge–rechallenge (CDR) is a medical testing protocol in which a medicine or drug is administered, withdrawn, then re-administered, while being monitored for adverse events at each stage. The protocol is used when statistical testing is inappropriate due to an idiosyncratic reaction by a specific individual, or a lack of sufficient test subjects and unit of analysis is the individual. During the withdraw phase, the medication is allowed to wash out of the system in order to determine what effect the medication is having on an individual.
 CDR is one means of establishing the validity and benefits of medication in treating specific conditions as well as any adverse drug reactions. The Food and Drug Administration of the United States lists positive dechallenge reactions (an adverse event which disappears on withdrawal of the medication) as well as negative (an adverse event which continues after withdrawal), as well as positive rechallenge (symptoms re-occurring on re-administration) and negative rechallenge (failure of a symptom to re-occur after re-administration). It is one of the standard means of assessing adverse drug reactions in France.”
While N of 1 is the experiment on a single patient, using aggregated single patient  (N-of-1) trials will involve multiple patients – quantitative analyses become more feasible. See examples below for using aggregated N of 1 trials.  
N of 1 clinical trials could involve some complicated statistical analyses. See the discussions below:
The N of 1 clinical trial design is rarely discussed at statistical conferences, perhaps because of the perception that not much statistics is involved in analyzing N of 1 study data. However, an N of 1 study can be a very effective way of demonstrating efficacy if the characteristics of the indication and the drug fit.
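As a rough illustration of how aggregated N of 1 data might be summarized, the sketch below averages the on-treatment minus off-treatment differences within each patient and then computes a one-sample t statistic across patients. All patient IDs and outcome values are invented, and this is a deliberately simple summary of my own choosing; real analyses may use more elaborate models.

```python
import math
import statistics

# Hypothetical data: for each patient, a list of (on-treatment, off-treatment)
# outcome values, one pair per treatment cycle.
cycles = {
    "P01": [(62, 55), (64, 56), (61, 54)],
    "P02": [(70, 66), (71, 68), (69, 66)],
    "P03": [(58, 52), (57, 53), (59, 52)],
    "P04": [(66, 61), (65, 62), (67, 61)],
}


def patient_effect(pairs):
    """Mean on-minus-off difference over one patient's cycles."""
    return statistics.mean([on - off for on, off in pairs])


# Step 1: one effect estimate per patient (each patient is an N of 1 trial).
effects = {patient: patient_effect(pairs) for patient, pairs in cycles.items()}

# Step 2: aggregate across patients with a one-sample t statistic.
values = list(effects.values())
mean_effect = statistics.mean(values)
sd_effect = statistics.stdev(values)
t_stat = mean_effect / (sd_effect / math.sqrt(len(values)))
df = len(values) - 1

print("Per-patient effects:", {p: round(e, 2) for p, e in effects.items()})
print(f"Mean effect = {mean_effect:.2f}, t = {t_stat:.2f} on {df} df")
```

The t statistic would be compared with a t distribution on the printed degrees of freedom; no p-value is computed here to keep the sketch free of extra dependencies.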
A key limitation is that the N of 1 study design is only applicable in certain situations: it depends on the disease characteristics, the treatment (short washout period), and the endpoint (quick measurements). Some of the situations where the N of 1 design may be used are discussed in the transcripts of the FDA Public Workshop on Complex Issues in Rare Disease Drug Development, session “Complex Issues for Trial Design: Study Design, Conduct and Analysis”:
“Ellis Unger: We have no ... No comments right now, so let me put a question to the group. I  presented us a slide on the N of 1 study, which we almost never see. Just to remind you, the N of 1 study is a scenario where a patient doesn't contribute and end, but of course the treatment contributes an end, and of course the treatment can be capped in a certain number above ... weeks.
 Unless someone has the amount of interest, in which case, you give up on that course, that aborts that course to treatment and then they re-randomize. Are there therapies, disease states people around the table can think off that would be ... where this design could be applicable, because we don't see these studies. Dr. Walton?
 Marc Walton: I'll just mention that by firing away in all the clinical trials I've reviewed, the most powerful piece of evidence about the effectiveness of a drug came from a N of 1 type of study where it was a study with Pulmozyme cystic fibrosis where patients were treated Pulmozyme that are pulmonary function tested, then the Pulmozyme was discontinued and then tested again, and then several cycles, and I think it was maybe five cycles and you saw such remarkably reproducible effects that it was utterly convincing that the drug was effective for that.
The utility comes about though when you have, as you have said, a disorder that has enough stability and drugs that have a short enough washout period, that you are able to have that repeatedly look as if it was a new exposure to the patient. In disorders where we have that and treatments that are expected to have that sort of reversible effect, this N of 1 becomes a truly powerful piece of information, as well worth considering when those circumstances present themselves.
 Ellis Unger: Typically, a company will come in and say, "You randomize to our treatment or placebo. We're going to count the number of exacerbations or pain episodes or whatever over the course of the study." This is basically saying once you have one of these events, we're going to re-randomize you. Just again, so anybody around the room ... Okay, Dr. Summar.
 Marshall Summar: Yeah, it seems like from the intermediary metabolism, the effects where you have frequent attacks of hyperammonemia, acidosis, things like that, that actually might be a fairly ideal group washout for most of the treatment is pretty fast. That seems like a group where that might actually play out pretty well. I have to think about that but it seems to make some sense.
 Ellis Unger: Dr. Kakkis?
Nicole Hamblett: Thanks. I think the N of 1s, studies are incredibly intriguing and I think one thing I need to wrap my head around is the consigning by commons in medications, for instance, if you're measuring exacerbations during on and off periods in their treatment for that event could alter what's going to happen during the next events. I think that's a little bit difficult to the chronic study, but I guess I also wonder what are the parameters for being able to use an N of 1 study or N of 1 studies for your pivotal trial, as well as difficult enough to conduct confirmatory study. How would we define that for these types of newer or more customized study designs?
 Ellis Unger: Well, I think the N of 1 study, again, you have to have a treatment where there's an offset that's reasonably rapid and you're not expecting the effect on the disease to be ... the effect is not lasting. It's not like Dr. Hyde was mentioning in a gene therapy, as that would be the extreme opposite where you couldn't do this, but if you have something where there's an offset in a reasonable amount of time and patients are subjected to repeated events, I think that's the key.
 If it's progression and it happens slowly with time, you're not going to be able to do an N of 1 study, but if you have some episodic issue and you have a drug with a reasonable offset, I think it will lends itself to this and we're talking about a dozen patients to do a study, the whole deal, and that could be your phase 3 study. I mean that the example I showed was just about a dozen patients. You don't need a lot of patients. “

It is clear that the N of 1 study design is not appropriate for a study in which the efficacy endpoint is measured over a very long period of time. The N of 1 design may be applicable for short-term endpoints (biomarkers, metabolites, …). However, over the last 10-20 years, regulatory agencies have been moving toward long-term endpoints. For an enzyme replacement therapy, showing an increase in enzyme level would have been considered sufficient for approval 20 years ago; nowadays, an endpoint measuring long-term clinical benefit may be required. Similarly, for a thrombolytic agent, it is not sufficient to show thrombolysis in the short term; the long-term benefit will need to be shown. This trend toward long-term efficacy endpoints makes the N of 1 study design unlikely to be used in licensure studies.

Saturday, September 06, 2014

Full Analysis Set and Intention-to-Treat Population in Non-randomized Clinical Trials?

The intention-to-treat principle has become a routine term in the statistical analysis of randomized, controlled clinical trials. If a publication reports a randomized, controlled clinical trial, it is almost universal that the intention-to-treat principle will be mentioned, even though in some studies the actual analysis may not exactly follow it.

Strictly speaking, under the intention-to-treat principle, the ITT population includes all randomized patients, analyzed in the groups to which they were randomly assigned, regardless of their adherence to the entry criteria, regardless of the treatment they actually received, and regardless of subsequent withdrawal from treatment or deviation from the protocol. See one of my early articles and the presentation on ITT versus mITT.

According to ICH E9 “STATISTICAL PRINCIPLES FOR CLINICAL TRIALS”, the Full Analysis Set (FAS) is the analysis set intended to be as close as possible to the Intention-to-Treat (ITT) population. It states:

“The intention-to-treat (see Glossary) principle implies that the primary analysis should include all randomised subjects. Compliance with this principle would necessitate complete follow-up of all randomised subjects for study outcomes. In practice this ideal may be difficult to achieve, for reasons to be described. In this document the term 'full analysis set' is used to describe the analysis set which is as complete as possible and as close as possible to the intention-to-treat ideal of including all randomised subjects.”

Here both the FAS and the ITT population are tied to randomization. In the real world, however, there are also many non-randomized trials: for example, a clinical study without a concurrent control, an early-phase dose escalation study, or a long-term safety follow-up study in which all subjects receive the experimental medication. In these situations, since there is no randomization, it is inappropriate to define an ITT population, even though the general principle should remain the same, i.e., to retain as many subjects as possible in the analysis to avoid bias. The question is what the trigger point for defining the analysis population should be when there is no randomization. The natural trigger point is the administration of the first dose of study medication: instead of a subject being in the ITT population ‘once randomized’, the subject is in it ‘once dosed’. This seems to be the case in the following example. According to the summary basis of approval for CSL’s RIASTAP, an ITT population was defined for a study without concurrent control and without randomization. This implies that the ITT population included all subjects who received the study medication, which is essentially the same as the Safety population.

For non-randomized studies, it may be better to use the Full Analysis Set instead of the ITT population. It seems logical to define the full analysis set to include any subject who receives any amount of the study medication. If this definition is used, with the trigger point being the first dose of study medication, the full analysis set and the safety population will most likely be identical. It is not uncommon to define two populations that are identical but use them for different analyses: the safety population for safety analyses, and the full analysis set for efficacy analyses.

Another term we can use in non-randomized studies is Evaluable Population which is usually defined as any subjects who receive any amount of the study medication and have at least one post-baseline efficacy measurement. Evaluable population in non-randomized clinical trials is similar to the modified ITT population in randomized clinical trials where some randomized subjects are excluded from the analysis with justifiable rationales.
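The population definitions discussed above for a non-randomized study can be expressed as simple flags derived from subject-level data. This is only a sketch of the definitions as described in this post; the `Subject` fields, the subject IDs, and the function names are hypothetical, and an actual statistical analysis plan may define the populations differently.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    subject_id: str
    doses_received: int            # number of study medication doses given
    post_baseline_efficacy: int    # number of post-baseline efficacy values


def in_full_analysis_set(subject):
    """'Once dosed': any subject who received any study medication."""
    return subject.doses_received > 0


def in_safety_population(subject):
    """Same trigger as the FAS here, but used for safety analyses."""
    return subject.doses_received > 0


def in_evaluable_population(subject):
    """Dosed and with at least one post-baseline efficacy measurement."""
    return subject.doses_received > 0 and subject.post_baseline_efficacy > 0


# Hypothetical subjects in a single-arm study.
subjects = [
    Subject("S001", doses_received=3, post_baseline_efficacy=2),
    Subject("S002", doses_received=1, post_baseline_efficacy=0),
    Subject("S003", doses_received=0, post_baseline_efficacy=0),
]

fas = [s.subject_id for s in subjects if in_full_analysis_set(s)]
safety = [s.subject_id for s in subjects if in_safety_population(s)]
evaluable = [s.subject_id for s in subjects if in_evaluable_population(s)]
print("FAS:", fas)
print("Safety:", safety)
print("Evaluable:", evaluable)
```

Keeping the FAS and safety flags as separate functions, even though they are identical here, mirrors the practice of defining two populations that are used for different analyses.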

While ICH E9 does not use the term ‘modified Intention-to-Treat’, the following paragraphs are intended to provide guidelines and examples of when subjects can be excluded from the full analysis set or intention-to-treat population:

“There are a limited number of circumstances that might lead to excluding randomised subjects from the full analysis set including the failure to satisfy major entry criteria (eligibility violations), the failure to take at least one dose of trial medication and the lack of any data post randomisation. Such exclusions should always be justified.
Subjects who fail to satisfy an entry criterion may be excluded from the analysis without the possibility of introducing bias only under the following circumstances:
(i) the entry criterion was measured prior to randomisation;
(ii) the detection of the relevant eligibility violations can be made completely objectively;
(iii) all subjects receive equal scrutiny for eligibility violations; (This may be difficult to ensure in an open-label study, or even in a double-blind study if the data are unblinded prior to this scrutiny, emphasising the importance of the blind review.)
(iv) all detected violations of the particular entry criterion are excluded.
In some situations, it may be reasonable to eliminate from the set of all randomised subjects any subject who took no trial medication. The intention-to-treat principle would be preserved despite the exclusion of these patients provided, for example, that the decision of whether or not to begin treatment could not be influenced by knowledge of the assigned treatment. In other situations it may be necessary to eliminate from the set of all randomised subjects any subject without data post randomisation. No analysis is complete unless the potential biases arising from these specific exclusions, or any others, are addressed.
Because of the unpredictability of some problems, it may sometimes be preferable to defer detailed consideration of the manner of dealing with irregularities until the blind review of the data at the end of the trial, and, if so, this should be stated in the protocol.”
In summary, while the general principle is the same, different terms may be preferable depending on whether a study is randomized or non-randomized.

Randomized studies
Non-randomized studies
Full Analysis Set

Interesting talks about the Intention to treatment principle: