Sunday, November 16, 2014

VALOR Trial - A Successful and Failed Phase III Study with Adaptive Sample Size Re-estimation for Promising Zone

Motivated by the search for innovative clinical trial methodologies that increase the chance of trial success and reduce trial cost, various adaptive design methods have been proposed. Initially, adaptive designs were used mostly in early-phase (phase I or II) clinical trials. For phase III confirmatory trials, traditional methods still dominate. Many publications about adaptive designs in late-stage trials are based on retrospective assessment or simulation: had the original studies used an adaptive design, how much cost would have been saved, or could a failed trial have been rescued? After many years of education and advocacy, adaptive designs with innovative methods have actually been implemented in phase III studies, and some of the trial results are starting to surface. One example is a trial called VALOR, a phase III, placebo-controlled, randomized, double-blind study in relapsed/refractory Acute Myeloid Leukemia (AML). The study adopted one of the key adaptive design features: Sample Size Re-estimation (SSR).

The rationale behind Sample Size Re-estimation is that the assumptions needed to design a confirmatory trial are either not entirely available or are available only with a high degree of uncertainty. This uncertainty can lead to incorrect or inaccurate sample size estimates at the design stage. With Sample Size Re-estimation, an interim analysis is performed partway through the study to re-check these assumptions. Depending on the interim findings, a decision about the next step can be made.

In the VALOR study, the Sample Size Re-estimation was based on a Promising Zone approach. SSR based on the Promising Zone was proposed by Mehta and Pocock and described in their paper “Adaptive increase in sample size when interim results are promising: A practical guide with examples”. The general idea is to start a phase III trial under a best-case or optimistic scenario. Optimistic assumptions call for a smaller initial sample size and consequently a smaller up-front commitment of resources and money. During the study, an interim analysis is performed to check these assumptions against the data and to choose among the following next steps:

  • Stop early if overwhelming evidence of efficacy
  • Stop early for futility if low conditional power
  • Increase the sample size if results are promising

This can be illustrated in the diagram below. Notice that with this method, the sample size can only be increased, not decreased. The increase is a one-time adjustment, preferably by a pre-specified fixed number.
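The zone logic described above can be sketched in Python. This is only a minimal sketch, not the VALOR analysis plan: it uses the standard conditional power formula under the current trend for a single interim z-statistic, and the function names and the thresholds `cp_min=0.30` and `cp_target=0.90` are assumptions chosen for illustration. It also ignores the early-stopping boundaries for efficacy and futility.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def conditional_power(z_interim, info_fraction, alpha_one_sided=0.025):
    """Conditional power under the current trend.

    z_interim     : z-statistic observed at the interim analysis
    info_fraction : fraction of planned information (e.g. events) observed, 0 < t < 1
    Assumes a single final test at the unadjusted one-sided alpha level.
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha_one_sided)
    # If the interim trend continues, the expected final z-statistic
    # is z_interim / sqrt(t); the remaining variability is (1 - t).
    shifted = (z_crit - z_interim / sqrt(info_fraction)) / sqrt(1.0 - info_fraction)
    return 1.0 - _STD_NORMAL.cdf(shifted)


def classify_zone(cp, cp_min=0.30, cp_target=0.90):
    """Map conditional power to a zone (threshold values are hypothetical)."""
    if cp < cp_min:
        return "unfavorable"   # keep the original sample size
    if cp < cp_target:
        return "promising"     # candidate for a sample size increase
    return "favorable"         # original sample size is already adequate


if __name__ == "__main__":
    for z in (0.0, 1.5, 3.0):
        cp = conditional_power(z_interim=z, info_fraction=0.5)
        print(f"z={z:.1f}  CP={cp:.3f}  zone={classify_zone(cp)}")
```

In an actual Promising Zone design, the lower bound of the promising zone is derived so that the conventional final test still controls the type I error; the fixed 0.30 used here is only a placeholder.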

Since the VALOR study was initiated in December 2010, this SSR method with the Promising Zone approach has been widely followed in the statistical community and has been a topic of many adaptive design discussions. See the presentation by Zoran Antonijevic, "Harvard Catalyst Adaptive Clinical Trials Case Study - The VALOR Trial for AML". There is also a YouTube video titled "The Phase 3 VALOR Trial: Adaptive Sample Size Re-estimation".

Cytel Inc. has built SSR with the Promising Zone approach into its EAST software for study design. Cytel advocates that adaptive sample size re-estimation in EAST reduces risk and improves the chance of clinical trial success. With the Promising Zone SSR method, an adaptive design can:
  • DE-RISK INVESTMENT – Avoid expensive up-front commitments of sample size
  • ENHANCE SUCCESS – Boost power when initial assumptions fail
  • PROMISING ZONE™ – Increase sample size conditional on interim data
  • ALPHA CONTROL – Guarantee strong type I error control required by regulators

Had the VALOR study achieved statistical significance on its primary efficacy endpoint, it would have been a wonderful story of how the Promising Zone SSR method had de-risked the investment and enhanced success.

Unfortunately, after all of these extra efforts (adaptive design, DSMB, interim analysis, sample size re-estimation), the study failed: it did not reach statistical significance for the primary efficacy endpoint. The p-value of 0.06 just missed the magical threshold of 0.05. Here is the announcement from the VALOR study sponsor, Sunesis Pharmaceuticals:
Sunesis Announces Results From Pivotal Phase 3 VALOR Trial of Vosaroxin and Cytarabine in Patients With First Relapsed or Refractory Acute Myeloid Leukemia
“Sunesis Pharmaceuticals, Inc. (Nasdaq:SNSS) today announced results from the pivotal Phase 3 VALOR trial, a randomized, double-blind, placebo-controlled trial of vosaroxin and cytarabine in patients with first relapsed or refractory acute myeloid leukemia (AML). At more than 100 leading international sites, the trial enrolled 711 patients, who were stratified for age, geography and disease status. The trial did not meet its primary endpoint of demonstrating a statistically significant improvement in overall survival, with a median overall survival of 7.5 months for vosaroxin and cytarabine compared to 6.1 months for placebo and cytarabine (HR=0.865, p=0.06).”
Additional details about the trial design are coming to the surface. See the screenshot from the Sunesis presentation:

The study was planned under the most optimistic assumption (i.e., HR=0.71), and the sample size re-estimation was based on the most conservative assumption at that time (i.e., HR=0.80). Unfortunately, the actual result of HR=0.865 showed a smaller treatment effect than even the most conservative assumption of HR=0.80. It would be interesting to know exactly what the HR was at the interim analysis.
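To see why the choice between HR=0.71 and HR=0.80 matters so much for trial size, a rough calculation with Schoenfeld's approximation for the number of events in a 1:1 randomized log-rank comparison can help. This is a sketch only: the two-sided alpha of 0.05 and the 90% power are assumptions for illustration, the function name is hypothetical, and the output is not the actual event target of the VALOR trial.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha_two_sided=0.05, power=0.90):
    """Approximate number of events needed (Schoenfeld, 1:1 allocation).

    d = 4 * (z_{1 - alpha/2} + z_{power})^2 / (ln HR)^2
    The alpha and power defaults are illustrative assumptions.
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard_ratio must be positive and different from 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha_two_sided / 2.0)
    z_power = std_normal.inv_cdf(power)
    events = 4.0 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2
    return ceil(events)  # round up to a whole number of events


if __name__ == "__main__":
    # Hazard ratios mentioned in this post: design, conservative, and observed
    for hr in (0.71, 0.80, 0.865):
        print(f"HR={hr}: about {required_events(hr)} events")
```

Under these assumptions, the more conservative HR=0.80 requires more than twice as many events as HR=0.71, and an effect as small as the observed HR=0.865 would require several times more again.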

I guess that Sunesis and Cytel are now analyzing the data to look for clues as to why the study did not meet its primary endpoint. It is quite possible that the study conduct or patient population differed before and after the interim analysis. While the study team was strictly blinded to the details of the interim analysis results, the decision on whether or not to increase the sample size had to be announced. This announcement could have affected the patient characteristics or the conduct of the study. Here is a discussion from that time about the announcement of the sample size increase after the interim analysis. It is clear that the announcement had some impact on financial analysts, and it potentially also had some impact on the study team and investigators.

           Sunesis Pharmaceuticals to Implement One-Time Sample Size Increase to Phase 3
           VALOR Trial in AML
When last September the Data and Safety Monitoring Board (DSMB) recommended expanding the sample size of the study based on interim data that suggested a "promising" outcome, vosaroxin garnered even more investor attention.
Valor Trial Design And Alpha Spend
At the analyst meeting in October 2012, Sunesis provided an update on the adaptive design of the study that allows for a potential one-time sample size increase of the patient population. Based on its review, the DSMB recommended the Valor study increase the sample size to 675 patients for a 90% statistical power to detect a 30% overall survival difference (5 months versus 6.5 months) with an HR of 0.77. The DSMB concluded that the interim data indicated a "promising" outcome - ruling out futility and an "unfavorable" scenario, but falling short of a "favorable" result.
Based on the nuances of statistical analysis, ruling out both favorable and unfavorable scenarios for a promising outcome strongly suggests that vosaroxin was closer to non-inferiority and in need of a larger sample size in order to show a statistically significant treatment difference. It was a smart idea by management to utilize the first interim analysis of Valor as a proxy for a randomized Phase 2 study whereby it could better estimate the sample size needed to demonstrate a clinical effect. Powering the study has thus been the main factor in influencing its "promising" outcome

The VALOR study was a well-conducted study. From the standpoint of study implementation, including the sample size re-estimation, the study was a success. However, it failed to reach statistical significance for the primary efficacy endpoint.

In the end, statistics is about uncertainty. While sample size re-estimation can reduce the uncertainty to some degree, it cannot eliminate it. We will never be able to design a study that guarantees success.


Anonymous said...

Thanks for the article. I was curious what your thoughts were on the percentage of patients that were censored. Does this call into question the results?

Web blog from Dr. Deng said...

In order to know these numbers, we have to wait for the publication of the VALOR study.

Watch for their publications at: