Saturday, August 15, 2015

Tipping point analysis - multiple imputation for stress test under missing not at random (MNAR)

In a previous post, different imputation methods were summarized by their missingness assumptions. One of them, the tipping point approach, has gained popularity recently as a way to perform sensitivity analyses under the missing not at random (MNAR) assumption. In other words, the tipping point approach is a progressive stress test to assess how severe the departures from missing at random (MAR) must be in order to overturn the conclusions from the primary analysis. If only implausibly large departures from MAR can change the results from statistically significant (p<=0.05) to not significant (p>0.05), the results are said to be robust to departures from the MAR assumption. We can then be more confident in results obtained from statistical methods that assume MAR (such as multiple imputation or the mixed model for repeated measures, MMRM). The tipping point approach is not intended as a primary analysis method; it is used only for sensitivity analysis.

The tipping point approach can be seen as a special application of multiple imputation. It can also be considered a special case of the controlled imputation methods (i.e., applying the shift parameter only to the active treatment group, not to the placebo group).

Implementing the tipping point approach includes the following steps, with the first three being the standard multiple imputation (MI) steps:
  1. The missing data are filled in m times to generate m complete data sets.
  2. The m complete data sets are analyzed by using standard procedures.
  3. The results from the m complete data sets are combined for the inference.
  4. Repeat step 1 to generate the multiple imputed data sets, applying a specified shift parameter that adjusts the imputed values for observations in the treatment group (not the placebo group).
  5. Repeat step 2 for the imputed data sets with the shift parameter applied.
  6. Repeat step 3 to obtain the combined p-value and check whether it is still <=0.05.
  7. Repeat steps 4-6 with progressively larger shift parameters until the p-value exceeds 0.05.
The tipping point approach can be easily implemented using the SAS procedures MI and MIANALYZE. A SAS example, “Sensitivity Analysis with Tipping-Point Approach”, provides step-by-step instructions on how to implement it.
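As a rough illustration, steps 4 through 7 might be scripted as follows. This is only a minimal sketch: the data set TRIAL, the variables TRT (1 = active, 0 = placebo), BASE and CHG, the dropout flag MISSFLAG (created before imputation to mark subjects whose CHG is missing), and the shift grid are all hypothetical, and the imputation model is deliberately simplified.

%macro tip(shift);
   /* Step 4: impute under MAR, then shift the imputed values
      in the active treatment group only */
   proc mi data=trial nimpute=50 seed=2015 out=mi_out;
      var base trt chg;
   run;

   data mi_shift;
      set mi_out;
      /* subtract the shift to worsen imputed active-arm values
         (the direction depends on the endpoint) */
      if missflag = 1 and trt = 1 then chg = chg - &shift;
   run;

   /* Step 5: analyze each completed data set with ANCOVA */
   proc reg data=mi_shift outest=est covout noprint;
      by _Imputation_;
      model chg = trt base;
   run;

   /* Step 6: combine by Rubin's rules; inspect the p-value for TRT */
   proc mianalyze data=est;
      modeleffects trt;
      ods output ParameterEstimates=pe_shift&shift;
   run;
%mend tip;

/* Step 7: increase the shift until the p-value exceeds 0.05 */
%tip(0)  %tip(2)  %tip(4)  %tip(6)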

The following papers are also helpful in understanding and implementing the tipping point approach.


The tipping point approach has been discussed in several drug trials:

At the Dry Powder Mannitol (DPM) Pharmaxis Pulmonary and Allergy Drugs Advisory Committee meeting on January 30, 2013 (slides are here), the tipping point approach was used as a stress test to see how robust the primary analysis is to departures from the MAR assumption.
They explored the tipping point in the ITT population at which DPM would no longer show a significant effect. To do this, the penalty at each missing time point was increased up to the point that statistical significance was lost. They then showed what happens when the data are stress-tested even more, increasing the size of the penalty for each missing visit in the pattern mixture model up until the point where significance was lost. The penalty would need to be more than 450 mL at each missing time point before the effect estimate was reduced to 55 mL and was no longer significant. This means that each patient leaving before week six could be penalized by 1,350 mL. A tipping point requiring such a large volume does not seem plausible. They challenged the robustness even further, again using the same pattern mixture model, but this time identifying a tipping point when penalizing only the DPM arm and not the control arm. Even applying this extreme method, the penalty needed to reach 150 mL per missing time point before significance was lost. In other words, even with patients withdrawing before week six in the control arm carrying no penalty at all, DPM withdrawals before week six would be penalized by 450 mL.
In FDA’s Statistical Review for NDA 204168 (drug name: FETZIMA (levomilnacipran) extended-release capsules, 20, 40, 80, and 120 mg; indication: major depressive disorder; applicant: Forest Laboratories, Inc.):
A “tipping point” analysis was conducted by increasing the shift parameter beyond the maximum value of 8 considered by the sponsor. The mean difference in MADRS change scores between drug and placebo would lose statistical significance at alpha = 0.05 at a shift parameter of 16 (see Table 16). The value of 16 appears to be rather large and unlikely to be a realistic mean difference in y at visit t+1 between patients that drop out after the t-th visit and patients that continue. The PMM model results are consistent with the primary MMRM model results at the more realistic values of the shift parameter (i.e., 2, 4, …, 14).
The slide presentation “Missing Data Sensitivity Analysis of a Continuous Endpoint – An Example from a Recent Submission” by Arno Fritsch indicated that the tipping point approach was explored as a sensitivity analysis for the 6MWT endpoint for riociguat in pulmonary arterial hypertension:
  • The penalty for riociguat needed to be increased to -71 m per visit after drop-out before statistical significance was lost.
  • This would imply a very steep decline after drop-out, even giving negative 6MWD values for many patients (mean 6MWD at baseline was 364 m, with some patients in the 200s).
  • So the positive treatment effect seems unquestionable.
Pattern Mixture Model: This analysis allows the missing data to be missing not at random (MNAR). A repeated-measures ANCOVA model for change in PSP included time as a categorical factor and a factor for completers versus early dropouts, as well as the interactions of completion status by treatment and by time.
Tipping-point Analysis: The PSP score was analyzed using an iterative process of worsening the last observation carried forward (LOCF) values for the active treatment group (paliperidone monthly) only.



Friday, August 07, 2015

A SKEPTIC'S GUIDE TO HEALTH NEWS AND DIET FADS; A BOGUS STUDY TREATED AS REAL

In this weekend's NPR On the Media, "a skeptic's guide to health news and diet fads" was discussed. It mentioned the story of John Bohannon, who deliberately designed a bad, bogus study to test how easily its results would be published and cited in the news. See the blog article "I Fooled Millions Into Thinking Chocolate Helps Weight Loss. Here's How" and the news story "Study showing that chocolate can help with weight loss was a trick to show how easily shoddy science can make headlines". Just today, I read an article titled "Could Too Many Refined Carbs Make You Depressed?", which looks very much like another bogus study.

Here is the story from Wikipedia:
Publishing under the name Johannes Bohannon, he produced a deliberately bad study to see how the media would pick up its findings. He worked with film-maker Peter Onneken, who was making a film about junk science in the diet industry, where fad diets become headline news despite terrible study design and almost no evidence.

Bohannon designed a deliberately bad study with a small sample size, many variables that naturally fluctuate in participants, and a statistician told to deliberately "massage the data" using overfitting and p-hacking. The study's sample size was tiny, measuring 18 different measurements from only 15 participants, who were split into three groups. The purported finding of the study was that eating chocolate could assist weight loss. The GP running the study sums up his dislike of food pseudoscience as a "religion" that teaches “Bitter chocolate tastes bad, therefore it must be good for you.” Two thirds of the participants were female, and natural weight changes due to menstrual cycles were greater than the observed difference between chocolate and low-carb groups. The group who were assigned to the "control" were not asked what their diet contained.

He submitted the manuscript to 20 open access publishers well known for their predatory journals; the article ended up published in the International Archives of Medicine. He invented a fake "diet institute" that lacks even a website, and used the pen name, "Johannes Bohannon," a name that does not have any publications or appear on any website. Bohannon fabricated a press release which was picked up on the front cover of German tabloid Bild, as well as "the Daily Star, the Irish Examiner, Cosmopolitan’s German website, the Times of India, both the German and Indian site of the Huffington Post, and even television news in Texas and an Australian morning talk show."

The few journalists who contacted the scientist asked puff piece questions and no reporter published how many subjects were tested, or quoted independent researchers. Most outlets sought to maximise page views by including "vaguely pornographic images of women eating chocolate." He argues that diet fads are covered like gossip columnists "echoing whatever they find in press releases" rather than evaluating the accuracy of scientific papers.

Bohannon argues that because of the large number of factors in diet and lifestyle, large scale studies are frequently inconclusive, even when billions of dollars have been spent on well-designed studies by government agencies that label obesity an epidemic.

The original paper "chocolate with high cocoa content as a weight-loss accelerator" can be read at: http://www.scribd.com/doc/266969860/Chocolate-causes-weight-loss. The statistics section of the paper is below. Looks real, right?
A t-test for independent samples was used to assess differences in baseline variables between the groups. The analysis was a repeated-measures analysis of variance in which the baseline value was carried forward in the case of missing data. One subject (low-carbohydrate) had to be excluded from the analysis, because of a weight measure issue within the trial.

Unfortunately, in today's world, a lot of published studies are based on bad science. Bogus studies can be written and published as if they were real studies, and the news media will pick them up, disseminate them, and broadcast them as if they were great news.

The podcast from NPR is available below.






Saturday, August 01, 2015

Missing Data Mechanisms/Assumptions and the Corresponding Imputation Methods

Missing data is one of the classical issues in clinical trials and biostatistics. Since the National Research Council's report on missing data was issued in 2010, the paradigm has shifted toward the prevention of missing data. Yet even with this emphasis on prevention, missing data are still inevitable in pretty much any clinical trial. When analyzing a clinical trial with missing data, it is common to perform various sensitivity analyses to see how robust the study results are to the handling of the missing data, and the handling of missing data depends on the missingness assumptions.

Missing Data Assumptions and the Corresponding Imputation Methods 

No assumption
  • Typical methods: LOCF (last observation carried forward), BOCF (baseline value carried forward), WOCF (worst observation carried forward), and imputation based on logical rules.

MCAR - Missing Completely at Random (ignorable)
  • Definition: the missingness is independent of both unobserved and observed data; the probability of missingness is the same for all units.
  • Typical methods: CC (complete-case analysis, i.e., listwise deletion), pairwise deletion, available-case analysis, and single-value imputation (for example, mean replacement, regression prediction (conditional mean imputation), or regression prediction plus error (stochastic regression imputation)).
  • Note: under MCAR, throwing out cases with missing data does not bias the inferences, although there are many drawbacks.

MAR - Missing at Random (ignorability assumption)
  • Definition: conditional on the observed data, the missingness is independent of the unobserved measurements; the probability that a variable is missing depends only on available information.
  • Typical methods: maximum likelihood using the EM algorithm, i.e., FIML (full information maximum likelihood); MMRM (mixed model repeated measurement) with REML (restricted maximum likelihood); and multiple imputation, which carries two assumptions: the joint distribution of the data is multivariate normal and the missing data mechanism is ignorable.
  • Note: under MAR, it is acceptable to exclude the missing cases, as long as the model controls for all the variables that affect the probability of missingness.

MNAR - Missing Not at Random (non-ignorable)
  • Definition: neither MCAR nor MAR; the missingness depends on unobserved predictors or on the missing value itself. Missingness is no longer at random if it depends on information that has not been recorded and this information also predicts the missing values.
  • Typical methods: PMM (pattern-mixture modeling), including jump to reference, last mean carried forward, copy differences in reference, and copy reference; the tipping point approach; and selection models (Heckman).
Many web resources discuss missing data and the handling of missing data; some recent materials are listed below. For people who are using SAS, the procedures MI and MIANALYZE are handy for performing multiple imputation and pattern-mixture models.
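For example, in recent SAS/STAT releases (13.1 and later), the MNAR statement of PROC MI supports control-based pattern imputation directly. The sketch below is a minimal illustration only, assuming a monotone missing pattern and a hypothetical data set TRIAL with a treatment indicator TRT ('0' = placebo) and visits Y0-Y2; see the PROC MI documentation for the exact options. Here the missing Y2 values in both arms are imputed from a model fit to the placebo (reference) arm:

proc mi data=trial nimpute=50 seed=2015 out=outmi;
   class trt;
   monotone reg;
   /* impute missing Y2 using a model built on placebo observations
      only (control-based pattern imputation) */
   mnar model( y2 / modelobs=(trt='0'));
   var trt y0 y1 y2;
run;

The completed data sets can then be analyzed and combined with PROC MIANALYZE, as in any multiple imputation analysis.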


Tuesday, July 28, 2015

Clinical Research Toolkit by NIH and NCI

NIH (National Institutes of Health) has been the force behind landmark clinical trials and conducts trials in disease areas where pharmaceutical companies are either not interested or cannot afford to run the trials. NCI (National Cancer Institute) plays a prominent role in conducting clinical trials in various types of cancer.

These government agencies now also build websites to help with the conduct of clinical trials. Last year, the National Institutes of Health's (NIH) National Institute of Allergy and Infectious Diseases (NIAID) launched a new website meant to make complying with clinical trial regulations around the world substantially easier. The tool is known as ClinRegs (http://clinregs.niaid.nih.gov/index.php). As described by NIAID officials, it's an "online database of country-specific clinical research regulatory information designed to save time and effort in planning and implementing clinical research." With this tool, users can look up country-specific regulatory information for 12 of the most popular countries for clinical research, including the US, China, India, Brazil and South Africa. Additional countries will be added in the near future according to NIH priorities, the ClinRegs team told Regulatory Focus in a statement.

Various clinical research toolkits are available on NIH's websites. These toolkits provide policies, guidance, templates (protocol, ICF, data management, …), and other essential documents.

NCI also maintains the CTCAE (Common Terminology Criteria for Adverse Events), which has been the standard for reporting and assessing AE severity. See my previous post about “Dose Limiting Toxicity (DLT) and Common Toxicity Criteria (CTC) / Common Terminology Criteria for Adverse Events (CTCAE)”.

Saturday, July 18, 2015

Dose Linearity versus Dose Proportionality

In early-phase studies of drug development, dose linearity and dose proportionality are usually tested. It is essential to determine whether the disposition of a new drug is linear or nonlinear. Drugs which behave non-linearly are difficult to use in the clinic, especially if the therapeutic window is narrow. If non-linearity is observed over the usual therapeutic concentration range, more clinical studies/tests are needed for the drug development program, and development can even be stopped. The EMA guidances “Guideline on the Investigation of Bioequivalence” and “Guideline on the pharmacokinetic and clinical evaluation of modified release dosage forms” and the FDA guidance “Bioavailability and Bioequivalence Studies Submitted in NDAs or INDs — General Considerations” specifically require the testing of dose linearity or dose proportionality.
The concepts of dose linearity and dose proportionality are often confused because they are closely related. Dose proportionality can be seen as a special case, or subset, of dose linearity.

To test the dose linearity or dose proportionality, the clinical trials are often designed as:
  • Dose escalation study
  • Parallel group study with various dose groups
  • Cross-over design with various dose groups

In practice, people usually test only for dose proportionality. To do so, there are generally four approaches:

Analysis of Variance Approach

In this approach, the dose-normalized PK parameters (AUC or Cmax) are calculated and then analyzed by ANOVA. Dose normalization is simply the PK parameter divided by the dose. With AUC as an example, we can construct the hypothesis as follows:

          H0: AUC(dose1) / Dose 1 = AUC(dose2) / Dose 2 = AUC(dose3) / Dose 3

If the null hypothesis H0 is not rejected, there is no evidence against dose proportionality, and dose proportionality is then declared.
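A minimal SAS sketch of this approach (the data set PK and the variables DOSE and AUC are hypothetical; the dose-normalized values are often log-transformed before analysis):

data pknorm;
   set pk;
   auc_dn = auc / dose;   /* dose-normalized AUC */
run;

/* ANOVA: test whether mean dose-normalized AUC differs across doses */
proc glm data=pknorm;
   class dose;
   model auc_dn = dose;
run;

One caveat with this approach is that failing to reject H0 is not positive evidence of proportionality; it may simply reflect low power.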

Linear Regression Approach

In this approach, a linear regression with a quadratic polynomial term for dose is fit, with the PK parameter (AUC or Cmax) as the dependent variable and dose as the independent variable:

            Y=alpha + beta1*Dose + beta2*Dose^2 + error

where the hypotheses are whether beta2 and alpha equal zero. If either beta2 or alpha is significantly different from zero, dose proportionality is not declared. If beta2 is not significantly different from zero, the above linear regression simplifies to:

           Y=alpha + beta*Dose + error

If alpha is not significantly different from zero, then dose proportionality is declared.
If alpha is significantly different from zero, then dose proportionality cannot be declared, but dose linearity can be declared.
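A sketch of this in SAS (hypothetical data set PK; PROC GLM lets the quadratic term be written directly in the MODEL statement):

/* Full model: Y = alpha + beta1*Dose + beta2*Dose^2 + error */
proc glm data=pk;
   model auc = dose dose*dose;
run;

/* If beta2 is not significant, refit the reduced model
   Y = alpha + beta*Dose + error and examine the test for alpha */
proc reg data=pk;
   model auc = dose;
run;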

Power Model Approach

In this approach, the relationship between the PK parameter (AUC or Cmax) and dose is described by the following power model:

          Y=exp(alpha) * Dose^beta * exp(error)

This model can be re-written as:

           ln(Y) = alpha + beta*ln(dose) + error

The slope, beta, measures the proportionality between dose and the PK parameter. If beta = 0, the response is independent of dose. If beta = 1, dose proportionality can be declared. The power model approach essentially tests whether or not beta = 1.
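In SAS, the power model can be fit by regressing ln(AUC) on ln(dose) (names hypothetical; CLB requests confidence limits for the slope, and the TEST statement gives a direct test of beta = 1):

data pklog;
   set pk;
   lnauc  = log(auc);
   lndose = log(dose);
run;

/* ln(Y) = alpha + beta*ln(dose) + error */
proc reg data=pklog;
   model lnauc = lndose / clb alpha=0.10;   /* 90% CI for beta */
   test lndose = 1;                         /* H0: beta = 1 */
run;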

Equivalence (Interval) Approach for the Power Model

Based on the power model, Brian Smith et al proposed a bioequivalence-type approach in their paper “Confidence interval criteria for assessment of dose proportionality”. This approach is concisely described in the paper by Zhou et al.
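In brief, the criterion works as follows: with r the ratio of the highest to the lowest dose and (0.80, 1.25) the chosen equivalence limits, dose proportionality is declared if the 90% confidence interval for beta from the power model lies entirely within [1 + ln(0.80)/ln(r), 1 + ln(1.25)/ln(r)]. A small SAS step to compute these bounds (r = 4 is a hypothetical example):

/* Equivalence bounds for beta under the Smith et al criterion */
data bounds;
   r = 4;                            /* highest dose / lowest dose */
   theta_l = 0.80;  theta_h = 1.25;  /* equivalence limits */
   lower = 1 + log(theta_l) / log(r);   /* ~0.84 */
   upper = 1 + log(theta_h) / log(r);   /* ~1.16 */
   put lower= upper=;
run;

Note that the bounds tighten toward 1 as the dose range widens, so a wider dose range provides a sterner test of proportionality.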

 




EXAMPLE:

In a paper by Campos et al, dose proportionality was evaluated comparing the 120 mg/kg dose with the 60 mg/kg dose. They first normalized AUC and Cmax to the 60 mg/kg dose. The dose-normalized values were then used for an ANOVA analysis (a mixed-model approach, as described in the previous topic, since the study design was a crossover). They concluded dose proportionality because the 90% confidence intervals of the geometric least-squares mean ratios (0.83-0.88 for AUC and 0.85-0.92 for Cmax) fell within the 80-125% equivalence limits.


Friday, July 03, 2015

Protocol Deviation versus Protocol Violation and its Classifications (minor, major, critical, important)

Every clinical trial will have a study protocol, and the investigator is required to follow the protocol in conducting the study. However, during the trial, there will always be planned or unplanned deviations from the protocol. ICH GCP requires that these protocol deviations be documented: ICH E6 (section 4.5.3) states that “the investigator, or person designated by the investigator, should document and explain any deviation from the approved protocol.” At the end of the study, the statistical analysis will include a listing of all protocol deviations and a summary table of protocol deviations by category.

Across various regulatory guidelines, both terms ‘protocol deviations’ and ‘protocol violations’ are used. What is the difference between a protocol deviation and a protocol violation?

For a while, there seemed to be a view that a protocol deviation was a less serious non-compliance and a protocol violation a more serious non-compliance with the protocol. However, recent documents from the regulatory bodies suggest that the two terms mean the same thing and can be used interchangeably. In practice, it will not be wrong to stick to the term ‘protocol deviation’ and avoid the term ‘protocol violation’.

FDA’s “Compliance Program Guidance Manual For FDA Staff - Compliance Program 7348.811 Bioresearch Monitoring: Clinical Investigators” (2008) provided a definition of ‘protocol deviation’; however, it lumped the term together as ‘protocol deviation/violation’ and did not draw a clear distinction between a protocol deviation and a protocol violation.


“Protocol deviations. A protocol deviation/violation is generally an unplanned excursion from the protocol that is not implemented or intended as a systematic change. A protocol deviation could be a limited prospective exception to the protocol (e.g. agreement between sponsor and investigator to enroll a single subject who does not meet all inclusion/exclusion criteria). Like protocol amendments, deviations initiated by the clinical investigator must be reviewed and approved by the IRB and the sponsor prior to implementation, unless the change is necessary to eliminate apparent immediate hazards to the human subjects (21 CFR 312.66), or to protect the life or physical wellbeing of the subject (21 CFR 812.35(a)(2)), and generally communicated to FDA. “Protocol deviation” is also used to refer to any other, unplanned, instance(s) of protocol noncompliance. For example, situations in which the investigator failed to perform tests or examinations as required by the protocol or failures on the part of study subjects to complete scheduled visits as required by the protocol, would be considered protocol deviations.”

In the ICH E3 Guideline: Structure and Content of Clinical Study Reports Questions & Answers (2012), both ‘protocol deviation’ and ‘protocol violation’ were used. The document suggests that a protocol violation is equivalent to an important protocol deviation; in other words, protocol violations are a subset of all protocol deviations.

A protocol deviation is any change, divergence, or departure from the study design or procedures defined in the protocol. Important protocol deviations are a subset of protocol deviations that may significantly impact the completeness, accuracy, and/or reliability of the study data or that may significantly affect a subject's rights, safety, or well-being. For example, important protocol deviations may include enrolling subjects in violation of key eligibility criteria designed to ensure a specific subject population or failing to collect data necessary to interpret primary endpoints, as this may compromise the scientific value of the trial. Protocol violation and important protocol deviation are sometimes used interchangeably to refer to a significant departure from protocol requirements. The word “violation” may also have other meanings in a regulatory context. However, in Annex IVa, Subject Disposition of the ICH E3 Guideline, the term protocol violation was intended to mean only a change, divergence, or departure from the study requirements, whether by the subject or investigator, that resulted in a subject’s withdrawal from study participation. (Whether such subjects should be included in the study analysis is a separate question.) To avoid confusion over terminology, sponsors are encouraged to replace the phrase “protocol violation” in Annex IVa with “protocol deviation”, as shown in the example flowchart below. Sponsors may also choose to use another descriptor, provided that the information presented is generally consistent with the definition of protocol violation provided above. The E3 Guideline provides examples of the types of deviations that are generally considered important protocol deviations and that should be described in Section 10.2 and included in the listing in Appendix 16.2.2. The definition of important protocol deviations for a particular trial is determined in part by study design, the critical procedures, study data, subject protections described in the protocol, and the planned analyses of study data. In keeping with the flexibility of the Guideline, sponsors may amend or add to the examples of important deviations provided in E3 in consideration of a trial’s requirements. Substantial additions or changes should be clearly described for the reviewer.

When protocol deviations are documented, they are also classified into categories according to their severity and their effect on the subject's rights, safety, or welfare, or on the integrity of the resultant data.
ICH E3 “STRUCTURE AND CONTENT OF CLINICAL STUDY REPORTS” requires the important protocol deviations to be described. It does not use the categories of critical, major, or minor; however, the descriptions in Section 10.2 suggest that the important protocol deviations are those in the major or critical categories. Section 10.2 states:

All important deviations related to study inclusion or exclusion criteria, conduct of the trial, patient management or patient assessment should be described. In the body of the text, protocol deviations should be appropriately summarised by centre and grouped into different categories, such as: 

− those who entered the study even though they did not satisfy the entry criteria;

− those who developed withdrawal criteria during the study but were not withdrawn;

− those who received the wrong treatment or incorrect dose;

− those who received an excluded concomitant treatment. 

In appendix 16.2.2, individual patients with these protocol deviations should be listed, broken down by centre for multicentre studies.


In the US, while there is no formal guidance, protocol deviations are usually classified into major or minor categories. For example, in an FDA presentation about “Avoiding Common Mistakes in Clinical Research”, the protocol deviation spectrum contains minor deviations (a missed lab test, a missed visit) and major deviations (ineligible subject enrolled, safety or efficacy assessments not done, SAE not reported to the IRB, etc.).

In the EU, per the EMA guidance “Classification and analysis of the GCP inspection findings of GCP inspections conducted at the request of the CHMP”, protocol deviations are classified into critical, major, and minor categories.

 
Critical:
- Conditions, practices or processes that adversely affect the rights, safety or well-being of the subjects and/or the quality and integrity of data.
- Critical observations are considered totally unacceptable.
- Possible consequences: rejection of data and/or legal action required.
- Remarks: observations classified as critical may include a pattern of deviations classified as major, bad quality of the data and/or absence of source documents. Manipulation and intentional misrepresentation of data belong to this group.

Major:
- Conditions, practices or processes that might adversely affect the rights, safety or well-being of the subjects and/or the quality and integrity of data.
- Major observations are serious findings and are direct violations of GCP principles.
- Possible consequences: data may be rejected and/or legal action required.
- Remarks: observations classified as major may include a pattern of deviations and/or numerous minor observations.

Minor:
- Conditions, practices or processes that would not be expected to adversely affect the rights, safety or well-being of the subjects and/or the quality and integrity of data.
- Possible consequences: observations classified as minor indicate the need for improvement of conditions, practices and processes.
- Remarks: many minor observations might indicate bad quality, and their sum might be equal to a major finding with its consequences.
 

In practice, the critical and major protocol deviations may be grouped together; at least this is how it is done in one of our NIH studies. See the Protocol Deviations CRF Module Instructions.

Protocol Deviation Discussion at Firstclinical.com:

Sunday, June 14, 2015

Sample Size and Power Calculation Using SAS Proc Power and twosamplesurvival Statement

For sample size and power calculations, several commercial software packages can be used; the commonly used ones are EAST, PASS, and nQuery Advisor. SAS has a procedure (PROC POWER) that can be used for sample size and power calculations for many types of study designs and endpoints. One of its statements, TWOSAMPLESURVIVAL, is for comparing two survival curves and calculating the sample size or power for a time-to-event variable.

The syntax and description of the TWOSAMPLESURVIVAL statement in PROC POWER can be found on the SAS website. It can be used to calculate:

  • the total number of events needed (EVENTSTOTAL = . option)
  • the total number of subjects needed (NTOTAL = . option)
  • the number of subjects needed per treatment group (NPERGROUP = . option)
  • the statistical power (POWER = . option)

Note that only one option can be designated as the result (set to missing). If we want both the total number of events and the total number of subjects, we need to run the program twice: once solving for the total number of events and once solving for the total number of subjects.

Here are some of the example applications of using twosamplesurvival statement.

EXAMPLE #1:

In a SUGI paper "Proc Power in SAS 9.1" by Bauer, Lavery, and Ford, an example was provided to calculate the sample size for log-rank test with 2:1 randomization ratio and with drop out.

The example assumes that 30% of placebo patients are sustained responders (exponential hazard = 0.3567), compared with 45% or 50% for the treatment group (exponential hazard = 0.5978 or 0.6931). Twice as many patients are on treatment as on placebo, and all patients are enrolled at the beginning of the study, with a 30% drop-out rate.

Prior to the sample size calculation, the event rates were converted to hazards: the exponential hazard in the placebo group = -ln(1 - event rate) = -ln(1 - 0.3) = 0.3567. Similarly, the exponential hazards corresponding to the 45% and 50% event rates are 0.5978 and 0.6931.

The drop-out rate was also converted to a group loss hazard in the same way: the 30% drop-out rate corresponds to a group loss hazard of -ln(1 - dropout rate) = -ln(1 - 0.3) = 0.3567.

The GROUPWEIGHTS option was used to indicate the 2:1 randomization ratio.

proc power;
   twosamplesurvival test=logrank
      gexphs = 0.3567 | 0.5978 0.6931
      grouplossexphazards = (0.3567 0.3567)
      accrualtime = 1
      followuptime = 1
      groupweights = (1 2)
      power = .
      ntotal = 225;
run;


EXAMPLE #2:

Dr. Hudgens from UNC has a nice posting about power and sample size calculations for the log-rank test. He gave the following example:


Clinical trial to assess new treatment for patients with chronic active hepatitis. Under standard treatment, 41% of patients survive beyond 5 years. Expect new treatment to increase survival beyond 5 years to 60%.


In order to calculate the sample size, we first need to derive some parameters:
Event rate for standard treatment: Ec = 1 - 0.41 = 0.59
Event rate for new treatment: Et = 1 - 0.60 = 0.40
Since the event rate is E = 1 - exp(-t*HAZARD), we have HAZARD = -ln(1 - E)/t.
The hazard for standard treatment is HAZARDc = -ln(1 - Ec)/t = -ln(1 - 0.59)/t = -ln(0.41)/t
The hazard for new treatment is HAZARDt = -ln(1 - Et)/t = -ln(1 - 0.40)/t = -ln(0.60)/t
The hazard ratio = HAZARDt/HAZARDc = ln(0.60)/ln(0.41) = 0.5729
With t = 5, the hazard for standard treatment is HAZARDc = -ln(0.41)/5 = 0.178

After these calculations, the following SAS code can be used to calculate the sample size:

proc power;
    twosamplesurvival test=logrank
    hazardratio = 0.57
    refsurvexphazard=0.178
    followuptime = 5
    totalTIME = 5
    power = 0.90
    ntotal = . ;
run;

EXAMPLE #3: Sample Size Calculation with piecewise linear survival curve

SAS has a GUI desktop application, PSS (the Power and Sample Size Application), that provides easy access to power analysis and sample size determination techniques. Anything implemented in the PSS desktop application can also be done using PROC POWER. Here is an example from the PSS desktop application; the calculation can be reproduced using the PROC POWER TWOSAMPLESURVIVAL statement.


Suppose you want to compare survival rates for an existing cancer treatment and a new treatment. You intend to use a log-rank test to compare the overall survival curves for the two treatments. You want to determine a sample size to achieve a power of 0.8 for a two-sided test using a balanced design, with a significance level of 0.05.

The survival curve of patients for the existing treatment is known to be approximately exponential with a median survival time of five years. You think that the proposed treatment will yield a survival curve described by the times and probabilities listed in Table 69.9. Patients are to be accrued uniformly over two years and followed for three years.

Table 69.9 Survival Probabilities for Proposed Treatment
Time (years)    Probability
1               0.95
2               0.90
3               0.75
4               0.70
5               0.60

The description of using the PSS desktop application for this example can be found on the SAS website. The following program does exactly the same:

proc power;
   twosamplesurvival test=logrank
      curve("Existing Treatment") = 5 : 0.5
      curve("Proposed Treatment") = 1 : 0.95  2 : 0.90  3 : 0.75  4 : 0.70  5 : 0.60
      groupsurvival = "Existing Treatment" | "Proposed Treatment"
      accrualtime = 2
      followuptime = 3
      power = 0.80
      alpha = 0.05
      npergroup = . ;
run;


EXAMPLE #4:
The TWOSAMPLESURVIVAL model, specified in the SAMPLESIZE statement of PROC SEQDESIGN, can be used to estimate the sample size for a group sequential design with interim analyses.

EXAMPLE #5:
In the following SAS program to calculate the sample size, the survival probabilities at 12 months for the standard and proposed groups are specified, and the GROUPLOSSEXPHAZARDS option is used to account for the drop-out rate.

proc power;
   twosamplesurvival test=logrank
      curve("Standard") = 12 : 0.8781
      curve("Proposed") = 12 : 0.9012
      groupsurvival = "Standard" | "Proposed"
      accrualtime = 18
      totaltime = 24
      grouplossexphazards = (0.0012 0.0012)
      nsubinterval = 1
      power = 0.85
      ntotal = . ;
run;