Tuesday, September 01, 2020

Finkelstein-Schoenfeld Method, Win Ratio, and Hodges-Lehmann Estimates - Statistical Methods Based on All Paired Comparisons

The Finkelstein-Schoenfeld method can be used to analyze a composite endpoint whose components have different levels of importance. The Hodges-Lehmann estimate is used to estimate the magnitude of the treatment difference alongside a non-parametric test such as the Wilcoxon rank-sum test. What do these two methods have in common? Well, both methods are based on pairwise comparisons - the value/outcome from each subject in treatment group A is compared to that of every subject in treatment group B - in other words, both methods are based on n (# of subjects in treatment group A) times m (# of subjects in treatment group B) comparisons.

In clinical trials for conditions that are serious but produce relatively few deaths, a composite endpoint is often used as the primary efficacy endpoint. The composite endpoint usually consists of several categories (or components) with different degrees of importance, because no single category would yield enough events for a feasible clinical trial. Examples of composite endpoints are:
  • a composite endpoint in heart failure may include death, hospitalization, and clinical status;
  • a composite endpoint in pulmonary arterial hypertension may include death, hospitalization, and disease progression;
  • a composite endpoint in a cardiovascular outcome study may be major adverse cardiovascular events (MACE), consisting of death, myocardial infarction (MI), stroke, and hospitalization.
Usually, these components are not weighted: they are treated as equally important, and the statistical analysis is based on the time to first event (no matter whether the first event is death, hospitalization, or something else). This lack of weighting is the main point of criticism.

The Finkelstein-Schoenfeld method is a non-parametric method that brings this prioritization into the analysis of composite endpoints. It is named after the authors' 1999 paper in Statistics in Medicine, "Combining Mortality and Longitudinal Measures in Clinical Trials". The method is a generalization of the Gehan-Wilcoxon test, based on pairwise comparison of patients on a primary outcome when possible and otherwise on a secondary outcome. It was originally proposed for "analyzing the impact of treatment which combines a (possibly censored) event with a longitudinal measure of clinical effect", not explicitly for analyzing composite endpoints.
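
To make the pairwise construction concrete, the unstratified form of the test statistic can be written roughly as below. The notation is my own shorthand rather than a quotation from the paper: N = n + m is the pooled number of patients, D_i = 1 if patient i is in treatment group A and 0 otherwise, and u_ij is the score from comparing patient i with patient j.

```latex
u_{ij} =
\begin{cases}
+1 & \text{if patient } i \text{ fared better than patient } j \\
-1 & \text{if patient } i \text{ fared worse than patient } j \\
\phantom{+}0 & \text{if the pair is tied or cannot be decided}
\end{cases}
\qquad
U_i = \sum_{j \ne i} u_{ij}

T = \sum_{i=1}^{N} D_i \, U_i ,
\qquad
\operatorname{Var}(T) = \frac{n\,m}{N\,(N-1)} \sum_{i=1}^{N} U_i^{2}
```

Because u_ij = -u_ji, the comparisons between two patients in group A cancel inside T, so T reduces to the sum of u_ij over the n x m between-group pairs (wins minus losses for group A). The permutation variance, in contrast, uses every U_i, and each U_i is built from comparisons against all other patients in the pooled sample. Under the null hypothesis, T divided by the square root of Var(T) is referred to a standard normal distribution. In a stratified analysis the same quantities would be computed within each stratum and summed.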

Based on the Finkelstein-Schoenfeld method, Pocock and colleagues suggested an estimate, the win ratio: the number of pairs in which the patient on the experimental arm fared better ("wins") divided by the number of pairs in which that patient fared worse ("losses"). The win ratio method was proposed explicitly for analyzing composite endpoints (Pocock et al 2012, "The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities").

With the Finkelstein-Schoenfeld or win ratio method, pairwise comparisons are performed and each pair is scored according to the hierarchy of outcome importance. For example, in a study with a composite endpoint of death and hospitalization, each pair of patients is compared first on time to death and then, only if death cannot separate the two patients, on time to hospitalization.
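
The hierarchical comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from any particular protocol: the `Patient` class and its field names are hypothetical, events are compared only within the follow-up time shared by both patients, and the win ratio is computed from all unmatched n x m pairs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout, used for illustration only.
    follow_up: float               # length of observation
    death: Optional[float] = None  # time of death, None if alive at end of follow-up
    hosp: Optional[float] = None   # time of first hospitalization, None if none


def compare_event(t_a, t_b, horizon):
    """Return +1 if A fared better on this event, -1 if worse, 0 if undecided."""
    # Assumption: only events inside the shared follow-up window are comparable.
    a = t_a if t_a is not None and t_a <= horizon else None
    b = t_b if t_b is not None and t_b <= horizon else None
    if a is None and b is None:
        return 0      # neither had the event: undecided at this level
    if a is None:
        return 1      # only B had the event
    if b is None:
        return -1     # only A had the event
    return (a > b) - (a < b)  # later event is better; equal times tie


def compare(a, b):
    """Score one pair: death first, hospitalization only if death is undecided."""
    horizon = min(a.follow_up, b.follow_up)
    for component in ("death", "hosp"):
        score = compare_event(getattr(a, component), getattr(b, component), horizon)
        if score != 0:
            return score
    return 0


def win_ratio(treated, control):
    """Count wins and losses for the treated arm over all n x m pairs."""
    scores = [compare(t, c) for t in treated for c in control]
    wins, losses = scores.count(1), scores.count(-1)
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio


# Made-up data: 3 treated and 3 control patients, so 9 pairs in total.
treated = [Patient(10), Patient(10, hosp=6), Patient(8, death=8)]
control = [Patient(10, hosp=3), Patient(5, death=5), Patient(10)]
wins, losses, ratio = win_ratio(treated, control)
print(wins, losses, round(ratio, 3))
```

With these nine pairs there are 5 wins, 3 losses and 1 tie for the treated arm, so the win ratio is 5/3. The difference wins minus losses is the between-group sum that forms the numerator of the Finkelstein-Schoenfeld statistic.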

Below are some additional references discussing the Finkelstein-Schoenfeld or Win-Ratio method and their applications.  
There are several pivotal studies in which the Finkelstein-Schoenfeld method was used to analyze the primary efficacy endpoint. The study protocols and statistical analysis plans posted online contain detailed descriptions of how the method was applied.
The protocol / statistical analysis plan for the PARTNER trial describes the Finkelstein-Schoenfeld method (the excerpt was shown as a screenshot, which is not reproduced here).

The ATTR-ACT trial was the basis for FDA approval of Vyndaqel and Vyndamax, and the Finkelstein-Schoenfeld method is mentioned in the product label:
"The primary analysis used a hierarchical combination applying the method of Finkelstein-Schoenfeld (F-S) to all-cause mortality and frequency of cardiovascular-related hospitalizations, which was defined as the number of times a subject was hospitalized (i.e., admitted to a hospital) for cardiovascular-related morbidity. The method compared each patient to every other patient within each stratum in a pair-wise manner that proceeded in a hierarchical fashion using all-cause mortality followed by frequency of cardiovascular-related hospitalizations when patients could not be differentiated based on mortality."
The Hodges-Lehmann estimate is used in a totally different situation but, like the Finkelstein-Schoenfeld method, it relies on pairwise comparisons. While the Finkelstein-Schoenfeld method is primarily used in the analysis of composite endpoints, the Hodges-Lehmann estimate is mainly used to estimate the treatment difference for a continuous variable when the normality assumption is violated and a non-parametric method is used.
 
With the Hodges-Lehmann method, the treatment difference is calculated for each of the n x m pairs formed by taking one subject from each treatment group (where n and m are the # of subjects in each treatment group). The Hodges-Lehmann estimate is the median of these n x m differences.
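
As a minimal sketch of that calculation (point estimate only, with made-up numbers and a hypothetical function name; the confidence interval is not covered here):

```python
from statistics import median


def hodges_lehmann(group_a, group_b):
    """Median of all pairwise differences a - b (two-sample location shift)."""
    diffs = [a - b for a in group_a for b in group_b]  # n x m differences
    return median(diffs)


# Made-up values: n = 3 subjects in group A, m = 2 subjects in group B.
group_a = [12, 15, 20]
group_b = [10, 14]
estimate = hodges_lehmann(group_a, group_b)
print(estimate)  # the six differences are -2, 1, 2, 5, 6, 10, so the median is 3.5
```

Note that this is generally not the same as the difference between the two group medians (15 - 12 = 3 in this example).
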
The Hodges-Lehmann estimate has been used in many clinical trials that resulted in FDA approval of products. For example, it was used in the ZONDA trial of benralizumab in asthma. The FDA statistical review document describes the primary analysis method of the study:
"The primary analysis for the OCS percent reduction endpoint used the Wilcoxon rank-sum test approach. The primary analyses were performed in the FAS population. For each of the two Benralizumab dose regimen groups, the median difference in the OCS percent reduction between Benralizumab dose regimen and placebo was derived using asymptotic Hodges-Lehmann estimation, together with associated 95% CI and p-value. The same analyses were also performed for the EHS without multiplicity control." 

6 comments:

  1. Very helpful post, thanks! Do you have any idea about sample size and power calculation using the Finkelstein-Schoenfeld method?

  2. You can take a look at the sample size section of the PARTNER trial protocol. It says the following about the sample size calculation and discusses the details of sample size estimation by simulation:

    "It remains to consider the power of the co-primary endpoint that uses the Finkelstein-Schoenfeld methodology. The first patient comparison in this test is survival; the increase in power over the survival test comes from the additional comparisons based on recurrent hospitalization. Based on the assumptions outlined below, we estimate that the power for the Finkelstein-Schoenfeld test will be at least 95%. Because the test is not considered in standard sample size software, these values are obtained by simulation."

  3. Really, an excellent summary, Dr. Deng.

    Thank you for providing this to the community.

  4. Anonymous, 12:34 AM

    Dr. Deng, thank you for your summary! However, I am wondering whether the F-S test should be based on (n+m)*(n+m) pairwise comparisons or on n*m comparisons, given that you wrote: "Well, both methods are based on the pairwise comparisons - the value/outcome from each subject in treatment group A is compared to each of all subjects in treatment group B - in other words, both methods are based on n (# of subjects in treatment group A) time m (# of subjects in treatment group B) comparisons." From the screenshot you shared, it looks like all the comparisons were based on all subjects versus all subjects, which is also what many other papers show. My question matters because the answer influences the variance estimate, i.e., the denominator of the F-S statistic.

  5. We are comparing each subject in group A with each subject in group B; we are not going to compare subjects within the same group. The number of pairs should therefore be n*m, not (n+m)*(n+m).

  6. Anonymous, 7:52 AM

    In fact I have the same comments as the other "Anonymous" commenter.

    My reading of several articles, and of some R/SAS programs, indicates confusion between n*m and (n+m)*(n+m). My current understanding (which could be incorrect) is that n*m pairs are used for the actual win ratio, and (n+m)*(n+m) pairs are used for the F-S statistic and p-value. However, confirmation from anyone who knows for sure would be greatly appreciated.
