Many journal articles use the Hodges-Lehmann estimator to estimate the difference between two medians.
In a study by Perkins et al., "A Randomized Trial of Epinephrine in Out-of-Hospital Cardiac Arrest":
"The Hodges–Lehmann method was used to estimate median differences with 95% confidence intervals for length-of-stay outcomes"
In a study by Devinsky et al., "Trial of Cannabidiol for Drug-Resistant Seizures in the Dravet Syndrome":
"Analysis of the primary end point was performed with the use of a Wilcoxon rank-sum test. An estimate of the median difference between cannabidiol and placebo, together with the 95% confidence interval, was calculated with the use of the Hodges–Lehmann approach. Sensitivity analyses of this primary end point were prespecified in the trial protocol and statistical analysis plan"
Similarly, the Hodges-Lehmann estimator has been used to estimate the treatment effect in licensure trials:
FDA Clinical/Statistical Review for Vascepa (icosapent ethyl) for reduction of triglycerides in patients with very high triglycerides
The median differences between the treatment groups and 95% CIs were estimated with the Hodges-Lehmann method. P-value is from the Wilcoxon rank-sum test.
FDA Statistical review for RLY5016 for Oral Suspension (Veltassa) for Hyperkalemia
To compare Veltassa with placebo, the difference between the mean ranks was tested using a two-sided t-test. The difference and 95% CI between the treatment groups in median change from baseline was estimated using a Hodges-Lehmann estimator.
FDA Medical Review of Oral Treprostinil for Pulmonary Arterial Hypertension
The magnitude of the treatment effects was defined by the Hodges-Lehmann method to estimate the median difference between treatment groups for the change from baseline in 6MWD.
It sounds like we have found a way to estimate the difference in medians when the data are not normally distributed. However, if we look at how the Hodges-Lehmann estimator is calculated, we see that it is not accurate to say that it estimates the difference in medians. It is actually an estimator of the location shift (the term originally used by the authors), or equivalently of the median of the differences (explained further below).
- Hodges-Lehmann (1963) Estimates of Location Based on Rank Tests
- SAS Hodges-Lehmann Estimation of Location Shift
- R Tools "Hodges Lehmann: Hodges-Lehmann Estimator of Location"
- How to interpret the estimate from Wilcoxon signed rank paired test?
- Lingling Han (2008) Calculating the point estimate and confidence interval of Hodges-Lehmann's median using SAS® software
- Elise Coudin and Jean-Marie Dufour, McGill University (2009): Hodges-Lehmann sign-based estimators and generalized confidence distributions in linear median regressions with heterogenous dependent errors
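In symbols, the two quantities being contrasted can be written as follows. The notation is assumed here purely for illustration and is not taken from the references above: $A_1,\dots,A_m$ are the observations in Group A and $B_1,\dots,B_n$ are the observations in Group B.

```latex
\[
\hat{\Delta}_{\mathrm{HL}}
  = \operatorname{median}\{\, A_i - B_j : 1 \le i \le m,\ 1 \le j \le n \,\}
\qquad\text{versus}\qquad
\operatorname{median}(A_1,\dots,A_m) - \operatorname{median}(B_1,\dots,B_n).
\]
```

The first expression takes the median of all $m \times n$ pairwise differences. The second takes the median of each group first and then subtracts. The two need not give the same number.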
Let's check how medians are calculated using a very simple example:
Median and the difference in Medians:
| | Group A | Group B |
|---|---|---|
| Original Measures | 4, 7, 5, 3, 6 | 3, 2, 5, 1, 4 |
| Rank the original measures in order | 3, 4, 5, 6, 7 | 1, 2, 3, 4, 5 |
| Median | 5 | 3 |
| The difference in Medians (A-B) | 2 | |
Hodges-Lehmann estimator of the location shift:
| | Group A | Group B |
|---|---|---|
| Original Measures | 4, 7, 5, 3, 6 | 3, 2, 5, 1, 4 |
| Rank the original measures in order | 3, 4, 5, 6, 7 | 1, 2, 3, 4, 5 |
| Each number in Group A is compared to each number in Group B (A minus B) | 3 compared to Group B: 2, 1, 0, -1, -2; 4 compared to Group B: 3, 2, 1, 0, -1; 5 compared to Group B: 4, 3, 2, 1, 0; 6 compared to Group B: 5, 4, 3, 2, 1; 7 compared to Group B: 6, 5, 4, 3, 2 | |
| Rank the differences from these pairwise comparisons in order | -2, -1, -1, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6 | |
| Hodges-Lehmann estimator of location shift | Median of all these differences; in this case, the Hodges-Lehmann estimator is 2 | |
The group medians and the Hodges-Lehmann estimate can be calculated with the following SAS code:
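/* First example: five observations per group */
data HodgesLehmann;
  /* The trailing @@ reads several group/value pairs from one input line */
  input group $ number @@;
  datalines;
A 3 A 4 A 5 A 6 A 7
B 1 B 2 B 3 B 4 B 5
;
/* Median of each group */
proc means data=hodgeslehmann median maxdec=0;
  class group;
  var number;
run;
/* The HL option requests the Hodges-Lehmann estimate of the location shift */
proc npar1way data=hodgeslehmann hl;
  class group;
  var number;
run;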
The Hodges-Lehmann estimate of the location shift is confirmed to be 2. In this example, it is exactly the same as the difference between the two medians (5 - 3 = 2).
However, in many situations the Hodges-Lehmann estimate of the location shift differs from the difference between the two medians. The Hodges-Lehmann estimator should really be called the median of the differences between the two groups, or the location shift (the term the original authors used).
The example below shows that the Hodges-Lehmann estimate of the location shift can be very different from the difference between the two medians.
| | Group A | Group B |
|---|---|---|
| Original Measures | 50.6, 39.2, 35.2, 17.0, 11.2, 14.2, 24.2, 37.4, 35.2 | 38.0, 18.6, 23.2, 19.0, 6.6, 16.4, 14.4, 37.6, 24.4 |
| Rank the original measures in order | 11.2, 14.2, 17.0, 24.2, 35.2, 35.2, 37.4, 39.2, 50.6 | 6.6, 14.4, 16.4, 18.6, 19.0, 23.2, 24.4, 37.6, 38.0 |
| Median | 35.2 | 19.0 |
| The difference in Medians (A-B) | 16.2 | |
/* Second example: nine observations per group */
data HodgesLehmann2;
  input Group $ number @@;
datalines;
A 50.6
A 39.2
A 35.2
A 17.0
A 11.2
A 14.2
A 24.2
A 37.4
A 35.2
B 38.0
B 18.6
B 23.2
B 19.0
B 6.6
B 16.4
B 14.4
B 37.6
B 24.4
;
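/* Median of each group */
proc means data=hodgeslehmann2 median maxdec=1;
  class group;
  var number;
run;
/* The HL option requests the Hodges-Lehmann estimate of the location shift */
proc npar1way data=hodgeslehmann2 hl;
  class group;
  var number;
run;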
As illustrated above, the Hodges-Lehmann estimate of the location shift is 7.8; however, the difference between the two medians is 35.2 - 19.0 = 16.2 (the median for Group A is 35.2 and the median for Group B is 19.0).
While the Hodges-Lehmann estimator is often used to measure the treatment difference when the data are not normally distributed, we need to understand how it is calculated and recognize that it can be very different from the simple difference between two medians.
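The 7.8 can also be cross-checked by brute force, by building every pairwise difference explicitly and taking its median. The following is a minimal sketch of one way to do that, not necessarily how PROC NPAR1WAY computes it internally. It assumes the HodgesLehmann2 data set created above; the names pairwise_diffs and diff are made up for illustration, and the differences are taken as Group A minus Group B to match the tables.

```sas
/* Cross join: pair every Group A value with every Group B value */
/* and keep the A-minus-B difference (9 x 9 = 81 rows).          */
proc sql;
  create table pairwise_diffs as
  select a.number - b.number as diff
  from hodgeslehmann2(where=(group='A')) as a,
       hodgeslehmann2(where=(group='B')) as b;
quit;

/* The median of the pairwise differences is the Hodges-Lehmann estimate */
proc means data=pairwise_diffs n median maxdec=1;
  var diff;
run;
```

For these data there are 81 differences, and the middle (41st smallest) one is 7.8. Pointing the same code at the first data set, HodgesLehmann, gives 25 differences with a median of 2. If the differences are taken as B minus A instead, the estimate has the same size with the opposite sign.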