Thursday, July 18, 2019

Retire Statistical Significance and p-value?


The March 2019 issue of The American Statistician was a special issue with 43 papers on “Statistical Inference in the 21st Century: A World Beyond p < 0.05”. The discussion about the use of p-values was picked up by the scientific community and triggered a lot of debate. Some of the articles were provocatively titled, for example “Retire Statistical Significance” and “Abandon Statistical Significance”. The American Statistical Association’s president, Karen Kafadar, has also discussed the issue in her ‘President’s Corner’ column.

For a long time, statisticians have cautioned against the misuse of p-values (a small simulation sketch after the list below illustrates a couple of these points):
  • Don’t become the slave of the p-values
  • Don’t base your conclusions solely on whether an association or effect was found to be “statistically significant” (i.e., the p-value passed some arbitrary threshold such as p < 0.05).
  • Don’t believe that an association or effect exists just because it was statistically significant.
  • Don’t believe that an association or effect is absent just because it was not statistically significant.
  • Don’t believe that your p-value gives the probability that chance alone produced the observed association or effect or the probability that your test hypothesis is true.
  • Don’t conclude anything about scientific or practical importance based on statistical significance (or lack thereof).
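
To make the last two points concrete, below is a minimal simulation sketch (my own illustration, not taken from the special issue); the effect size, sample sizes, and random seed are arbitrary assumptions.

# Sketch: the same true effect can be "significant" or "non-significant"
# depending only on sample size, so a p-value by itself says nothing about
# the size or practical importance of an effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2019)
true_effect = 0.2                      # same modest treatment effect in both trials
for n in (20, 2000):                   # a small trial and a large trial
    treated = rng.normal(loc=true_effect, scale=1.0, size=n)
    control = rng.normal(loc=0.0, scale=1.0, size=n)
    t_stat, p_value = stats.ttest_ind(treated, control)
    diff = treated.mean() - control.mean()
    print(f"n per arm = {n:5d}: estimated effect = {diff:+.2f}, p-value = {p_value:.4f}")

In most runs the estimated effect is similar in the two trials, but only the large trial crosses p < 0.05; the “significance” reflects the sample size, not a larger or more important effect.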

The intention of the special issue is to trigger a healthy debate about p-values and statistical significance, to spur the development of better methods, and to provide education about the appropriate use and interpretation of p-values. However, there is a danger of unintended consequences: non-statisticians may be confused about what to do. Worse, by “breaking free from the bonds of statistical significance”, as the editors suggest and several authors urge, researchers may read the call to “abandon statistical significance” as “abandon statistical methods altogether”.

Drug development relies on clinical trials to demonstrate substantial evidence of efficacy, and that substantial evidence must come from adequate and well-controlled investigations:
“evidence consisting of adequate and well-controlled investigations, including clinical investigations, by experts, qualified by scientific training and experience to evaluate the effectiveness of the drug involved, on the basis of which it could fairly and responsibly be concluded by such experts that the drug will have the effect it purports or is represented to have under the conditions of use prescribed, recommended, or suggested in the labeling or proposed labeling thereof”

For a common disease, two pivotal studies (each showing statistical significance at alpha = 0.05) have been the FDA’s requirement (see the FDA guidance "Providing Clinical Evidence of Effectiveness for Human Drug and Biological Products").
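
As a back-of-the-envelope illustration (my own arithmetic, not a statement from the guidance), each positive trial corresponds to a one-sided false-positive probability of 0.025, so two independent positive trials correspond to roughly 0.025 squared, or about 1 in 1600, under the null hypothesis:

# Rough arithmetic behind the two-pivotal-study standard (my illustration,
# not from the FDA guidance): each trial is tested at two-sided alpha = 0.05,
# i.e. 0.025 in the favorable direction, and the trials are assumed independent.
one_sided_alpha = 0.05 / 2
both_trials_false_positive = one_sided_alpha ** 2
print(f"one trial (one-sided):  {one_sided_alpha:.4f}")
print(f"two independent trials: {both_trials_false_positive:.6f}  (about 1 in {1 / both_trials_false_positive:.0f})")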

FDA has applied more flexibility in evaluating the evidence for drugs/biological products for treating rare diseases, especially those with unmet medical needs. Frank Sasinowski has written two articles discussing this issue.
If we are to avoid the overuse and misuse of p-values, we will need to start with changes in the statute and in regulatory science.

In addition, scientific journals and editors may judge the value of a paper based on the significance of its results and favor studies with statistically significant findings for publication. However, this may be changing now. In the July 18, 2019 issue of the New England Journal of Medicine (NEJM), an editorial was published: "New Guidelines for Statistical Reporting in the Journal".
"The new guidelines discuss many aspects of the reporting of studies in the Journal, including a requirement to replace P values with estimates of effects or association and 95% confidence intervals when neither the protocol nor the statistical analysis plan has specified methods used to adjust for multiplicity."
With NEJM leading the way, other journals may follow. We will see more reporting of confidence intervals and less reporting of p-values.
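
Below is a minimal sketch (an assumed example, not from the NEJM editorial) of what such reporting might look like, using hypothetical data and a simple pooled-variance confidence interval.

# Report an effect estimate with a 95% confidence interval rather than a bare
# p-value (hypothetical data; pooled-variance two-sample t interval).
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
treated = rng.normal(loc=0.3, scale=1.0, size=150)   # hypothetical outcomes
control = rng.normal(loc=0.0, scale=1.0, size=150)

n1, n2 = len(treated), len(control)
diff = treated.mean() - control.mean()
pooled_var = ((n1 - 1) * treated.var(ddof=1) + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))
margin = stats.t.ppf(0.975, df=n1 + n2 - 2) * se
print(f"difference in means: {diff:.2f} (95% CI, {diff - margin:.2f} to {diff + margin:.2f})")

The point estimate and interval convey both the direction and the precision of the effect, which is the information the new guidelines ask authors to put forward.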
