Wednesday, July 16, 2008

Communication with non-statisticians

A colleague recently expressed his frustration about explaining a statistical concept to our clinical operations colleagues. I fully understood his feelings. Sometimes it is not easy to communicate with non-statisticians. The problem can lie on either side: the statistician may not use plain English, or the non-statisticians may lack an understanding of very basic statistics.

Given my medical background, I feel a little lucky when communicating with physicians and other non-statisticians. Perhaps also because of my teaching experience, I know how to explain complicated statistical issues in plain language. This is one area where I am proud of myself.

Below I have picked an example to demonstrate how differently statistical terms can be explained.

Regarding the three types of missing data mechanisms, here are the definitions from a recent article in Drug Information Journal:
  • Data are considered missing completely at random (MCAR) if, conditional upon the independent variables in the analytic model, the missingness does not depend on either the observed or unobserved outcomes of the variable being analyzed (Y).
  • Data are missing at random (MAR) if, conditional upon the independent variables in the analytic model, the missingness depends on the observed outcomes of the variable being analyzed (Yobs) but does not depend on the unobserved outcomes of the variable being analyzed (Ymiss).
  • Data are missing not at random (MNAR) if, conditional upon the independent variables in the analytic model, the missingness depends on the unobserved outcomes of the variable being analyzed.
Now, for the same concepts, the following definitions seem easier to understand.
  • MCAR (data are missing completely at random): A "missing" value does not depend on the variable itself or on the values of other variables in the database.
  • MAR (data are missing at random): The probability of missing data on any variable is not related to its particular value. The pattern of missing data is traceable or predictable from other variables in the database.
  • NMAR (not missing at random, the same mechanism called MNAR above): Missing data are not random and depend on the values that are missing.
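
For readers who prefer to see the difference in numbers, here is a small simulation sketch of the three mechanisms. It is my own illustration, not taken from the article: the variable names, the missingness probabilities (0.3, 0.6, 0.1), and the simple model y = x + noise are all assumptions chosen only to make the contrast visible. Here x stands for a variable that is always recorded and y for the outcome that may go missing.

```python
import random


def simulate(mechanism, n=10000, seed=0):
    """Return a list of (x, y, missing) tuples under one missingness mechanism."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)        # always-observed variable (hypothetical)
        y = x + rng.gauss(0, 1)    # outcome that may go missing
        if mechanism == "MCAR":
            p = 0.3                    # same chance of missing for everyone
        elif mechanism == "MAR":
            p = 0.6 if x > 0 else 0.1  # depends only on the observed x
        elif mechanism == "MNAR":
            p = 0.6 if y > 0 else 0.1  # depends on the value that goes missing
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((x, y, rng.random() < p))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def observed_bias(rows):
    """Mean of the observed y values minus the mean of all y values."""
    full = mean(y for _, y, _ in rows)
    observed = mean(y for _, y, missing in rows if not missing)
    return observed - full


def stratum_bias(rows):
    """Same comparison, restricted to the records with x > 0."""
    return observed_bias([row for row in rows if row[0] > 0])


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR", "MNAR"):
        rows = simulate(mechanism)
        print(mechanism,
              "overall bias:", round(observed_bias(rows), 3),
              "bias within x > 0:", round(stratum_bias(rows), 3))
```

Under these assumptions, the observed mean should sit close to the true mean for MCAR. For MAR it should be off overall, but roughly correct once we look within a group defined by x, which is what "traceable from other variables" means in practice. For MNAR it should stay off even within that group, because the missingness follows the very values we cannot see.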
