On Biostatistics and Clinical Trials: Use of SF-36 in Clinical Trials

The SF-36 is a multi-purpose, short-form health survey with 36 questions. It is one of the most popular generic health surveys: it can be used across ages, diseases, and treatment groups, and is appropriate for a wide variety of applications. In contrast to generic health surveys, disease-specific health surveys focus on a particular condition or disease. In clinical trials, the SF-36 remains one of the most common instruments for assessing health-related quality of life (HRQoL), especially in diseases where there is no valid disease-specific tool.

The SF-36 yields an 8-scale profile of functional health and well-being scores (so-called domain scores) as well as psychometrically based physical and mental health summary measures [physical component summary (PCS) and mental component summary (MCS)] and a preference-based health utility index (the SF-6D).

The mapping from the original questions -> 8 domains -> PCS or MCS is sketched in the diagram below. Notice that only 35 of the 36 questions are used in this diagram. Question 2, the health transition item, asks how the respondent's general health compares with one year ago and does not contribute to the calculation of domain scores or component summaries. A good use of question 2 is as an anchor in identifying the minimal clinically important difference (MCID). In one of our publications in J Neurol Neurosurg Psychiatry, we used this approach to identify the MCID.

For these 36 questions, the response categories vary depending on the question. The number of response categories ranges from 2 (yes, no) to 6 (all of the time, most of the time, a good bit of the time, some of the time, a little of the time, none of the time). Therefore, a scoring method or algorithm has to be employed to calculate the domain scores. For PCS and MCS, the calculation is based on equations whose coefficients come from regression models generated from the General Healthy Population. In the US, it is the Healthy General US Population. If a different healthy population is used, the factor score coefficients for the z-scores will be different, and the PCS and MCS values will differ as well.
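The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the general steps (rescale a raw domain sum to 0-100, standardize each domain against the reference population, take a weighted sum, and apply the linear T-score transformation). The function names are mine, and every norm mean, SD, and weight below is a made-up placeholder, not the licensed QualityMetric values.

```python
# Sketch of SF-36 scoring arithmetic. All norm means, SDs, and weights
# below are MADE-UP placeholders, not the official scoring values.

DOMAINS = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]


def transform_scale(raw_sum, lowest, highest):
    """Rescale a raw domain sum to the 0-100 scale."""
    return (raw_sum - lowest) / (highest - lowest) * 100.0


def component_summary(domain_scores, norms, weights):
    """Weighted sum of domain z-scores, then a linear T-score transform."""
    aggregate = 0.0
    for d in DOMAINS:
        norm_mean, norm_sd = norms[d]
        z = (domain_scores[d] - norm_mean) / norm_sd  # z-score vs. reference population
        aggregate += weights[d] * z
    return 50.0 + 10.0 * aggregate  # 50 corresponds to the reference-population mean


# Hypothetical reference-population mean and SD for each domain.
NORMS = {d: (75.0, 20.0) for d in DOMAINS}

# Hypothetical factor score coefficients for the physical component.
PCS_WEIGHTS = {"PF": 0.4, "RP": 0.35, "BP": 0.3, "GH": 0.25,
               "VT": 0.0, "SF": 0.0, "RE": -0.2, "MH": -0.2}

# Physical functioning: 10 items scored 1-3, so raw sums run from 10 to 30.
print(transform_scale(20, 10, 30))  # 50.0

# A participant sitting exactly at the reference mean on every domain.
at_norm = {d: 75.0 for d in DOMAINS}
print(component_summary(at_norm, NORMS, PCS_WEIGHTS))  # 50.0
```

Swapping in a different reference population changes `NORMS` and the weights, which is why PCS and MCS values are not comparable across norm sets.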

Details of the scoring method can be found on QualityMetric’s website. Scoring and calculating the component summaries requires programming. Some example programs (not validated) can be found on the web.

Some questions and answers on using SF-36 in clinical trials:

Q: Is the SF-36 free to use in clinical trials?

A: It is not free. A license has to be obtained to use it in industry-sponsored clinical trials. See the QualityMetric website for details.

Q: Why do we have question #2 if it is not used in the calculation of any domain score or component summary?

A: It can be used as an assessment of general health status and also as an anchor for identifying MCID.

Q: Which general health population should be used for norm-based scoring?

A: The advantage of norm-based scoring is that it facilitates comparisons. If a study is a US domestic study, the General Healthy US Population should be used. If it is an international study, the country-specific General Healthy Populations are preferred. The SF-36 has been validated in many languages.

Q: What language should be used to describe the statistical analysis plan for the SF-36?

A: For a study protocol or the statistical methods section of a journal article, the analysis plan for the SF-36 should be kept simple. In one of our publications on the SF-36, we simply said:

"The corresponding physical component summary and mental component summary values for the randomized participants were calculated using the reported means, SDs, and factor score coefficients that came from the healthy general US population in 1990. A linear T-score transformation method was used so that both the physical component summary and the mental component summary scores were standardized with a range of 0 (lowest) to 100 (highest)"

Q: Could SF-36 be used in cost utility analysis?

A: No. The SF-36 is not a utility score. However, SF-36 scores can be converted to a utility score (such as the EQ-5D). See my previous blog post.

Q: Could we have one overall score for SF-36?

A: No. PCS and MCS have to be analyzed separately. You cannot add PCS and MCS together to get a single overall score.

Q: How should the domain scores and component summaries be analyzed?

A: Typically, the 8 domain scores and 2 component summaries are analyzed separately using analysis of variance, analysis of covariance, or other methods such as repeated measures analysis, depending on the study design.
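As a rough illustration of the simplest of these options, the sketch below computes a one-way analysis of variance F statistic for one score (say, change in PCS from baseline) across treatment groups. The function name and the data are hypothetical, and the sketch stops at the F statistic: a p-value, covariate adjustment, or a repeated measures model would normally come from a statistical package such as SAS.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical change-from-baseline PCS values for two treatment arms.
placebo = [1.0, 2.0, 3.0]
active = [2.0, 3.0, 4.0]
print(one_way_anova_f([placebo, active]))  # (1.5, 1, 4)
```

The same function would be called once per domain score and once per component summary, since each is analyzed separately.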

A good approach to analyzing the SF-36 is to compare each domain score with that of the General Healthy Population, to show how much the patients in the study differ from the General Healthy Population at the pre-treatment and end-of-treatment visits. This approach was used in our SF-36 publication in Neurology.
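One simple way to express such a comparison is the gap between the study mean and the reference-population mean, in units of the reference SD. The sketch below assumes this standardized-difference summary; it is not necessarily the method used in the publication, and the norm values and patient scores are invented for illustration.

```python
from statistics import mean


def gap_from_norm(scores, norm_mean, norm_sd):
    """Study mean minus reference mean, in reference-population SD units."""
    return (mean(scores) - norm_mean) / norm_sd


# Hypothetical reference-population mean and SD for one domain.
NORM_MEAN, NORM_SD = 70.0, 20.0

# Hypothetical 0-100 domain scores for the same patients at two visits.
baseline = [40.0, 50.0, 60.0]
end_of_treatment = [60.0, 70.0, 80.0]

print(gap_from_norm(baseline, NORM_MEAN, NORM_SD))          # -1.0
print(gap_from_norm(end_of_treatment, NORM_MEAN, NORM_SD))  # 0.0
```

A negative value means the study patients score below the general population on that domain; repeating the calculation per domain and per visit gives a profile of how far the gap closes over treatment.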

4 comments:

Anonymous 2:22 PM
Classical statistical approaches, such as multiple linear regression, analysis of variance, or analysis of covariance, might not be appropriate for analyzing some of the eight dimensions of the SF-36. HRQoL outcomes tend to be non-normally distributed, skewed, and bounded. Moreover, some of the dimensions can take only a few different values, so transformations to obtain a normal distribution are not feasible.

These two articles provide an excellent review of this topic:

Arostegui I, Núñez-Antón V, Quintana JM. Statistical approaches to analyse patient-reported outcomes as response variables: An application to health-related quality of life. Stat Methods Med Res. 2010 Sep 21. [Epub ahead of print]

Arostegui I, Núñez-Antón V, Quintana JM. Analysis of the short form-36 (SF-36): the beta-binomial distribution approach. Stat Med. 2007 Mar 15;26(6):1318-42.
Anonymous 9:53 PM
Can the QualityMetric software that is included when you purchase the SF-36 be used to calculate between-group effects in a clinical trial? I can see how to calculate change from one measurement point to another for individual participants, but I cannot work out how to do significance tests between means for the whole group at the different measurement points, e.g. baseline, end of treatment, and end of follow-up. These issues are not addressed in the instruction manual. I would appreciate any pointers you may have on this topic.
Web blog from Dr. Deng 3:43 PM
I doubt that the software from QualityMetric can calculate group effects in a clinical trial. It is only a scoring tool that converts the ordinal responses to the original questions to a standardized scale (0-100), a continuous variable. To obtain group effects, additional statistical analysis software (such as SAS) will be needed.
Unknown 1:19 PM
1) What you said about the linear T-score transformation ("in one of our publications on SF-36") is not quite correct in saying the T-score range is 0-100. T-scores have a mean of 50 and an SD of 10. That range is different from the 0-100 possible range associated with putting scores on a percent-of-total-possible scale.
2) The SF-36 can be used in cost-utility analysis if the SF-6D preference-based scoring is employed.
3) The SF-6D provides "one overall score for SF-36."

Friday, March 11, 2011
