## Monday, December 27, 2010

### Bootstrap and SAS

In statistics, bootstrapping is a resampling technique used to obtain estimates of summary statistics. In clinical trials, the bootstrap can be a useful way to assess the precision of an estimator. Its most common application is probably obtaining a confidence interval for an estimator when the typical standard error approach is impossible or difficult.

Here are two examples where the bootstrapping technique was needed. The first example is for a manuscript. When we submitted our paper to the European Respiratory Journal, one of the reviewers asked us to evaluate internal consistency. The comment read: “The statistical method is sample-based as it consists in a regression performed on this sample. Such a method needs at least evaluation for internal consistency (by measuring the regression correlation on a subsample then validating on another subsample or better by using bootstrap and jackknife methods).”

The second example is a request from a regulatory agency to calculate the 95% CI for the % relative difference. With two treatment means, A and B, the % relative difference is defined as %RD = (A-B)/A, expressed as a percentage. There may be other approaches in this case, but the bootstrap comes in handy for calculating the 95% CI for %RD.
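To make the %RD case concrete, here is a minimal sketch of a percentile bootstrap interval, written in Python rather than SAS purely for illustration. The function names (`percent_rd`, `bootstrap_rd_ci`), the number of resamples, the seed, and the data are all my own assumptions. It also assumes the two treatment groups are independent, so each arm is resampled separately.

```python
import random
import statistics


def percent_rd(a, b):
    """%RD = (mean(A) - mean(B)) / mean(A), expressed as a percentage."""
    mean_a = statistics.fmean(a)
    mean_b = statistics.fmean(b)
    return 100.0 * (mean_a - mean_b) / mean_a


def bootstrap_rd_ci(a, b, n_boot=2000, seed=2010):
    """Percentile bootstrap 95% CI for %RD.

    Assumes independent arms: each arm is resampled with replacement,
    separately, keeping its original sample size.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        a_star = rng.choices(a, k=len(a))  # resample arm A
        b_star = rng.choices(b, k=len(b))  # resample arm B
        estimates.append(percent_rd(a_star, b_star))
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%; keep the outer two.
    cuts = statistics.quantiles(estimates, n=40)
    return cuts[0], cuts[-1]


# Hypothetical responses for two treatment arms (not real trial data).
treatment_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 10.4, 11.8]
treatment_b = [8.1, 9.0, 7.6, 8.8, 9.4, 8.3, 7.9, 8.6]

point_estimate = percent_rd(treatment_a, treatment_b)
lower, upper = bootstrap_rd_ci(treatment_a, treatment_b)
```

The percentile interval is only one of several bootstrap intervals; I chose it here because it needs nothing beyond the sorted bootstrap estimates.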

The bootstrap can be easily implemented in SAS in three main steps: 1) resample, with replacement, from the observed data set (the observed data are only one sample); SAS Proc Surveyselect can serve this purpose; 2) obtain the statistic (or estimator) by performing the analysis on each resample; 3) summarize the resulting collection of statistics or estimators.
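The three steps can be sketched as follows. This is a Python illustration of the data flow, not SAS code. It stacks all resamples in one table keyed by a replicate number, computes the statistic within each replicate, and then summarizes the estimates. The function names, the choice of the median as the statistic, and the data are assumptions made for the example.

```python
import random
import statistics


def resample(data, n_reps, seed=12345):
    """Step 1: draw n_reps resamples with replacement, each of the original size.

    Returns (replicate, value) rows stacked in a single list.
    """
    rng = random.Random(seed)
    rows = []
    for rep in range(1, n_reps + 1):
        for value in rng.choices(data, k=len(data)):
            rows.append((rep, value))
    return rows


def statistic_by_replicate(rows, statistic):
    """Step 2: compute the statistic separately within each replicate."""
    groups = {}
    for rep, value in rows:
        groups.setdefault(rep, []).append(value)
    return [statistic(values) for _, values in sorted(groups.items())]


def summarize(estimates):
    """Step 3: summarize the bootstrap distribution of the statistic."""
    cuts = statistics.quantiles(estimates, n=40)  # 2.5%, 5%, ..., 97.5%
    return {
        "boot_se": statistics.stdev(estimates),  # bootstrap standard error
        "ci_lower": cuts[0],                     # 2.5th percentile
        "ci_upper": cuts[-1],                    # 97.5th percentile
    }


# Hypothetical observed sample.
observed = [4.1, 5.6, 3.9, 7.2, 6.3, 5.0, 4.8, 6.9, 5.5, 4.4]

rows = resample(observed, n_reps=1000)
estimates = statistic_by_replicate(rows, statistics.median)
summary = summarize(estimates)
```

Keeping the resamples in one stacked table means step 2 is just a by-group analysis, which is why the same pattern carries over to almost any statistic.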

The bootstrap is also a suggested statistical approach for obtaining the confidence interval for individual and population bioequivalence criteria.

Some good references about how to do bootstrapping using SAS are included here:

Ten years ago, I had to use a SAS macro to do the bootstrap for my PhD dissertation. The macro is still available on the SAS website.

The bootstrap technique has also been built into several SAS procedures (such as Proc Multtest and Proc MI).

When the bootstrap is used in a regression setting, the 'bootstrap pairs' technique may be employed. Freedman (1981) proposed resampling directly from the original data, that is, resampling each dependent variable and its regressors together as a pair; this is called bootstrapping pairs. Bootstrapping pairs is described in a paper by Flachaire. The SAS bootstrapping macro's documentation discusses two main ways to do bootstrap resampling for regression models, depending on whether the predictor variables are random or fixed. If the predictors are random, you resample observations just as you would for any simple random sample. This method is usually called "bootstrapping pairs". If the predictors are fixed, the resampling process should keep the same values of the predictors in every resample and change only the values of the response variable by resampling the residuals.
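The difference between the two schemes is easier to see side by side. Below is a minimal Python sketch for a simple linear regression with one predictor, bootstrapping the slope. It is not the SAS macro's implementation. The function names and the data are hypothetical, and the residuals are resampled as-is without any rescaling.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_pairs(x, y, n_boot, rng):
    """Random predictors: resample whole (x, y) observations together."""
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = rng.choices(range(n), k=n)
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # every x identical: slope undefined, so redraw
        ys = [y[i] for i in idx]
        slopes.append(fit_line(xs, ys)[1])
    return slopes


def bootstrap_residuals(x, y, n_boot, rng):
    """Fixed predictors: keep x as observed, resample only the residuals."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        e_star = rng.choices(residuals, k=len(x))
        y_star = [fi + ei for fi, ei in zip(fitted, e_star)]
        slopes.append(fit_line(x, y_star)[1])  # same x in every resample
    return slopes


# Hypothetical data, roughly y = 1 + 2x plus noise.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.8, 7.2, 9.1, 10.7, 13.2, 15.1, 16.8, 19.3, 20.9]

rng = random.Random(1981)
pairs_slopes = bootstrap_pairs(x, y, n_boot=1000, rng=rng)
residual_slopes = bootstrap_residuals(x, y, n_boot=1000, rng=rng)
```

Either list of slopes can then be summarized with percentiles or a standard deviation. Note that the residual scheme leans on the fitted model being correct, whereas resampling pairs does not.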