On Biostatistics and Clinical Trials: April 2022

Monday, April 25, 2022

Estimands, Estimator, Estimate, and Estimation

With the adoption of ICH E9(R1) "ADDENDUM ON ESTIMANDS AND SENSITIVITY ANALYSIS IN CLINICAL TRIALS TO THE GUIDELINE ON STATISTICAL PRINCIPLES FOR CLINICAL TRIALS", the word 'Estimand' or 'Estimands' is appearing more and more in clinical trial protocols, statistical analysis plans, regulatory documents, and publications. Just like 'intention-to-treat' or 'ITT', the term 'Estimand' starts out as a bizarre and confusing word, but will eventually be understood by clinical researchers who work on clinical trials. Along with the word 'estimand', the words 'estimate', 'estimator', and 'estimation' should all be distinguished.

According to ICH E9(R1) "ADDENDUM ON ESTIMANDS AND SENSITIVITY ANALYSIS IN CLINICAL TRIALS TO THE GUIDELINE ON STATISTICAL PRINCIPLES FOR CLINICAL TRIALS", they are defined as the following:

Estimand: A precise description of the treatment effect reflecting the clinical question posed by the trial objective. It summarises at a population-level what the outcomes would be in the same patients under different treatment conditions being compared.

Estimator: A method of analysis to compute an estimate of the estimand using clinical trial data.

Estimate: A numerical value computed by an estimator.

Estimation is the process of finding an estimate or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable.

For a clinical trial,

estimand (trial objective) is the target of estimation (for example, the treatment difference in change from baseline to Week 26 in FEV1)
estimator is the method of estimation (such as ANCOVA, Logistic regression, Cox regression)
estimate is the numerical result (such as LS mean difference, odds ratio, hazard ratio and their 95% confidence interval).
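To make the three terms concrete, here is a minimal Python sketch. It assumes a much simpler estimator than the ANCOVA mentioned above (an unadjusted difference in means with a normal-approximation confidence interval), and the FEV1 values are made up for illustration only. The estimand is the treatment difference in mean change from baseline, the function is the estimator, and the numbers it returns are the estimate.

```python
import math
import statistics

def mean_difference_estimator(active, control, z=1.96):
    """Estimator: unadjusted difference in means with an approximate 95% CI."""
    diff = statistics.mean(active) - statistics.mean(control)
    se = math.sqrt(
        statistics.variance(active) / len(active)
        + statistics.variance(control) / len(control)
    )
    return diff, (diff - z * se, diff + z * se)

# Hypothetical change-from-baseline FEV1 values in litres (not real trial data).
active = [0.21, 0.15, 0.30, 0.18, 0.25, 0.12]
control = [0.05, 0.10, 0.02, 0.08, 0.11, 0.06]

# Estimate: the numerical result produced by applying the estimator to the data.
estimate, ci = mean_difference_estimator(active, control)
print(f"Estimated difference: {estimate:.3f} L, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```

Changing the estimator (for example, adjusting for baseline) changes the estimate, but the estimand stays the same.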

In everyday usage, an estimate is a rough calculation or guess, while the estimand is (in statistics) that which is being estimated. A p-value from a statistical test is not an estimate (debatable), and naked p-values without the associated estimates should be interpreted with caution.

The primary estimand (a precise description of the treatment effect reflecting the objective of the clinical trial) assessed effects regardless of treatment discontinuation or rescue interventions.

Using a cake as an example, the estimand is the cake we want to make (what you seek), the estimator is the recipe for making the cake (how you will get there), and the estimate is the final result (what you get) - the cake we actually make. The estimate (final cake) should be close to the estimand (the cake we want to make). We turn our estimand into our estimate by applying an estimator.

In a paper by Little and Lewis (JAMA, 2021), "Estimands, Estimators, and Estimates", the estimand and the estimator were compared:

The estimand compares outcomes that capture the main benefits and risks of treatments
Estimators should summarize the causal effects of treatments in the sample of individuals in the study
Estimands should summarize the causal effects of treatments in the target population
The estimator should provide a valid and unbiased estimate of the study estimand

The paper was in response to the PIONEER 3 study: "Effect of Additional Oral Semaglutide vs Sitagliptin on Glycated Hemoglobin in Adults With Type 2 Diabetes Uncontrolled With Metformin Alone or With Sulfonylurea - The PIONEER 3 Randomized Clinical Trial". In the PIONEER 3 trial, the primary estimand is the treatment policy estimand evaluating the treatment effect in change in glycated hemoglobin (HbA1c) from baseline to week 26 for all randomized patients regardless of trial product discontinuation or use of rescue medication. The estimator is analysis of covariance with a pattern mixture model using multiple imputation to handle missing data assuming the missing data mechanism is missing at random (MAR). The estimate is the actual value of the treatment differences, expressed as the estimated treatment differences and their 95% confidence intervals: "The 7- and 14-mg/d semaglutide dosages were superior to sitagliptin in reducing HbA1c from baseline at week 26 (estimated treatment differences of –0.3% [95% CI, –0.4% to –0.1%; P < .001] and –0.5% [95% CI, –0.6% to –0.4%; P < .001], respectively)"

In a Sanofi trial, "A randomized, double-blind, placebo-controlled study to evaluate the efficacy and safety of dupilumab in patients with severe steroid-dependent asthma", the primary estimand and the estimator (how the primary efficacy endpoint is analyzed) were discussed:

The primary estimand is the intent-to-treat estimand, the treatment difference between dupilumab and control in the mean percentage reduction of OCS dose at Week 24 while maintaining asthma control of all patients in the ITT population no matter whether the patients discontinue treatment before Week 24 or not. To estimate the estimand, data of patients who permanently discontinue treatment will be incorporated in the primary analysis, and missing data due to patients dropping out from study will be handled by approaches specified in the missing data handling section below.

The primary efficacy endpoint will be analyzed using an analysis of covariance (ANCOVA) model. The model will include the percentage reduction of OCS dose at Week 24 as the response variable, and the treatment groups, optimized OCS dose at baseline, regions (pooled countries), and baseline eosinophil level subgroups (less than 0.15 Giga/L, greater than or equal to 0.15 Giga/L) as covariates. The treatment difference will be tested at the 2-sided significance level of alpha=0.05. Descriptive statistics for the primary efficacy endpoint will be provided, including the number of patients, means, standard errors, and least squares (LS) means by the treatment groups, as well as the difference in LS means and the corresponding 95% confidence interval (CI). The missing data (missing measures at Week 24) will be imputed using pattern mixture model by multiple imputation.

In a paper by Harvard researchers, "An Applied Researcher’s Guide to Estimating Effects From Multisite Individually Randomized Trials: Estimands, Estimators, and Estimates", the following conclusions are given:

"Defining an estimand is critical to the design, analysis, and interpretation of a multisite RCT. Even when one is interested in estimating an average treatment effect, careful consideration must be given to defining the target of inference. For example, the choice of estimand influences what formula is appropriate when conducting power calculations - yet many write-ups of power calculation are ambiguous or silent with regards to the estimand chosen. relatedly, registering studies and creating analysis plans are becoming the norm when conducting RCTs, yet even newly created registries do not require an estimand be defined. Consequently, readers of an analysis plan are left to assume an implied estimand based on the power calculation formula used and/or estimator selected. Similarly, many scholarly articles and reports do not state a target of inference, making it difficult to assess whether the chosen estimator is appropriate and obscuring the goal of the research. We recommend defining the estimand(s) early in the research process and clearly stating them in important products."

Monday, April 11, 2022

Randomization and elements of randomization specifications

Randomization is the process of assigning trial subjects to treatment or control groups using an element of chance to determine the assignments to reduce bias. Randomization is the most critical feature of the RCT (randomized, controlled trials). In FDA's Good Review Practice: Clinical Review of Investigational New Drug Applications, the randomization is defined as the following:

In the context of clinical trial design, randomization is defined as the allocation of patients to the investigational drug and control arms by chance. Randomization is intended to prevent any systematic difference between patients assigned to the treatments being compared and is a critical assumption for valid statistical comparisons. It is also intended to produce groups that are comparable (statistically balanced) with respect to both known and unknown factors.

Randomization Schedule (also called a randomization scheme) is a list of randomization numbers and the corresponding treatment assignments in a data set (or in a printout in the early days). A randomization schedule can be generated using SAS Proc Plan. A SUGI paper, "Generating Randomization Schedules Using SAS Programming", that I wrote 20 years ago is still applicable.

Three steps for generating the randomization schedule for use in clinical trials:

Create randomization specifications according to the study protocol requirements
Create and validate dummy randomization schedule for review and approval
Create and validate the final randomization schedule - the final randomization schedule for implementation

The dummy randomization schedule and the final randomization schedule have the same display but are generated with different random seeds (therefore different treatment assignments). The dummy randomization schedule can be reviewed by the study team and the final randomization schedule can only be distributed to the designated recipients who are unblinded to the treatment assignments.

Here is an example randomization specification:

Here are the elements for the randomization specifications:

Study Design: clinical trial design dictates how the subjects are assigned to receive different study treatments. In clinical trials with parallel design, subjects are randomized to receive the treatments; in clinical trials with cross-over design, subjects are randomized to different treatment sequences.

The study design will also include a randomization strategy:

Fixed randomization:

fixed-randomization scheme (rarely used)
block randomization
stratified randomization

Dynamic randomization:

Adaptive randomization.

See FDA's Good Review Practice: Clinical Review of Investigational New Drug Applications for definitions of these different types of randomizations.

Blind and blindness: concealing treatment assignments and treatment allocations.

Block: Block randomization works by randomizing subjects within blocks such that within each block, the number of subjects is balanced between treatment groups or follows the randomization ratio.
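The idea of a permuted block can be sketched in a few lines of Python. This is only an illustration, not the SAS Proc Plan approach; the function name, the treatment codes, and the seed are all hypothetical.

```python
import random

def block_randomize(n_blocks, ratio, seed):
    """Build a list of treatment codes from independently shuffled blocks.

    ratio maps a treatment code to its number of slots per block,
    e.g. {"A": 2, "P": 2} gives blocks of size 4 with 1:1 allocation.
    """
    rng = random.Random(seed)  # seeded generator so the list is reproducible
    template = [code for code, slots in ratio.items() for _ in range(slots)]
    schedule = []
    for _ in range(n_blocks):
        block = template[:]   # every block has the same composition
        rng.shuffle(block)    # only the order within the block is random
        schedule.extend(block)
    return schedule

schedule = block_randomize(n_blocks=3, ratio={"A": 2, "P": 2}, seed=12345)
for start in range(0, len(schedule), 4):
    print(schedule[start:start + 4])
```

Whatever the shuffle produces, each printed block contains exactly two A and two P, which is what keeps the groups balanced as enrollment proceeds.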

Block Size: The size of each block. Block sizes must be multiples of the sum of the terms in the allocation ratio (for equal allocation, this is the number of treatment groups). For 1:1 randomization of 2 groups, blocks can be sizes 2, 4, 6 etc. For 1:1:1 randomization of 3 groups or 2:1 randomization of 2 groups, blocks can be sizes 3, 6, 9 etc.
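As a quick check of the block size rule, the following Python sketch lists the admissible block sizes for a given allocation ratio. The helper name is hypothetical; the rule it encodes is that the smallest block is the sum of the ratio terms after reducing the ratio to lowest terms.

```python
from functools import reduce
from math import gcd

def valid_block_sizes(ratio, max_size):
    """Return block sizes up to max_size that are compatible with the ratio."""
    common = reduce(gcd, ratio)        # reduce e.g. 2:2 to 1:1
    smallest = sum(ratio) // common    # smallest block that honors the ratio
    return list(range(smallest, max_size + 1, smallest))

print(valid_block_sizes((1, 1), 6))     # [2, 4, 6]
print(valid_block_sizes((1, 1, 1), 9))  # [3, 6, 9]
print(valid_block_sizes((2, 1), 9))     # [3, 6, 9]
```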

If the randomization is by site, to prevent potential unblinding or guessing, the block size can vary from block to block or be withheld from the investigators and study team. With central randomization, potential unblinding is less of a concern and the block size can be the smallest multiple (for 1:1 randomization of 2 groups, the block size can be 2).

Number of Blocks

Total Number of Randomizations: Total number of randomization numbers to be generated. Total number of randomizations = Number of blocks x Block size. Usually, more randomization numbers than the protocol-specified sample size are generated to make sure there are enough randomizations in case the sample size is increased or randomization errors leave some randomization numbers unused. If the study protocol specifies 300 subjects to be randomized, it may be good to generate 600 randomizations.
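The arithmetic can be sketched in Python as follows. The block size of 4 is an assumption for illustration; the 600 randomizations come from the example above.

```python
import math

def blocks_needed(target_randomizations, block_size):
    """Smallest number of complete blocks that covers the target."""
    return math.ceil(target_randomizations / block_size)

block_size = 4                              # assumed for this example
n_blocks = blocks_needed(600, block_size)   # 150 blocks
total = n_blocks * block_size               # 600 randomization numbers
print(n_blocks, total)
```

When the target is not a multiple of the block size, rounding up keeps every block complete: a target of 500 with a block size of 6 needs 84 blocks, giving 504 randomization numbers.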

Strata and Stratification Factors: Stratification factors are those known factors that may have an impact on treatment responses. Stratification factors are the known confounders. The most common stratification factor is the baseline disease severity which usually has an impact on the treatment responses. When stratification factors are specified, stratified randomization is employed to prevent imbalance between treatment groups for known factors that influence prognosis or treatment responsiveness. The randomization schedule is essentially generated for each stratum.
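Generating the schedule separately for each stratum can be sketched in Python as below. The stratum labels, the numbering scheme (stratum index x 1000 + sequence), and the seed are assumptions made for this illustration, not a convention from any guidance.

```python
import random

def stratified_schedule(strata, n_blocks, block, seed):
    """Generate a permuted-block schedule separately within each stratum."""
    rng = random.Random(seed)
    rows = []
    for s_index, stratum in enumerate(strata, start=1):
        seq = 0
        for _ in range(n_blocks):
            permuted = list(block)
            rng.shuffle(permuted)
            for code in permuted:
                seq += 1
                rows.append({
                    "stratum": stratum,
                    # Hypothetical numbering: 1001, 1002, ... for stratum 1,
                    # 2001, 2002, ... for stratum 2.
                    "rand_no": s_index * 1000 + seq,
                    "treatment": code,
                })
    return rows

rows = stratified_schedule(
    strata=["FEV1 below 50%", "FEV1 50% or above"],
    n_blocks=2, block="AP", seed=2022,
)
for row in rows:
    print(row["stratum"], row["rand_no"], row["treatment"])
```

Because blocks are completed within each stratum, the treatment groups stay balanced within each level of the stratification factor and not just overall.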

See previous posts "Restricted randomization, stratified randomization, and forced randomization"; "Minimization Algorithm to Achieve Treatment Balance across Strata in Stratified Randomization", and "Handling Randomization Errors in Clinical Trials with Stratified Randomization"

Randomization Ratio (or allocation ratio): The ratio for treatment groups. The typical randomization ratio is balanced: 1:1 ratio for two treatment groups (if the block size is 2, for every 2 subjects randomized, there will be one assigned to group A and one assigned to group B); 1:1:1 ratio for three treatment groups, ... The randomization ratio can also be unbalanced such as 2:1 (if the block size is 3 (minimal), for every three subjects randomized, there will be two assigned to group A and one assigned to group B) and 3:1,... FDA's Good Review Practice: Clinical Review of Investigational New Drug Applications described the randomization ratio (allocation ratio) as the following:

Allocation of patients to treatment and control arms can be uniform or nonuniform. Uniform allocation (i.e., equal numbers allocated to each arm) is the usual practice and provides the most statistical power for a given total sample size. Nonuniform allocation may lower costs (if one arm is substantially more expensive) and improve recruitment (if one arm is generally preferred) and may increase the size of the exposed patient safety database. In general, the loss of statistical power in seeking to detect a difference between treatments going from uniform allocation to 2:1, or even 3:1, is fairly small; however, as more imbalanced allocation occurs, power drops off more rapidly. A special case is where a trial seeks both to show effectiveness versus placebo and to compare the test drug with an active control. In that case, it usually is necessary for the active treatment groups to be substantially larger to examine the smaller differences between the active treatments.
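The effect of unequal allocation on precision can be illustrated with a small Python calculation. This is a sketch under stated assumptions: a comparison of two means with equal variance in both arms, where the variance of the difference is proportional to 1/n1 + 1/n2. Under those assumptions, the efficiency relative to 1:1 allocation with the same total sample size is 4p(1-p), where p is the fraction allocated to one arm.

```python
def relative_efficiency(ratio):
    """Efficiency of a two-arm allocation ratio relative to 1:1.

    Assumes a difference in means with equal variances in both arms.
    """
    a, b = ratio
    p = a / (a + b)          # fraction of subjects in the first arm
    return 4 * p * (1 - p)   # equals 1.0 when p = 0.5

for ratio in [(1, 1), (2, 1), (3, 1), (4, 1)]:
    print(ratio, round(relative_efficiency(ratio), 3))
# (1, 1) 1.0
# (2, 1) 0.889
# (3, 1) 0.75
# (4, 1) 0.64
```

Under these assumptions, a 2:1 trial with N subjects has about the same precision as a 1:1 trial with 0.89N subjects, and the loss grows faster as the allocation becomes more imbalanced.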

Randomization Number: a series of sequential numbers corresponding to treatment assignments. The 'randomization number' itself is not random; the associated treatment assignments are random.

The randomization number can be recorded in the database and serve as the subject identifier (same as the subject number). Seeing the randomization number will not unblind the subject's treatment assignment.

Treatment Code: short description or abbreviation for long treatment descriptions. Treatment code can be just the letters (such as A = Active; P = Placebo).

Treatment Description: the detailed description of the treatment groups. It can be just 'Active', 'Placebo' or more descriptive, such as 'Inhaled drug X BID', 'Inhaled Placebo BID'.

Dummy Randomization Schedule: also called surrogate randomization schedule - the randomization schedule for review and approval purposes. The dummy randomization schedule should have exactly the same features as the final randomization schedule except that a different random seed is used (therefore, the treatment assignments are different).

Random Seed: A number (integer) used to initiate a pseudorandom number generator. Random Seed is a number used in SAS Proc Plan to generate the randomization schedule. Random Seed needs to be specified in the program in order to reproduce the same randomization schedule.

After the dummy randomization schedule is reviewed and approved, a final randomization schedule can be generated for implementation by changing the random seed.
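The role of the random seed can be illustrated with a short Python sketch (Python's random module is used here in place of SAS Proc Plan, and both seed values are made up). The same program is run twice: once with a dummy seed for review and once with the final seed for implementation.

```python
import random

def make_schedule(seed, n_blocks=5, block=("A", "A", "P", "P")):
    """Permuted-block schedule; the seed fully determines the assignments."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(n_blocks):
        permuted = list(block)
        rng.shuffle(permuted)
        assignments.extend(permuted)
    return assignments

dummy = make_schedule(seed=1111)   # hypothetical seed, shared for review
final = make_schedule(seed=2222)   # hypothetical seed, kept by unblinded staff

# Same layout (length, block structure), generated by the same program.
print(len(dummy) == len(final))
# Re-running with the recorded seed reproduces the final schedule exactly.
print(make_schedule(seed=2222) == final)
```

The two schedules have the same structure, and with different seeds the treatment assignments will generally differ, so reviewers of the dummy schedule learn nothing about the final assignments.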

Randomization Envelopes: Envelopes that contain the treatment assignment information. The outside of the envelope contains the randomization number, and the inside of the envelope contains the randomization number and the corresponding treatment assignments and treatment descriptions. Randomization envelopes were used in the randomization process in the early days. The randomization process using randomization envelopes is now replaced with the Interactive Response Technology (IRT) including the Interactive Web Response System (IWRS) or Interactive Voice Response System (IVRS).

Central Randomization is the opposite of randomization by the site. When a subject is eligible to be randomized, the site will contact a centralized contact (usually the computer system, IRT) to obtain the next available randomization number in the corresponding stratum regardless of the individual sites.
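The allocation logic of central randomization can be sketched as a toy Python class. This is not how any particular IRT product works; the class name, stratum labels, and schedule contents are hypothetical. The point is that the next available number is tracked per stratum, and the requesting site plays no role in which number is returned.

```python
class CentralRandomizer:
    """Toy central allocator: one queue of randomization numbers per stratum."""

    def __init__(self, schedule):
        # schedule maps stratum -> ordered list of (rand_no, treatment)
        self._schedule = schedule
        self._next = {stratum: 0 for stratum in schedule}

    def randomize(self, site, stratum):
        """Return the next available randomization number in the stratum."""
        position = self._next[stratum]
        slots = self._schedule[stratum]
        if position >= len(slots):
            raise RuntimeError(f"No randomization numbers left in {stratum!r}")
        self._next[stratum] = position + 1
        rand_no, _treatment = slots[position]
        return rand_no  # the treatment stays hidden from the blinded site

# Hypothetical schedule with two strata.
example_schedule = {
    "mild": [(1001, "A"), (1002, "P"), (1003, "P"), (1004, "A")],
    "severe": [(2001, "P"), (2002, "A"), (2003, "A"), (2004, "P")],
}

irt = CentralRandomizer(example_schedule)
print(irt.randomize(site=101, stratum="mild"))    # 1001
print(irt.randomize(site=202, stratum="mild"))    # 1002
print(irt.randomize(site=101, stratum="severe"))  # 2001
```

Sites 101 and 202 draw consecutive numbers from the same stratum, which is what distinguishes central randomization from randomization by site.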

References: SAS/STAT User's Guide The PLAN Procedure

Monday, April 04, 2022

Common Issues in Implementing Randomization and Blinding

Randomization and blinding are two techniques to help prevent (conscious or unconscious) bias in clinical trials and they are the cornerstone of the randomized, controlled clinical trials (RCTs) and in FDA's terms, the cornerstone of the adequate & well-controlled clinical trials (A&WCs). As stated in FDA's Good Review Practice: Clinical Review of Investigational New Drug Applications:

Randomization and blinding are the two principal means of reducing bias and ensuring validity of trial conclusions. Randomization helps protect against the possibility that differences between groups at baseline will lead to outcome differences that might mistakenly be attributed to drug effect. Blinding protects against the possibility that differences in the on-trial treatment or assessment of subjects will lead to spurious outcome differences that are mistakenly attributed to a drug effect.

In the context of clinical trial design, randomization is defined as the allocation of patients to the investigational drug and control arms by chance. Randomization is intended to prevent any systematic difference between patients assigned to the treatments being compared and is a critical assumption for valid statistical comparisons. It is also intended to produce groups that are comparable (statistically balanced) with respect to both known and unknown factors.

While every effort is made to prevent mistakes in implementing the randomization, some randomization errors are inevitable. Here are some of the randomization errors that may be seen in clinical trials.

Ineligible subjects are randomized: Clinical trials contain a screening period for verifying the eligibility of the study participants. All inclusion and exclusion criteria are checked during the screening period. If a subject meets all inclusion and exclusion criteria, the subject is eligible to be randomized. Sometimes, subjects are thought to be eligible for randomization but are only later found to be ineligible for one or more entry criteria. When ineligible subjects are randomized into the study and receive the assigned study treatments, the subjects are considered to be in the study. According to the intention-to-treat principle, the subjects will be included in the analyses regardless of the violation of the inclusion or exclusion criteria. If the criteria that are violated have an impact on the efficacy evaluation, the subjects may be excluded from the per-protocol population, and sensitivity analyses are conducted with the per-protocol population to assess the robustness of the results from the primary analysis.

Choosing the wrong stratum for randomization: For clinical trials with stratified randomization, the randomization is executed within each stratum. When a new subject is eligible to be randomized, the next available randomization number in the corresponding stratum (for example, based on the subject's gender, baseline disease severity category,...) is allocated to the subject.

It is not uncommon for investigational sites to select an incorrect stratum for the randomization, especially when the strata information requires additional derivation and calculation. For example, if whether a subject is using one class of background medications is a stratification factor, the information about the use of that class of background medication may need to be derived.

If a randomization stratification factor is measured more than one time, which measure will be used for randomization needs to be clearly stated in the protocol. If a spirometry parameter (for example, % predicted FEV1 >= 50% versus <50%) is used as a randomization stratification factor and spirometry tests are performed at both screening and baseline visits, the protocol needs to be specific regarding whether the results from screening visit or the baseline visit will be used for randomization - typically, the measures at the baseline visit should be used for randomization. If a laboratory parameter is used for randomization and there are both local lab and central lab, the protocol needs to be specific regarding which lab results will be used for randomization - typically the central lab results at the baseline will be used for randomization unless the central lab results can not be obtained in time for the randomization.
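One way to reduce stratum selection errors is to derive the stratum from the source values with an explicit rule instead of asking the site to pick it. The Python sketch below is an illustration only; the function names and stratum labels are hypothetical, and the rules simply encode the "typical" choices described above (baseline visit value, central lab preferred over local lab).

```python
def fev1_stratum(pct_predicted_at_baseline):
    """Derive the stratum from % predicted FEV1 measured at the baseline visit."""
    if pct_predicted_at_baseline >= 50:
        return "FEV1 50% or above"
    return "FEV1 below 50%"

def lab_value_for_randomization(central_result, local_result):
    """Prefer the central lab result; fall back to local if it is unavailable."""
    if central_result is not None:
        return central_result
    return local_result

print(fev1_stratum(62.5))                       # FEV1 50% or above
print(fev1_stratum(49.9))                       # FEV1 below 50%
print(lab_value_for_randomization(None, 0.18))  # 0.18 (central not back in time)
```

Whatever rule is chosen, it should match the protocol wording exactly, including how the boundary value (here, exactly 50%) is handled.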

See previous post "Handling Randomization Errors in Clinical Trials with Stratified Randomization"

Randomize the patients too early before all eligibility criteria are met: The investigator rushed to go to the randomization system (IRT) and randomized the subject to trigger the downstream activities, then realized that one or more screening results were still pending.

Once the subject is randomized, it can’t be undone in the randomization system (IRT system). However, the site can hold on to the randomization information obtained and wait for the last pieces of the screening results to confirm the eligibility. If the last piece of the screening results confirms that the subject is eligible to be randomized, the previously obtained randomization information will then be used. The subject can move on to initiate the assigned study treatment. If the last piece of the screening results indicates that the subject is ineligible to be randomized, we will then need to decide if the subject is allowed to be in the study. If so, it will become the situation mentioned in the previous section "Ineligible subjects are randomized".

In either situation, a protocol deviation needs to be recorded to document this incident.

The PI was practicing the randomization system to see how the randomization works but accidentally randomized the subject in a live system. In this situation, there was no actual and real subject to be randomized. The subject information entered into the randomization system was not real, but one of the randomization numbers was assigned and treatment assignment was used.

While this subject may remain in the randomization system (IRT), the subject is fake and should be removed from the downstream clinical database. This is usually a rare event and therefore has no big impact on the integrity of the original randomization.

Randomization system (IRT system) is down at the time of randomization or the internet is down: sometimes, the randomization needs to be performed immediately after the last eligibility criterion is confirmed. It is critical to have immediate access to the randomization information in order to randomize the subject in time for initiating the randomized treatment. However, it could happen that the IRT system is down or the internet is down when the randomization number and treatment assignments are needed.

In this situation, the advice is to have a backup manual randomization system (for example, calling an unblinded person or group).

Dispense the incorrect drug kit: nowadays, the randomization system is embedded in the system for clinical trial supplies (IRT system). In addition to the treatment assignments, a separate drug kit list will be generated. When a subject is randomized and a treatment group is assigned, the drug kits that are corresponding to the assigned treatment will be allocated and dispensed to the subject.

See a previous post "Monitoring the double-blind study: unblinded pharmacist, unblinded monitor, and drug kit"

Due to human error, it is possible to have the correct randomization information but dispense the incorrect kit numbers. When this happens, it is advisable to verify with the clinical trial supply manager (who is unblinded) whether the incorrect drug kit is for the same treatment group as the drug kit that was supposed to be dispensed (don't communicate about the actual treatment group). It is less of an issue if the incorrect kit numbers are in the assigned treatment group.

If the incorrectly dispensed drug kits are not in the assigned treatment group, the subject has received the incorrect treatment. For the statistical analysis, the subject will be included in the intention-to-treat analysis and will be analyzed in the randomized treatment group (so-called 'as randomized'). The subject can be excluded from the per-protocol population for sensitivity analysis.

Recording the randomization date/time (local time versus backend system time): When a subject is randomized in the IRT system, a randomization message or printout, or randomization report will indicate the subject number, randomization number, the stratification factors used for randomization, randomization date/time. The investigator can record the randomization information in the case report form. Only blinded information can be included in the randomization report.

One issue for this report is the randomization date/time - is it based on the IRT system date/time or the local date/time? The local date/time should be used as the randomization date/time. If the IRT system is located in the UK and the subject is randomized in the US, the local and system time can differ by 5-8 hours. Local date/time, not the system date/time, should always be used as the randomization date/time.
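The difference matters most around midnight, when the system date and the local date are not the same calendar day. Here is a minimal Python sketch, assuming the backend stores the timestamp in UTC and the site's UTC offset is known; the timestamp and the UTC-4 offset are made up for illustration (a production system would use proper time zone rules and not a fixed offset).

```python
from datetime import datetime, timedelta, timezone

def to_site_local(system_time_utc, site_utc_offset_hours):
    """Convert a UTC backend timestamp to the site's local date/time."""
    site_tz = timezone(timedelta(hours=site_utc_offset_hours))
    return system_time_utc.astimezone(site_tz)

# Hypothetical backend timestamp recorded in UTC.
system_time = datetime(2022, 4, 4, 1, 30, tzinfo=timezone.utc)

# Hypothetical US site at UTC-4.
local_time = to_site_local(system_time, site_utc_offset_hours=-4)

print(system_time.isoformat())  # 2022-04-04T01:30:00+00:00
print(local_time.isoformat())   # 2022-04-03T21:30:00-04:00
```

In this example the system date is April 4 while the subject was actually randomized on the evening of April 3 local time, which would shift any study day calculated from the randomization date.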

The same subject is randomized twice (unless it is a micro-randomized trial)

The majority of the randomization errors that occur in studies are not reported in publications and regulatory submissions, so randomization issues appear to be less frequent than they actually are. Some examples of randomization errors can still be found in the literature:

This article by Downs et al "Some practical problems in implementing randomization" documented some situations with randomization errors.
In a Takeda study SAP, a section was included to describe the handling of randomization errors.

In the recent FDA advisory committee meeting to discuss Amylyx Pharmaceuticals' ALS drug, the randomization issues in their randomized trial (the CENTAUR trial) were discussed. Here is what was said in the presentation:

"To start, at the beginning of CENTAUR, a randomization implementation problem was identified and addressed by the unblinded statistician. Let’s walk through the details. In CENTAUR, kits were shipped one by one after successful screening visits. While preparing for the first Data Safety Monitoring Board meeting in November 2017, the unblinded statistician found that the initial 18 study kits shipped were all active. This was due to an error at the distribution center. They proceeded to instruct the distribution center to balance these 18 kits by shipping a block of 9 placebo kits to maintain randomization. After correction, the 2:1 active:placebo ratio was maintained. The unblinded statistician notified Amylyx of this issue in January 2020, two months after study unblinding in November 2019. Participants, investigators, and study staff were never unblinded due to this error. Upon notification, Amylyx initiated a thorough investigation of the root cause, in consultation with the unblinded statistician and the distribution center. Amylyx also consulted with external statisticians to determine the best approach to assess the impact. The statisticians recommended a sensitivity analysis to exclude the participants affected by the error."

In a paper by Douglas et al, "Fluid Response Evaluation in Sepsis Hypotension and Shock: A Randomized Clinical Trial", 4 subjects with randomization errors were excluded from the intention-to-treat analyses, although there was no mention of what kind of randomization errors they were.
In a paper by Martinez et al, "Treatment of Persistent Cough in Subjects with Idiopathic Pulmonary Fibrosis (IPF) with Gefapixant, a P2X3 Antagonist, in a Randomized, Placebo-Controlled Clinical Trial", a programming error in IWRS caused the randomization errors.
