Multiple Imputation by Chained Equations with Predictive Mean Matching for Cluster Randomized Trials using mice and miceadds in R

A journal article recently published in Clinical Trials caught my eye: How is missing data handled in cluster randomized trials? A review of trials published in the NIHR Journals Library 1997-2024. I immediately knew this was going to be a poor report card. Indeed, the abstract summarizes:

Among the 110 identified cluster randomized controlled trials, 45% (50/110) did not report or take any action on missing data in either primary analysis or sensitivity analysis. In total, 75% (82/110) of the identified cluster randomized controlled trials did not impute missing values in their primary analysis. Advanced methods like multiple imputation were applied in only 15% (16/110) of primary analyses and 28% (31/110) of sensitivity analyses. On the contrary, the review highlighted that missing data handling methods have evolved over time, with an increasing adoption of multiple imputation since 2017.

It is promising to see that adoption of multiple imputation is increasing, but almost half of all studied cluster trials did not address missing data at all! BIG YIKES?!? Honestly, I have to say I am not at all surprised. Missing data is hard. Cluster randomized trials are hard. Together…

Do you ever feel like you just can’t do anything right?! Well, *do I have* more problems for you…

Working on cluster randomized trials introduces a slew of additional considerations, even in the presence of “simpler” parallel two-arm designs with completely observed data. Do I have enough clusters in each arm to reliably estimate between-cluster variances? What about the number of individuals per cluster to estimate within-cluster variances? What if I have repeated measures for each subject, so I have additional correlation to worry about? WHAT ABOUT MY TREATMENT EFFECT?!!! Should I be estimating cluster-level effects or individual-level effects? How do I incorporate cluster-level and individual-level covariates into my models? WHAT DO MY TREATMENT EFFECTS EVEN REPRESENT?!!!

If you are going to address missing data in your cluster trial, you need to make further considerations and assumptions. We can’t even evaluate most of these assumptions. Are my data MCAR, MAR, or MNAR? If I think my data is missing not at random, what assumptions do I feel I can reasonably make about the missing mechanism? What if I’m wrong? What are reasonable sensitivity analyses to perform? What auxiliary variables do I need to include in my imputation models? Should they be at the cluster-level or the individual-level? Perhaps most importantly, how do I properly capture clustering using multiple imputation?!!! Maybe I could borrow observed responses from similar participants in the same cluster… But what if my clusters are small? What if my clusters are large? Does my strategy need to change? (SPOILER ALERT: yes)

No wonder multiple imputation was applied in only 15% of the primary analyses studied. Not to mention that two-level predictive mean matching, which I personally think is a very elegant and pragmatic approach to imputing ordinal data, was not even available as an add-on to the popular missing data R package “mice” until 2016. If you wanted to use existing tools before that, you would often be faced with suboptimal imputation using a continuous model that doesn’t capture the bounded, discrete nuances of your data. Real talk: It’s also possible, particularly in the presence of relatively few missing observations, that the statisticians on some of these cluster trials “not addressing missing data” weighed the risk of bias due to the missing data against their confidence in the required assumptions or inputs, or the effort required to implement them, and decided it was not worthwhile. Practical considerations are important too.

Anyhow, while I do think there is still a long way to go to make missing data methods for cluster trials accessible, the good news is that several, different two-level (level 1: individual; level 2: cluster) multiple imputation methods are now available in the popular and easy-to-use mice and add-on miceadds packages in R. This includes predictive mean matching! Yay! In this blog post, I will demonstrate how to impute missing ordinal data in a cluster randomized trial using two-level predictive mean matching via the mice and miceadds packages so you can avoid contributing to bad report cards! More yay – we love avoiding citation for bad practices!


Practical inference for win measures via U-statistic decomposition

Introduction

In a previous blog post, I described how complex estimation of U-statistic variance can be simplified using a “structural component” approach introduced by Sen (1960). The structural component approach is very similar to the leave-one-out jackknife. Essentially, the idea behind both of these approaches is that we decompose the statistic into individual contributions. Here, these are referred to as “structural components,” and in the LOO jackknife, these are referred to as “pseudo-values” or sometimes “pseudo-observations.” Construction of these individual quantities differs somewhat conceptually, but in another blog post, I discuss their one-to-one relationship for specific cases. We can then take the sample variance of these individual contributions to estimate the variance of the statistic.

Estimators for increasingly popular win measures, including the win probability, net benefit, win odds, and win ratio, are obtained using large-sample two-sample U-statistic theory. Variance estimators are complex for these measures, requiring the calculation of multiple joint probabilities.

Here, I demonstrate how variances for win measures can be practically estimated in two-arm randomized trials using a structural component approach. Results and estimators are provided for the win probability, the net benefit, and the win odds. For simplicity, only a single outcome is considered. However, extension to hierarchical composite outcomes is immediate with use of an appropriate kernel function.
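To make the idea concrete, here is a minimal base R sketch of how this might look for a single outcome. It is my own illustration under stated assumptions rather than the estimators derived in the full post: larger responses are assumed to be better, ties count as half a win, the function name `win_measures` and the toy data are hypothetical, and the standard errors for the net benefit and win odds come from a simple delta-method step.

```r
# Kernel for a single outcome, ASSUMING larger values are better:
# 1 for a win, 0.5 for a tie, 0 for a loss.
win_kernel <- function(x, y) (x > y) + 0.5 * (x == y)

# Hypothetical helper: win measures with structural-component variances.
win_measures <- function(trt, ctl) {
  m <- length(trt)
  n <- length(ctl)
  phi <- outer(trt, ctl, win_kernel)  # m x n matrix of pairwise comparisons
  v10 <- rowMeans(phi)                # structural components, treatment arm
  v01 <- colMeans(phi)                # structural components, control arm
  wp <- mean(phi)                     # win probability estimate
  # Sample variances of the components, scaled by the arm sizes
  var_wp <- var(v10) / m + var(v01) / n
  list(
    win_prob       = wp,
    se_win_prob    = sqrt(var_wp),
    net_benefit    = 2 * wp - 1,
    se_net_benefit = 2 * sqrt(var_wp),           # delta method (assumption)
    win_odds       = wp / (1 - wp),
    se_win_odds    = sqrt(var_wp) / (1 - wp)^2   # delta method (assumption)
  )
}

# Toy data, made up for illustration only
trt <- c(3, 5, 4, 5, 2, 4)
ctl <- c(2, 3, 3, 1, 4)
res <- win_measures(trt, ctl)
str(res)
```

The only piece that should need to change for a hierarchical composite outcome is `win_kernel`, which would then compare pairs of participants across the ordered components instead of a single response.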


Nonparametric neighbours: U-statistic structural components and jackknife pseudo-observations for the AUC

Two of my recent blog posts focused on two different, but as we will see related, methods which essentially transform observed responses into a summary of their contribution to an estimate: structural components resulting from Sen’s (1960) decomposition of U-statistics and pseudo-observations resulting from application of the leave-one-out jackknife. As I note in this comment, I think the real value of deconstructing estimators in this way results from the use of these quantities, which in special (but common) cases are asymptotically uncorrelated and identically distributed, to: (1) simplify otherwise complex variance estimates and construct interval estimates, and (2) apply regression methods to estimators without an existing regression framework.

As discussed by Miller (1974), pseudo-observations may be treated as approximately independent and identically distributed random variables when the quantity of interest is a function of the mean or variance, and more generally, any function of a U-statistic. Several other scenarios where these methods are applicable are also outlined. Many estimators of popular “parameters” can actually be expressed as U-statistics. Thus, these methods are quite broadly applicable. A review of basic U-statistic theory and some common examples, notably the difference in means or the Wilcoxon Mann-Whitney test statistic, can be found within my blog post: One, Two, U: Examples of common one- and two-sample U-statistics.

As an example of use case (1), DeLong et al. (1988) used structural components to estimate the variances and covariances of the areas under multiple, correlated receiver operating characteristic curves, or multiple AUCs. Hanley and Hajian-Tilaki (1997) later referred to the methods of DeLong et al. (1988) as “the cleanest and most elegant approach to variances and covariances of AUCs.” As an example of use case (2), Andersen & Pohar Perme (2010) provide a thorough summary of how pseudo-observations can be used to construct regression models for important survival parameters like survival at a single time point and the restricted mean survival time.

Now, structural components are restricted to U-statistics while pseudo-observations may be used more generally, as discussed. But, if we construct pseudo-observations for U-statistics, one of several “valid” scenarios, what is the relationship between these two quantities? Hanley and Hajian-Tilaki (1997) provide a lovely discussion of the equivalence of these two methods when applied to the area under the receiver operating characteristic curve or simply the AUC. This blog post follows their discussion, providing concrete examples of computing structural components and pseudo-observations using R, and demonstrating their equivalence in this special case.
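As a small preview of that equivalence, the base R sketch below computes both quantities for a toy data set. It rests on my own assumptions: the scores are made up, ties count as one half, and each pseudo-observation is formed by leaving out one observation within its own group (diseased or non-diseased) while the other group is left intact.

```r
# AUC kernel: 1 if the diseased score exceeds the non-diseased score,
# 0.5 for ties, 0 otherwise.
auc_kernel <- function(x, y) (x > y) + 0.5 * (x == y)
auc_hat <- function(x, y) mean(outer(x, y, auc_kernel))

# Hypothetical test scores, for illustration only
x <- c(0.9, 0.7, 0.6, 0.8, 0.4)  # diseased
y <- c(0.5, 0.3, 0.6, 0.2)       # non-diseased
m <- length(x)
n <- length(y)

# Structural components: average the kernel over the opposite group
phi <- outer(x, y, auc_kernel)
v10 <- rowMeans(phi)  # one per diseased subject
v01 <- colMeans(phi)  # one per non-diseased subject

# Leave-one-out jackknife pseudo-observations, one group at a time
theta <- auc_hat(x, y)
pseudo_x <- sapply(seq_len(m), function(i) m * theta - (m - 1) * auc_hat(x[-i], y))
pseudo_y <- sapply(seq_len(n), function(j) n * theta - (n - 1) * auc_hat(x, y[-j]))

# Compare the two sets of quantities within each group
all.equal(pseudo_x, v10)
all.equal(pseudo_y, v01)

# Variance estimate of the AUC built from either set of quantities
var_auc <- var(v10) / m + var(v01) / n
```

The agreement follows from the fact that the AUC estimate is the simple average of the structural components within either group, so dropping one subject removes exactly that subject's component from the average.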


Resampling, the jackknife, and pseudo-observations

Resampling methods approximate the sampling distribution of a statistic or estimator. In essence, a sample taken from the population is treated as a population itself. A large number of new samples, or resamples, are taken from this “new population”, commonly with replacement, and within each of these resamples, the estimate of interest is re-obtained. A large number of these estimate replicates can then be used to construct the empirical sampling distribution from which confidence intervals, bias, and variance may be estimated. These methods are particularly advantageous for statistics or estimators for which no standard methods apply or are difficult to derive.
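The recipe above can be sketched in a few lines of base R. This is only an illustration using with-replacement resampling (the bootstrap), with a simulated sample, the median as the statistic of interest, and an arbitrary choice of 2000 resamples, all of which are my assumptions.

```r
set.seed(2024)

# Hypothetical observed sample, treated as the "new population"
x <- rexp(50, rate = 1)

# Draw B resamples with replacement and re-obtain the estimate each time
B <- 2000
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# Summaries of the empirical sampling distribution
boot_se   <- sd(boot_medians)                         # standard error
boot_bias <- mean(boot_medians) - median(x)           # bias estimate
boot_ci   <- quantile(boot_medians, c(0.025, 0.975))  # percentile interval
```

The median is a convenient example here because its standard error has no simple closed form, which is exactly the situation where resampling earns its keep.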

The jackknife is a popular resampling method, first introduced by Quenouille in 1949 as a method of bias estimation. In 1958, jackknifing was both named by Tukey and expanded to include variance estimation. A jackknife is a multipurpose tool, similar to a Swiss Army knife, that can get its user out of tricky situations. Efron later developed arguably the most popular resampling method, the bootstrap, in 1979 after being inspired by the jackknife.

In Efron’s (1982) book The jackknife, the bootstrap, and other resampling plans, he states,

Good simple ideas, of which the jackknife is a prime example, are our most precious intellectual commodity, so there is no need to apologize for the easy mathematical level.

Although resampling methods have existed since the 1940s, they were long infeasible in practice because of the computational power required to resample and recalculate estimates many times. With today’s computing power, the uncomplicated yet powerful jackknife, and resampling methods more generally, should be a tool in every analyst’s toolbox.
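To show just how uncomplicated it is, here is a minimal base R sketch of the leave-one-out jackknife. The function name `jackknife`, the toy data, and the choice of the plug-in variance (divisor n) as the estimator are my own, for illustration only.

```r
# Hypothetical helper: leave-one-out jackknife for a generic estimator
jackknife <- function(x, estimator) {
  n <- length(x)
  theta_hat <- estimator(x)
  # Re-compute the estimate with each observation left out in turn
  theta_loo <- sapply(seq_len(n), function(i) estimator(x[-i]))
  # Pseudo-observations: each observation's contribution to the estimate
  pseudo <- n * theta_hat - (n - 1) * theta_loo
  list(
    estimate       = theta_hat,
    bias           = (n - 1) * (mean(theta_loo) - theta_hat),
    bias_corrected = mean(pseudo),
    se             = sqrt(var(pseudo) / n),
    pseudo         = pseudo
  )
}

# Toy data, made up for illustration
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8)

# Plug-in variance with divisor n, a biased estimator
biased_var <- function(v) mean((v - mean(v))^2)
jk <- jackknife(x, biased_var)

# Compare the bias-corrected estimate with the usual unbiased variance
all.equal(jk$bias_corrected, var(x))

# For the sample mean, the pseudo-observations are the data themselves
all.equal(jackknife(x, mean)$pseudo, x)
```

Two classic sanity checks are built into the example: jackknifing the plug-in variance recovers the familiar unbiased sample variance, and jackknifing the mean returns the original observations as pseudo-observations.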
