Overview
In a previous blog post, Comprehending complex designs: Cluster randomized trials, I walked through the nuances and challenges of cluster randomized trials (CRTs). CRTs randomize groups of individuals, such as families or clinics, rather than individuals themselves. They are used for a variety of reasons, including evaluating the spread of infectious disease within a household or evaluating whether a new intervention is effective or feasible in real-world settings. Participants within the same cluster may share the same environment or care provider, for example, which leads to correlated responses. If this intracluster correlation is not accounted for, variances will be underestimated and inference methods will not have the operating characteristics (e.g., type I error rate) we expect. Linear mixed models represent one approach for obtaining cluster-adjusted estimates, and their application was demonstrated using data from the SHARE cluster trial, which evaluated different sex education curricula (interventions) in schools (clusters).
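To make the variance underestimation concrete, here is a minimal sketch of the standard design effect for equal cluster sizes, 1 + (m - 1) * ICC. The cluster size and intracluster correlation below are hypothetical values chosen for illustration, not numbers from the SHARE trial.

```r
# Design effect for equal cluster sizes: 1 + (m - 1) * icc.
# Both inputs are hypothetical, illustrative values.
m   <- 20    # individuals per cluster (assumed)
icc <- 0.05  # intracluster correlation (assumed)

# Ratio of the true variance of an arm mean to the variance
# computed as if all observations were independent
design_effect <- 1 + (m - 1) * icc

# Factor by which a naive standard error is too small
se_inflation <- sqrt(design_effect)

print(c(design_effect = design_effect, se_inflation = se_inflation))
```

With these values the design effect is 1.95, so even a modest intracluster correlation means the naive standard error is too small by a factor of roughly 1.4.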
Individually randomized group treatment trials (IRGTs) are closely related to CRTs, but can require slightly more complex analytic strategies. IRGT designs arise naturally when individuals do not initially belong to a group or cluster, but are individually randomized to receive a group-based intervention or to receive treatment through a shared agent. As a result, individuals are independent at baseline, but intracluster correlation can increase with follow-up as individuals interact within their respective group or with their shared agent. IRGTs can be “fully-nested,” meaning that both the control and experimental conditions feature a group-based intervention, or “partially-nested,” meaning that the experimental condition is group-based while the control arm is not. A fully-nested IRGT may be used to compare structured group therapy versus group discussion for mental health outcomes, for example. If both arms feature groups with the same intracluster correlation, analysis of fully-nested IRGTs is practically identical to that of CRTs. In comparison, a partially-nested IRGT may be used to compare group therapy versus individual standard of care or a waitlist control. Analysis of partially-nested IRGTs is more complex because intracluster correlation is present in only one arm, so methods must be adapted to handle heterogeneous covariance or correlation matrices. Similar considerations apply to a fully-nested IRGT whose arms do not share the same intracluster correlation.
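The partially-nested structure described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the sample sizes, variances, intracluster correlation, and treatment effect are hypothetical, and coding each control participant as their own singleton "group" is one common convention rather than the only option.

```r
# Sketch of a partially-nested IRGT data-generating mechanism (base R only).
# All names and parameter values are illustrative assumptions.
set.seed(42)
n_per_arm  <- 60    # individuals randomized to each arm
group_size <- 6     # individuals per therapy group (experimental arm only)
n_groups   <- n_per_arm / group_size
icc_trt    <- 0.10  # intracluster correlation in the experimental arm
var_trt    <- 1     # total outcome variance, experimental arm
var_ctrl   <- 1     # total outcome variance, control arm
effect     <- 0.3   # mean difference, experimental minus control

# Control arm: no groups, so outcomes are independent draws
y_ctrl <- rnorm(n_per_arm, mean = 0, sd = sqrt(var_ctrl))

# Experimental arm: a shared group effect induces correlation within groups
group_trt <- rep(seq_len(n_groups), each = group_size)
u     <- rnorm(n_groups, mean = 0, sd = sqrt(icc_trt * var_trt))
e     <- rnorm(n_per_arm, mean = 0, sd = sqrt((1 - icc_trt) * var_trt))
y_trt <- effect + u[group_trt] + e

# Control participants are coded as singleton groups ("c1", "c2", ...)
irgt <- data.frame(
  id    = seq_len(2 * n_per_arm),
  arm   = rep(c(0, 1), each = n_per_arm),
  group = c(paste0("c", seq_len(n_per_arm)), paste0("g", group_trt)),
  y     = c(y_ctrl, y_trt)
)
```

Because the group effect `u` enters only the experimental arm, the implied covariance matrix is block-structured in that arm and diagonal in the control arm, which is the heterogeneity an analysis model has to accommodate.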
To provide insight into data-generating mechanisms and inference, this blog post demonstrates how to simulate normally distributed outcomes from (1) a two-arm cluster randomized trial and (2) a two-arm, partially-nested individually randomized group treatment trial. I use only base R for data generation, so these approaches can be widely implemented. Simulating complex trial designs is helpful for sample size calculation and for understanding the operating characteristics of inference methods in different scenarios, such as small samples. The simulated data are then analyzed with linear mixed models fit using the nlme package and visualized with ggplot2.
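As a preview of the general idea, the following is a minimal sketch of a two-arm CRT simulated with a random cluster intercept in base R. It is not necessarily the exact approach developed later in the post, and every parameter value and object name here is an assumption made for illustration.

```r
# Minimal sketch: two-arm CRT with a random cluster intercept (base R only).
# All parameter values and names are illustrative assumptions.
set.seed(2024)
n_clusters   <- 20    # total clusters, half randomized to each arm
cluster_size <- 15    # individuals per cluster
icc          <- 0.05  # intracluster correlation
total_var    <- 1     # total outcome variance
effect       <- 0.3   # mean difference, intervention minus control

# Split the total variance into between- and within-cluster parts
sigma2_b <- icc * total_var
sigma2_e <- (1 - icc) * total_var

# Randomize whole clusters, then expand to the individual level
cluster_id     <- rep(seq_len(n_clusters), each = cluster_size)
arm_by_cluster <- sample(rep(0:1, each = n_clusters / 2))
arm            <- arm_by_cluster[cluster_id]

# Outcome = arm effect + shared cluster effect + individual error
b <- rnorm(n_clusters, mean = 0, sd = sqrt(sigma2_b))
e <- rnorm(n_clusters * cluster_size, mean = 0, sd = sqrt(sigma2_e))
y <- effect * arm + b[cluster_id] + e

crt <- data.frame(cluster = factor(cluster_id), arm = arm, y = y)
```

A data set like `crt` could then be passed to a random-intercept model, for example `nlme::lme(y ~ arm, random = ~ 1 | cluster, data = crt)`, and wrapping the whole block in a loop gives a simple way to estimate power or type I error by simulation.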
