Practical inference for win measures via U-statistic decomposition

Introduction

In a previous blogpost, I described how complex estimation of U-statistic variance can be simplified using a “structural component” approach introduced by Sen (1960). The structural component approach is very similar to the leave-one-out jackknife. Essentially, the idea behind both of these approaches is that we decompose the statistic into individual contributions. Here, these are referred to as “structural components,” and in the LOO jackknife, these are referred to as “pseudo-values” or sometimes “pseudo-observations.” Construction of these individual quantities differs conceptually somewhat, but in another blogpost, I discuss their one-to-one relationship for specific cases. We can then take the sample variance of these individual contributions to estimate the variance of the statistic.

Estimators for increasingly popular win measures, including the win probability, net benefit, win odds, and win ratio, are obtained using large-sample two-sample U-statistic theory. Variance estimators are complex for these measures, requiring the calculation of multiple joint probabilities.

Here, I demonstrate how variance estimation for win measures can be practically estimated in two-arm randomized trials using a structural component approach. Results and estimators are provided for the win probability, the net benefit, and the win odds. For simplicity, only a single outcome is considered. However, extension to hierarchical composite outcomes is immediate with use of an appropriate kernel function.

Setup

Consider a two-arm randomized trial that allocates N_1 individuals to the treatment arm and N_0 individuals to the control arm. Let X_{1i} represent the i^{th} response in the treated arm, and X_{0j} represent the j^{th} response in the control arm. Assume that responses are i.i.d. within each arm so X_{1i} \sim F_1 and X_{0j} \sim F_0, and that greater responses are indicative of better health (e.g., HDL in cardiovascular disease, CD4 counts in infectious disease).

Win measures

Win measures rely on pairwise comparison of participants in two arms. That is, each treatment participant is compared to each control participant. Whoever responds better within a pair is the winner. If responses are the same, a tie occurs. Win measures tally wins, ties, and losses between trial arms to quantify treatment effects.

Win probability

In two-arm trials, the win probability is the probability that a randomly selected treated participant has a better response than a randomly selected control participant. Any ties are counted as half win and half loss, so the win probability is generally defined as:

(1)   \begin{equation*} \theta = \Pr(X_{1i} > X_{0j}) + 0.5 \Pr(X_{1j} = X_{0j}). \end{equation*}

\theta is framed in terms of the treatment participant responding better than the control participant, or a “treatment win,” so we refer to \theta as the “treatment win probability.” We could conversely frame \theta in terms of a “control win.” However, because a control win is a treatment loss, and vice versa, it follows that the control win probability is simply the complement of the treatment win probability: 1-\theta.

\theta = 0.5 suggests that the probability that one participant responds better than the other is equal to the flip of a fair coin, or no treatment effect. \theta > 0.5 suggests treatment wins more than expected by chance, or treatment benefit. \theta < 0.5 suggests treatment wins less than expected by chance, or treatment harm.

Net benefit

The net benefit, or perhaps “win difference,” is the difference between the treatment win probability and control win probability:

(2)   \begin{equation*} \Delta = \Pr(X_{1i} > X_{0j}) + 0.5 \Pr(X_{1i} = X_{0j}) - \Pr(X_{1i} < X_{0j}) - 0.5 \Pr(X_{1i} = X_{0j}) \end{equation*}

or equivalently, \Delta = \theta - (1-\theta) and \Delta = 2 \theta - 1. Ties cancel with differencing, so the net benefit also reduces to:

(3)   \begin{equation*} \Delta = \Pr(X_{1i} > X_{0j}) - \Pr(X_{1i} < X_{0j}). \end{equation*}

For a single binary outcome with treatment and control success probabilities \pi_1 and \pi_0, the net benefit is equal to the absolute risk difference:

(4)   \begin{equation*} \Delta = \pi_1 (1-\pi_0) - \pi_0 (1-\pi_1) = \pi_1 - \pi_0. \end{equation*}

\Delta = 0 suggests the probability of winning is the same for both arms, \Delta > 0 suggests treatment benefit, and \Delta < 0 suggests treatment harm.

Win odds

The win odds is the ratio of the treatment win probability to the control win probability:

(5)   \begin{equation*} \lambda = \frac{\Pr(X_{1i} > X_{0j})+0.5\Pr(X_{1i}=X_{0j})}{\Pr(X_{1i} < X_{0j}) + 0.5 \Pr(X_{1i} = X_{0j})} \end{equation*}

or equivalently, \lambda = \theta/(1-\theta). Estimators for the win probability can be translated to the win odds through use of the \delta-method.

\lambda represents treatment wins:losses, so \lambda = 1 suggests no treatment effect, \lambda > 1 suggests treatment benefit, and \lambda < 1 suggests treatment harm.

Two-sample U-statistic theory

Win measures are estimated by comparing all pairs consisting of one treatment response and one control response. In this case, there are N1 \times N_0 such pairs. A score is assigned to each pair depending on whether it is a treatment win, tie, or loss. Using different scoring systems will result in estimates of different win measures. The mean of these scores provides a point estimate of the win measure.

You might recognize that this point estimator is a U-statistic! Yes indeed, estimators of the net benefit and win probability take the form of two-sample U-statistics:

(6)   \begin{equation*} U = \frac{1}{N_1 N_0} \sum_{i=1}^{N_1} \sum_{j=1}^{N_0} \phi(X_{1i}, X_{0j}) \end{equation*}

where \phi(X_{1i}, X_{i0}) is a valid two-sample kernel function.

For the win probability,

(7)   \begin{equation*} \phi_{\theta}(X_{1i}, X_{0j}) = \pmb{1}(X_{1i} > X_{0j}) + 0.5 ~\pmb{1}(X_{1i} = X_{0j}) \end{equation*}

where \pmb{1}(\cdot) is an indicator function taking the value 1 when its argument is true and 0 otherwise. \phi_{\theta} assigns a score of +1 to treatment wins (control losses), +0.5 to ties, and 0 to treatment losses (control wins).

For the net benefit,

(8)   \begin{equation*} \phi_{\Delta}(X_{1i}, X_{0j}) = \pmb{1}(X_{1i} > X_{0j}) - \pmb{1}(X_{1i} < X_{0j}) \end{equation*}

\phi_{\theta} assigns a score of +1 to treatment wins (control losses), +0 to ties, and -1 to treatment losses (control wins).

Large-sample theory provides the variance for two-sample U-statistics of this general form,

(9)   \begin{equation*} \text{Var}(U) = \frac{\sigma_{10}^2}{N_1} + \frac{\sigma_{01}^2}{N_0}. \end{equation*}

The variance components \sigma_{10}^2 and \sigma_{01}^2 are the cause of all pain in variance estimation! The form of \text{Var}(U) results from Hoeffding decomposition, which I won’t go into detail about here. The basic idea is that the U-statistic can be split into multiple uncorrelated terms. The variance of these terms provides the variance equation you see above (after the asymptotic zeroing of higher-order terms).

If we take the expectation of the kernel \phi(X_{1i}, X_{0j}) conditional on X_{0j}, the resulting quantity is independent of X_{0j}. This is because when we condition on a quantity, we treat it as fixed (constant). For a generic kernel function, let’s define these quantities for each arm:

(10)   \begin{equation*} \phi_{10}(X_{1i}) = \text{E}[\phi(X_{1i}, X_{0j}) \mid X_{1i}] \end{equation*}

and

(11)   \begin{equation*} \phi_{01}(X_{0j}) = \text{E}[\phi(X_{1i}, X_{0j}) \mid X_{0j}] \end{equation*}

Because the X_{1i} and X_{0j} are i.i.d. within their respective arm, it follows that \phi_{10}(X_{1i}) and \phi_{01}(X_{0j}) are similarly i.i.d. The variance components are then provided by \sigma_{10}^2 = \text{Var}(\phi_{10}(X_{1i})) and \sigma_{01}^2 = \text{Var}(\phi_{01}(X_{0j})).

I’ve figured out the exact form of these variance components for specific U-statistics in previous blog posts. It’s really tricky, even for what you might expect are “simple” U-statistics!!!

Finally, U-statistic U is unbiased and asymptotically normal with mean \theta and variance \text{Var}(\hat{\theta}). Here, U=\hat{\theta} or U=\hat{\Delta} depending on the kernel used. This provides your classic large-sample 95% confidence interval estimator,

(12)   \begin{equation*} U \mp 1.96 \sqrt{\widehat{\text{Var}}(U)} \end{equation*}

Replace 1.96 with your favourite critical value!

Variance estimation via U-statistic structural components

As outlined in a previous blogpost, Sen (1960) teaches us that we can avoid the pain of variance estimation in two steps: (1) TRANSFORM observed responses X_{1i} and X_{0j} into structural components. Structural components are constructed as the within-subject mean of the (N-N_i) corresponding kernel values or “scores,” converge in probability to the quantities \phi_{10}(X_{1i}) and \phi_{01}(X_{0j}), and are approximately uncorrelated in large samples; (2) ESTIMATE the sample variance of the structural components to obtain \widehat{\text{Var}}(U).

Sen, P. K. (1960). On Some Convergence Properties of U-Statistics. Calcutta Statistical Association Bulletin, 10(1-2), 1–18. doi:10.1177/0008068319600101

In the following sections, I demonstrate how to apply these results to obtain practical variance estimators for the win probability, net benefit, and win odds. Then, I generate data and demonstrate their application.

Win probability and win odds

For the win probability, the i^{th} treated structural component corresponding to observed response X_{1i} is defined as,

(13)   \begin{equation*} W_{1i} = \frac{1}{N_0} \sum_{j=1}^{N_0} \phi_{\theta}(X_{1i}, X_{0j}) = \frac{1}{N_0} \sum_{j=1}^{N_0} \pmb{1}(X_{1j} > X_{0j}) + 0.5 ~\pmb{1}(X_{1j} = X_{0j}). \end{equation*}

or the mean of the N_0 scores resulting from pairwise comparison of X_{1i}. Again, W_{1i} approximates \phi_{10}(X_{1i}), or the expected value of the kernel conditional on X_{1i}. Similarly, the j^{th} control structural component corresponding to observed response X_{0j} is defined as,

(14)   \begin{equation*} W_{0j} = \frac{1}{N_1} \sum_{i=1}^{N_1} \phi_{\theta}(X_{1i}, X_{0j}) \end{equation*}

which approximates \phi_{01}(X_{0j}), or the expected value of the kernel conditional on X_{0j}.

Notice that the mean of the treated structural components W_{1i},

(15)   \begin{equation*} \bar{W}_{1.} = \frac{1}{N_1} \sum_{i=1}^{N_1} \left[ \frac{1}{N_0} \phi_{\theta}(X_{1i}, X_{0j}) \right] \end{equation*}

or the control structural components W_{0j},

(16)   \begin{equation*} \bar{W}_{0.} = \frac{1}{N_0} \sum_{j=1}^{N_0} \left[ \frac{1}{N_1} \sum_{i=1}^{N_1} \phi_{\theta}(X_{1i}, X_{0j}) \right] \end{equation*}

equals the U-statistic estimator of the win probability \theta,

(17)   \begin{equation*} \hat{\theta} = \frac{1}{N_1 N_0} \sum_{i=1}^{N_1} \sum_{j=1}^{N_0} \pmb{1}(X_{1j} > X_{0j}) + 0.5 ~\pmb{1}(X_{1j} = X_{0j}) \end{equation*}

and by definition E[\hat{\theta}] = \theta.

We take advantage of the fact that W_{1i} and W_{0j} approximate the \phi_{10}(X_{1i}) and \phi_{01}(X_{0j}) (and are asymptotically uncorrelated), by taking their sample variance to approximate the required variance components \sigma_{10}^2 and \sigma_{01}^2:

(18)   \begin{equation*} s_{10}^2 = \frac{1}{N_1 - 1} \sum_{i=1}^{N_1} (W_{1i} - \hat{\theta})^2 \end{equation*}

and

(19)   \begin{equation*} s_{01}^2 = \frac{1}{N_0 - 1} \sum_{j=1}^{N_0} (W_{0j} - \hat{\theta})^2. \end{equation*}

Then, an estimator of \text{Var}(\hat{\theta}) is obtained as,

(20)   \begin{equation*} \widehat{\text{Var}}(\hat{\theta}) = \frac{s_{10}^2}{N_1} + \frac{s_{01}^2}{N_0}. \end{equation*}

We can also obtain a variance estimator for the win odds, \lambda = \hat{\theta} / (1-\hat{\theta}) by applying the \delta-method:

(21)   \begin{equation*} \widehat{\text{Var}}(\ln \hat{\lambda}) = \frac{ \widehat{\text{Var}}(\hat{\theta})}{\hat{\theta}^2(1-\hat{\theta})^2}. \end{equation*}

Net benefit

For the net benefit, the i^{th} treated structural component corresponding to observed response X_{1i} is defined as,

(22)   \begin{equation*} D_{1i} = \frac{1}{N_0} \sum_{j=1}^{N_0} \phi_{\Delta}(X_{1i}, X_{0j}) = \frac{1}{N_0} \sum_{j=1}^{N_0} \pmb{1}(X_{1i} > X_{0j}) - \pmb{1}(X_{1i} < X_{0j}) \end{equation*}

or the mean of the N_0 scores resulting from pairwise comparison of X_{1i}. Again, D_{1i} approximates \phi_{10}(X_{1i}), or the expected value of the kernel conditional on X_{1i}. Similarly, the j^{th} control structural component corresponding to observed response X_{0j} is defined as,

(23)   \begin{equation*} D_{0j} = \frac{1}{N_1} \sum_{i=1}^{N_1} \phi_{\Delta}(X_{1i}, X_{0j}) \end{equation*}

which approximates \phi_{01}(X_{0j}), or the expected value of the kernel conditional on X_{0j}.

Notice that the mean of the treated structural components D_{1i},

(24)   \begin{equation*} \bar{D}_{1.} = \frac{1}{N_1} \sum_{i=1}^{N_1} \left[ \frac{1}{N_0} \phi_{\Delta}(X_{1i}, X_{0j}) \right] \end{equation*}

or the control structural components D_{0j},

(25)   \begin{equation*} \bar{D}_{0.} = \frac{1}{N_0} \sum_{j=1}^{N_0} \left[ \frac{1}{N_1} \sum_{i=1}^{N_1} \phi_{\Delta}(X_{1i}, X_{0j}) \right] \end{equation*}

equals the U-statistic estimator of the net benefit \delta,

(26)   \begin{equation*} \hat{\Delta} = \frac{1}{N_1 N_0} \sum_{i=1}^{N_1} \sum_{j=1}^{N_0} \pmb{1}(X_{1i} > X_{0j}) - \pmb{1}(X_{1i} < X_{0j}) \end{equation*}

and by definition E[\hat{\Delta}] = \Delta.

We take advantage of the fact that D_{1i} and D_{0j} approximate the \phi_{10}(X_{1i}) and \phi_{01}(X_{0j}) (and are asymptotically uncorrelated), by taking their sample variance to approximate the required variance components \sigma_{10}^2 and \sigma_{01}^2:

(27)   \begin{equation*} s_{10}^2 = \frac{1}{N_1 - 1} \sum_{i=1}^{N_1} (D_{1i} - \hat{\Delta})^2 \end{equation*}

and

(28)   \begin{equation*} s_{01}^2 = \frac{1}{N_0 - 1} \sum_{j=1}^{N_0} (D_{0j} - \hat{\Delta})^2. \end{equation*}

Then, an estimator of \text{Var}(\hat{\Delta}) is obtained as,

(29)   \begin{equation*} \widehat{\text{Var}}(\hat{\Delta}) = \frac{s_{10}^2}{N_1} + \frac{s_{01}^2}{N_0}. \end{equation*}

Notice that because \phi_{\Delta}(X_{1i}, X_{0j}) = 2 \phi_{\theta}(X_{1i}, X_{0j}) -1, analogous to the relationship between the net benefit and win probability, \widehat{\text{Var}}(D_{1i}) = 4 \widehat{\text{Var}}(W_{1i}).

Simulated examples

Variance estimation using structural components can be performed with a few lines of code in R and SAS. We generate a single binary outcome measure according to a fixed value of the win probability, and obtain point and variance estimates of the win probability, the win odds, and the net benefit using the methods described above.

Let X_{1i} \sim \text{Bernoulli}(\pi_1) and X_{0j} \sim \text{Bernoulli}(\pi_0). In this special case, the net benefit \Delta = \pi_1 - \pi_0 and it follows that the win probability \theta = (\Delta+1)/2 = (\pi_1 - \pi_0 + 1)/2. Fixing \pi_1 = 0.5 suggests that \pi_0 = 0.22 to obtain win probability \theta = 0.64 and \Delta=0.28.

# Event probability (1=treated, 0=control)
Pi1 = 0.5
Pi0 = 0.22
# Function to calculate win probability given Pi1, Pi0
Theta <- function(Pi1, Pi0){ (Pi1-Pi0+1)/2 }

# Function to calculate net benefit given Pi1, Pi0
Delta <- function(Pi1, Pi0){ (Pi1-Pi0) }

# Function to calculate win odds given Pi1, Pi0
Lambda <- function(Pi1, Pi0){ (1+(Pi1-Pi0))/(1-(Pi1-Pi0)) }
# True value of win probability
Theta(Pi1, Pi0)
## [1] 0.64
# True value of net benefit
Delta(Pi1, Pi0)
## [1] 0.28
# True value of win odds
Lambda(Pi1, Pi0)
## [1] 1.777778

Independently and identically generate N_1 = N_0 = 100 binary responses for each arm, corresponding perfect 1:1 randomization of 200 total participants.

# Sample size       (1=treated, 0=control)
N1  = 100 
N0  = 100

# Set seed!!!
set.seed(12345)
# Generate i.i.d. response in each arm
# X1i ~ Binomial(n=1, Pi1)
# X0j ~ Binomial(n=1, Pi0)
X1 <- rbinom(n=N1, size=1, prob=Pi1)
X0 <- rbinom(n=N0, size=1, prob=Pi0)
# Event (1) and non-event (0) in treated arm
prop.table(table(X1))
## X1
##    0    1 
## 0.48 0.52
# Event (1) and non-event (0) in control arm
prop.table(table(X0))
## X0
##    0    1 
## 0.67 0.33

Win probability

First, lets construct the kernel function for the win probability.

# Win Probability = Pr(X1i > X0j) + 0.5 Pr(X1i = X0j)
# kernel function for win probability Theta
kTheta <- function(X1i, X0j){ ifelse(X1i > X0j, 1, 0) + 0.5 * ifelse(X1i == X0j, 1, 0) }
# Treatment Win
kTheta(X1i=1, X0j=0)
## [1] 1
# Tie
kTheta(X1i=1, X0j=1)
## [1] 0.5
# Tie 
kTheta(X1i=0, X0j=0)
## [1] 0.5
# Treatment Loss
kTheta(X1i=0, X0j=1)
## [1] 0

Second, lets use the kernel to score or compare all treatment and control participant pairs.

# Matrix to store scores (kernel values) for each participant pair
# row = treated participant
# column = control participant
kMatrix <- matrix(NA, nrow=N1, ncol=N0)

# Double loop to compute and store all kernel values
for (i in 1:N1){
  for (j in 1:N0){
    kMatrix[i, j] = kTheta(X1[i], X0[j])
  }
}

Third, construct the treated structural components as the row means and control structural components as the column means.

W1i <- 1/N0 * rowSums(kMatrix)
W0j <- 1/N1 * colSums(kMatrix)

The means of the structural components should be the same between-arms:

# Treatment Win probability = Mean(Treated Structural Components)
theta = mean(W1i)
theta
## [1] 0.595
# Treatment Win probability = Mean(Control Structural Components)
theta = mean(W0j)
theta
## [1] 0.595

and represent our point estimate of the win probability! That is, \hat{\theta} = 0.595. Recall true \theta = 0.64.

Third, the variance is easily estimated by taking the sample variances of W1i and W0j:

# First variance component, V10=Var[E[K(X1i, X0j) | X1i]]
var10W <- var(W1i)
var10W
## [1] 0.0630303
# Second variance component, V01=Var[E[K(X1i, X0j) | X0j]]
var01W <- var(W0j)
var01W
## [1] 0.05583333
# V(hat(Theta)) = V10(W1j)/N1 + V01(W0j)/N0
vartheta <- var10W/N1 + var01W/N0
vartheta
## [1] 0.001188636

Finally, the large-sample 95% confidence interval for the win probability:

wlower = theta - 1.96 * sqrt(vartheta)
wupper = theta + 1.96 * sqrt(vartheta)
c(wlower, wupper)
## [1] 0.5274259 0.6625741

The null win probability \theta_0 = 0.5 is not contained within the 95% confidence interval, suggesting there is a statistically significant difference between arms at the 5% level. The point estimate \hat{\theta}= 0.595 suggests that a random treated participant will have a better response than (win over) a random control participant 59.5% of the time (technically with adjustment for ties). The bounds suggest win probabilities as low as 52.7% and as high as 66.3% are compatible with the data. The true value of the win probability \theta = 0.64 is contained within the interval.

Win odds

We can transform win probability results to the win odds:

# Win Odds = Treatment Win Probability / Treatment Loss Probability
# Lambda = Theta/(1-Theta)
lambda = theta/(1-theta)
lambda
## [1] 1.469136
# Take the log for delta method
lnlambda = log(lambda)
lnlambda
## [1] 0.3846743

Then apply the \delta-method to \ln(\lambda) to obtain a variance estimator for confidence interval construction.

# Calculate variance according to delta method result
varlambda = vartheta / (theta*(1-theta))^2
varlambda
## [1] 0.0204694

Construct the 95% confidence interval first on the log scale,

lnlower = lnlambda - 1.96 * sqrt(varlambda)
lnupper = lnlambda + 1.96 * sqrt(varlambda)
c(lnlower, lnupper)
## [1] 0.1042546 0.6650941

then back-transform the confidence interval to the original scale,

c(exp(lnlower), exp(lnupper))
## [1] 1.109883 1.944673

The null win odds \lambda_0 = 1 is not contained within the 95% confidence interval, suggesting there is a statistically significant difference between arms at the 5% level. The point estimate \hat{\lambda}= 1.47 suggests that the treatment win probability is 47% greater (relative) than the control win probability. The bounds suggest win odds as low as 1.11 and as high as 1.94 are compatible with the data. The true value of the win odds \lambda = 1.78 is contained within the interval.

Net benefit

First of all, we can also get estimates of the net benefit using the win probability by recognizing that \Delta = 2 \theta - 1,

# Net Benefit = Treatment Win Probability - Control Win Probability
# Net Benefit = 2 * Treatment Win Probability - 1 
delta = 2 * theta - 1
delta
## [1] 0.19

and subsequently \widehat{\text{Var}(\hat{\Delta})} = 4 \widehat{\text{Var}}(\hat{\theta}) = 4 \left[ \widehat{\text{Var}}(\bar{W}_{1.}) + \widehat{\text{Var}}(\bar{W}_{0}) \right],

vardelta = 4 * vartheta
vardelta
## [1] 0.004754545

and construct a 95% confidence interval for \Delta,

dlower = delta - 1.96 * sqrt(vardelta)
dupper = delta + 1.96 * sqrt(vardelta)
c(dlower, dupper)
## [1] 0.0548517 0.3251483

Though you can just as easily apply the same transform to the 95% confidence interval bounds for \theta

2*c(wlower, wupper)-1
## [1] 0.0548517 0.3251483

But, lets also do the structural component approach the long way for good measure! We are all about good measure here.

First, lets construct the kernel function for the net benefit,

# Net Benefit = Pr(X1i > X0j) - Pr(X1i < X0j)
# kernel function for net benefit Delta
kDelta <- function(X1i, X0j){ ifelse(X1i > X0j, 1, 0) - ifelse(X1i < X0j, 1, 0) }
# Treatment Win
kDelta(X1i=1, X0j=0)
## [1] 1
# Tie
kDelta(X1i=1, X0j=1)
## [1] 0
# Tie 
kDelta(X1i=0, X0j=0)
## [1] 0
# Treatment Loss
kDelta(X1i=0, X0j=1)
## [1] -1

Second, lets use the kernel to score or compare all treatment and control participant pairs.

# Matrix to store scores (kernel values) for each participant pair
# row = treated participant
# column = control participant
kMatrix <- matrix(NA, nrow=N1, ncol=N0)

# Double loop to compute and store all kernel values
for (i in 1:N1){
  for (j in 1:N0){
    kMatrix[i, j] = kDelta(X1[i], X0[j])
  }
}

Third, construct the treated structural components as the row means and control structural components as the column means.

D1i <- 1/N0 * rowSums(kMatrix)
D0j <- 1/N1 * colSums(kMatrix)

The means of the structural components should be the same between-arms:

# Net Benefit = Mean(Treated Structural Components)
delta = mean(D1i)
delta
## [1] 0.19
# Net Benefit = Mean(Control Structural Components)
delta = mean(D0j)
delta
## [1] 0.19

and represent our point estimate of the net benefit! That is, \hat{\Delta} = 0.595. Recall true \theta = 0.28.

Third, the variance is easily estimated by taking the sample variances of D1i and D0j:

# First variance component, V10=Var[E[K(X1i, X0j) | X1i]]
var10D <- var(D1i)
var10D
## [1] 0.2521212
# Second variance component, V01=Var[E[K(X1i, X0j) | X0j]]
var01D <- var(D0j)
var01D
## [1] 0.2233333
# V(hat(Delta)) = V10(D1j)/N1 + V01(D0j)/N0
vardelta <- var10D/N1 + var01D/N0
vardelta
## [1] 0.004754545

Finally, the large-sample 95% confidence interval for the net benefit:

dlower = delta - 1.96 * sqrt(vardelta)
dupper = delta + 1.96 * sqrt(vardelta)
c(dlower, dupper)
## [1] 0.0548517 0.3251483

Same as before, far out!

The null net benefit \Delta_0 = 0.5 is not contained within the 95% confidence interval, suggesting there is a statistically significant difference between arms at the 5% level. The point estimate \hat{\Delta}= 0.19 suggests that the treatment win probability is 19% greater (absolute) than the control win probability. The bounds suggest net benefits as low as 5% and as high as 33% are compatible with the data. The true value of the net benefit \Delta = 0.28 is contained within the interval.

Confidence interval coverage

Just for fun, let’s repeat this process 1,000 times for specifically the win probability and sample sizes N_1=N_0=10, N_1=N_0=30, and N_1=N_0=100 to assess estimator performance in smaller and larger samples with respect to 95% confidence interval coverage.

N <- c(10, 30, 100)
B <- 1000
Results <- cbind('n'=NA, 'b'=NA, 'theta'=NA, 'wlower'=NA, 'wupper'=NA, 'cover'=NA)

# for each of 3 sample sizes
for (i in 1:3){
  n <- N[i]
  # for each of 1000 replicates
  for (b in 1:B){
    # generate binary responses
    X1 <- rbinom(n=n, size=1, prob=Pi1)
    X0 <- rbinom(n=n, size=1, prob=Pi0)

    # Double loop to compute and store all kernel values
    kMatrix <- matrix(NA, nrow=n, ncol=n)
    for (i in 1:n){
      for (j in 1:n){
        kMatrix[i, j] = kTheta(X1[i], X0[j])
      }
    }
    
    # Construct structural components
    W1i <- 1/n * rowSums(kMatrix)
    W0j <- 1/n * colSums(kMatrix)
    
    # Estimate win probability (using treatment structural components)
    theta = mean(W1i)
    # First variance component, V10=Var[E[K(X1i, X0j) | X1i]]
    var10W <- var(W1i)
    # Second variance component, V01=Var[E[K(X1i, X0j) | X0j]]
    var01W <- var(W0j)
    # V(hat(Theta)) = V10(W1j)/N1 + V01(W0j)/N0
    vartheta <- var10W/N1 + var01W/N0
    
    # Build confidence intervals and check coverage
    # wlower = lower bound of 95% confidence interval
    # wupper = upper bound of 95% confidence interval
    # cover  = 1 if truth Theta in confidence bounds
    wlower = theta - 1.96 * sqrt(vartheta)
    wupper = theta + 1.96 * sqrt(vartheta)
    clower = (wlower < Theta(Pi1, Pi0))
    cupper = (wupper > Theta(Pi1, Pi0))
     cover = clower*cupper
     
    # Store replicate result
    result <- c(n, b, theta, wlower, wupper, cover)
    Results <- rbind(Results, result)
  }
}

# Remove first row of NAs from Results
Results <- data.frame(Results[-1,])

Now lets look at the results by sample size!

# Simulation results when N1=N0=10
round(colMeans(Results[Results$n==10,]), 2)
##      n      b  theta wlower wupper  cover 
##  10.00 500.50   0.64   0.58   0.71   0.53
# Simulation results when N1=N0=30
round(colMeans(Results[Results$n==30,]), 2)
##      n      b  theta wlower wupper  cover 
##  30.00 500.50   0.64   0.57   0.70   0.75
# Simulation results when N1=N0=100
round(colMeans(Results[Results$n==100,]), 2)
##      n      b  theta wlower wupper  cover 
## 100.00 500.50   0.64   0.58   0.70   0.95

First, they confirm our estimator is unbiased. We see that coverage is conservative when there is N_1=N_0=10 participants in each arm, with only ~50% of 95% confidence containing the true value of \theta = 0.64. However, coverage increases to the expected 95% when N_1=N_0=100. Results for N_1=N_0=30 fall somewhere in between.

Summary

In summary, decomposing U-statistics into their arm-specific structural components improves the practicality of variance estimation significantly. All one has to do is take the within-subject mean of the kernel to obtain structural components for all participants. The within-arm sample variance of the structural components can then be used to approximate the variance of the U-statistic. These steps are much easier to accomplish than parsing and implementing complex joint probabilities!!! You just need to know the correct kernel. The methods described here are extremely flexible, and can also be applied to U-statistics outside the realm of win measures.

An important thing to remember is that all these results depend on LARGE SAMPLES. That is, the form of the variance estimator is based on asymptotic zeroing of higher order terms, and validity of the sample variance estimators depends on the structural components being asymptotically uncorrelated. Because construction of the structural components shares the same sample observations, they have non-negligible correlation in small samples, which will result in biased variance estimates. We saw this in our simulation study. However, it has been proven that the variance will always be conservatively biased, and we emphasize that this bias is mitigated as sample sizes increase.

The fact that all these measures are connected to each other and other popular treatment effects, and yet all have their own pros and cons, is cool to me. They definitely do work well together. It’s fun how estimators for the net benefit and win odds can be recovered from the win probability. The variance of the win ratio is a bit trickier to estimate because it excludes ties, so we can’t use this same trick. Its ratio form means it also does not have a nice linear kernel like the win probability or net benefit. Neither does the win odds, but its relationship with the win probability provides estimators. The current method uses two U-statistics, one for the numerator and one for the denominator, and then take their ratio on an appropriate scale. In this case, multivariate, or at least bivariate, U-statistic theory is needed. Makes sense to me… I wonder if this could be used to simplify calculation of covariances.
I hope this makes your life just a little bit easier, and thanks for reading my blog!

Published by

Emma Davies Smith

Emma Davies Smith holds a Ph.D. in Biostatistics from the University of Western Ontario and is currently a Research Associate at the Center for Biostatistics in AIDS Research (CBAR) at the Harvard T. H. Chan School of Public Health. Her current research interests include randomized controlled trials, correlated data, ordinal outcomes, and infectious disease.

One thought on “Practical inference for win measures via U-statistic decomposition”

Leave a Reply

Your email address will not be published. Required fields are marked *