My previous two blog posts revolved around derivation of the limiting distribution of U-statistics for one sample and multiple independent samples.
For derivation of the limiting distribution of a U-statistic for a single sample, check out Getting to know U: the asymptotic distribution of a single U-statistic.
For derivation of the limiting distribution of a U-statistic for multiple independent samples, check out Much Two U About Nothing: Extension of U-statistics to multiple independent samples.
The notation within these derivations can get quite complicated, and it may not be obvious how to actually derive the components of the limiting distribution.

In this blog post, I provide two examples each of common one-sample U-statistics (variance, Kendall's Tau) and two-sample U-statistics (difference of two means, Wilcoxon Mann-Whitney rank-sum statistic) and derive their limiting distributions using our previously developed theory.
Asymptotic distribution of U-statistics
One sample
For a single sample, $X_1, \ldots, X_n \overset{iid}{\sim} F$, the U-statistic is given by

\[ U_n = {n \choose a}^{-1} \sum_{1 \leq i_1 < ... < i_a \leq n} \phi(X_{i_1}, ..., X_{i_a}) \]

where $\phi$ is a symmetric kernel of degree $a$.
For a review of what it means for $\phi$ to be symmetric, check out U-, V-, and Dupree Statistics.
In the examples covered by this blog post, $a = 2$, so we can re-write $U_n$ as,

\[ U_n = {n \choose 2}^{-1} \mathop{\sum \sum}_{1 \leq i < j \leq n} \phi(X_i, X_j). \]

Alternatively, this is equivalent to,

\[ U_n = \frac{1}{n(n-1)} \sum_{i \neq j} \phi(X_i, X_j). \]
The variance of $U_n$ is given by,

\[ \text{Var}_F\left(U_n\right) = {n \choose a}^{-1} \sum_{c=1}^{a} {a \choose c}{n-a \choose a-c}\, \sigma_c^2 \]

where

\[ \sigma_c^2 = \text{Var}_F~\phi_c(X_1, \ldots, X_c) = \text{Var}_F~\mathbb{E}_F\left[\phi(X_1, \ldots, X_a) \,\middle|\, X_1, \ldots, X_c\right] \]

or equivalently,

\[ \sigma_c^2 = \text{Cov}_F\left[\phi(X_1, \ldots, X_a),~ \phi(X_1, \ldots, X_c, X_{a+1}, \ldots, X_{2a-c})\right]. \]

Note that when $c = a$, $\sigma_a^2 = \text{Var}_F~\phi(X_1, \ldots, X_a)$.
For $a = 2$, these expressions reduce to

\[ \text{Var}_F\left(U_n\right) = \frac{2}{n(n-1)}\left[2(n-2)\,\sigma_1^2 + \sigma_2^2\right] \]

where

\[ \sigma_1^2 = \text{Var}_F~\mathbb{E}_F\left[\phi(X_1, X_2) \,\middle|\, X_1\right] = \text{Cov}_F\left[\phi(X_1, X_2),~ \phi(X_1, X_3)\right] \]

and

\[ \sigma_2^2 = \text{Var}_F~\phi(X_1, X_2). \]
The limiting distribution of $U_n$ as $n \rightarrow \infty$ is then,

\[ \sqrt{n}\left(U_n - \theta\right) \overset{d}{\rightarrow} N\left(0,~ a^2 \sigma_1^2\right) \]

where $\theta = \mathbb{E}_F\left[\phi(X_1, \ldots, X_a)\right]$ is the expectation functional estimated by $U_n$.
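To make the notation concrete, here is a minimal R sketch (the helper name `u_stat2` is my own, not something from the earlier posts) that computes a degree-two U-statistic by brute force, averaging a symmetric kernel over all $n \choose 2$ pairs.

```r
# Compute a one-sample U-statistic with a symmetric kernel of degree a = 2
# by averaging phi over all n-choose-2 pairs (illustrative helper only).
u_stat2 <- function(x, phi) {
  pairs <- combn(length(x), 2)   # all pairs 1 <= i < j <= n
  mean(apply(pairs, 2, function(ij) phi(x[ij[1]], x[ij[2]])))
}

# Quick example: the kernel phi(x1, x2) = max(x1, x2) gives an unbiased
# estimator of E[max(X1, X2)].
set.seed(1)
u_stat2(rnorm(25), max)
```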
For derivation of the limiting distribution of a U-statistic for a single sample, check out Getting to know U: the asymptotic distribution of a single U-statistic.
Two independent samples
For two independent samples denoted $X_1, \ldots, X_m \overset{iid}{\sim} F$ and $Y_1, \ldots, Y_n \overset{iid}{\sim} G$, the two-sample U-statistic is given by

\[ U = \frac{1}{{m \choose a}{n \choose b}} \mathop{\sum \sum} \limits_{\substack{1 \leq i_1 < ... < i_{a} \leq m \\ 1 \leq j_1 < ... < j_b \leq n}} \phi(X_{i_1}, ..., X_{i_a}; Y_{j_1}, ..., Y_{j_b}) \]
where $\phi$ is a kernel that is independently symmetric within the two blocks $(x_{i_1}, \ldots, x_{i_a})$ and $(y_{j_1}, \ldots, y_{j_b})$.
In the examples covered by this blog post, $a = b = 1$, reducing the U-statistic to,

\[ U = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \phi(X_i; Y_j). \]
The limiting variance of $U$ is given by,

\[ \text{Var}(U) \approx \frac{a^2}{m}\, \sigma_{1,0}^2 + \frac{b^2}{n}\, \sigma_{0,1}^2 \]

where

\[ \sigma_{1,0}^2 = \text{Var}_F~\phi_{1,0}(X_1) = \text{Var}_F~\mathbb{E}\left[\phi(X_1, \ldots, X_a; Y_1, \ldots, Y_b) \,\middle|\, X_1\right] \]

and

\[ \sigma_{0,1}^2 = \text{Var}_G~\phi_{0,1}(Y_1) = \text{Var}_G~\mathbb{E}\left[\phi(X_1, \ldots, X_a; Y_1, \ldots, Y_b) \,\middle|\, Y_1\right]. \]

Equivalently,

\[ \sigma_{1,0}^2 = \text{Cov}\left[\phi(X_1, \ldots, X_a; Y_1, \ldots, Y_b),~ \phi(X_1, X_{a+1}, \ldots, X_{2a-1}; Y_{b+1}, \ldots, Y_{2b})\right] \]

and

\[ \sigma_{0,1}^2 = \text{Cov}\left[\phi(X_1, \ldots, X_a; Y_1, \ldots, Y_b),~ \phi(X_{a+1}, \ldots, X_{2a}; Y_1, Y_{b+1}, \ldots, Y_{2b-1})\right]. \]
For $a = b = 1$, these expressions reduce to

\[ \text{Var}(U) \approx \frac{\sigma_{1,0}^2}{m} + \frac{\sigma_{0,1}^2}{n} \]

where

\[ \sigma_{1,0}^2 = \text{Var}_F~\mathbb{E}\left[\phi(X_1; Y_1) \,\middle|\, X_1\right] = \text{Cov}\left[\phi(X_1; Y_1),~ \phi(X_1; Y_2)\right] \]

and

\[ \sigma_{0,1}^2 = \text{Var}_G~\mathbb{E}\left[\phi(X_1; Y_1) \,\middle|\, Y_1\right] = \text{Cov}\left[\phi(X_1; Y_1),~ \phi(X_2; Y_1)\right]. \]
The limiting distribution of $U$ as $m \rightarrow \infty$ and $n \rightarrow \infty$ (with $m/(m+n)$ converging to a constant in $(0,1)$) is then,

\[ \frac{U - \theta}{\sqrt{\dfrac{a^2 \sigma_{1,0}^2}{m} + \dfrac{b^2 \sigma_{0,1}^2}{n}}} \overset{d}{\rightarrow} N(0, 1) \]

where $\theta = \mathbb{E}\left[\phi(X_1, \ldots, X_a; Y_1, \ldots, Y_b)\right]$.
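Analogously, when $a = b = 1$ the double sum is just an average of the kernel over all $mn$ pairs $(X_i, Y_j)$. A minimal R sketch (the helper name is again my own, for illustration):

```r
# Compute a two-sample U-statistic with a = b = 1 by averaging a (vectorized)
# kernel phi(x; y) over all m*n pairs.
u_stat_xy <- function(x, y, phi) {
  mean(outer(x, y, phi))   # outer() evaluates phi at every (X_i, Y_j) pair
}

# Quick example: the kernel phi(x; y) = x - y returns the difference in means.
set.seed(1)
x <- rnorm(15); y <- rnorm(20)
u_stat_xy(x, y, function(x, y) x - y)
mean(x) - mean(y)
```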
For derivation of the limiting distribution of a U-statistic for multiple independent samples, check out Much Two U About Nothing: Extension of U-statistics to multiple independent samples.
Examples of one-sample U-statistics
Variance
Suppose we have an independent and identically distributed random sample of size $n$, $X_1, \ldots, X_n \overset{iid}{\sim} F$.
We wish to estimate the variance, which can be expressed as an expectation functional,

\[ \sigma^2 = \text{Var}_F(X) = \mathbb{E}_F\left[X^2\right] - \left(\mathbb{E}_F\left[X\right]\right)^2. \]
In order to estimate $\sigma^2$ using a U-statistic, we need to identify a kernel function that is unbiased for $\sigma^2$ and symmetric in its arguments. We start by considering,

\[ \phi^{*}(x_1, x_2) = x_1^2 - x_1 x_2. \]

$\phi^{*}(x_1, x_2)$ is unbiased for $\sigma^2$ since

\[ \mathbb{E}_F\left[\phi^{*}(X_1, X_2)\right] = \mathbb{E}_F\left[X_1^2\right] - \mathbb{E}_F\left[X_1\right]\mathbb{E}_F\left[X_2\right] = \mathbb{E}_F\left[X^2\right] - \left(\mathbb{E}_F\left[X\right]\right)^2 = \sigma^2 \]

but is not symmetric since

\[ \phi^{*}(x_1, x_2) = x_1^2 - x_1 x_2 \neq x_2^2 - x_2 x_1 = \phi^{*}(x_2, x_1). \]
Thus, the corresponding symmetric kernel can be constructed as

\[ \phi(x_1, \ldots, x_a) = \frac{1}{a!} \sum_{\pi \in \Pi} \phi^{*}\left(x_{\pi(1)}, \ldots, x_{\pi(a)}\right). \]

Here, the number of arguments $a = 2$ and $\Pi$ is the set of all permutations of the $a$ arguments,

\[ \Pi = \{(1, 2),~ (2, 1)\}. \]

Then, the symmetric kernel which is unbiased for the variance is,

\begin{align*} \phi(x_1, x_2) &= \frac{1}{2}\left[\phi^{*}(x_1, x_2) + \phi^{*}(x_2, x_1)\right] \\ &= \frac{1}{2}\left[\left(x_1^2 - x_1 x_2\right) + \left(x_2^2 - x_2 x_1\right)\right] \\ &= \frac{x_1^2 - 2 x_1 x_2 + x_2^2}{2} \\ &= \frac{(x_1 - x_2)^2}{2}. \end{align*}
An unbiased estimator of $\sigma^2$ is then the U-statistic,

\[ U_n = {n \choose 2}^{-1} \mathop{\sum \sum}_{1 \leq i < j \leq n} \frac{(X_i - X_j)^2}{2} \]

or equivalently,

\[ U_n = \frac{1}{n(n-1)} \sum_{i \neq j} \frac{(X_i - X_j)^2}{2}. \]
Focusing on the second form of the sum and recognizing that

\[ \sum_{i \neq j} X_i^2 = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} X_j^{2} - X_i^2 \right) \]

and,

\[ \sum_{i \neq j}^{n} X_i = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} X_j - X_i \right) \]

we have,

\begin{align*} \sum_{i \neq j} (X_{i} - X_{j})^2 &= \sum_{i \neq j} \left( X_{i}^2 - 2 X_{i} X_{j} + X_{j}^2 \right) \\ &= 2 \Big( \sum_{i \neq j} X_{i}^2 - X_{i} X_{j} \Big) \\ &= 2 \left(\sum_{i=1}^{n} \left[ \sum_{j=1}^{n} X_j^{2} - X_i^2 \right] - \sum_{i=1}^{n} X_i \left[\sum_{j=1}^{n} X_j - X_i \right] \right) \\ &= 2 \left( n \sum_{j=1}^{n} X_j^{2} - \sum_{i=1}^{n} X_{i}^2 - n \bar{X} \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} X_{i}^2 \right) \\ &= 2 n \sum_{i=1}^{n} X_{i}^2 - 2 n \bar{X} \sum_{i=1}^{n} X_i \\ &= 2 n \sum_{i=1}^{n} X_{i}^2 - 2 n^2 \bar{X}^2. \end{align*}
Plugging this simplified expression back into our formula for $U_n$, we obtain

\begin{align*} U_n &= \frac{1}{n(n-1)} \left[n \sum_{i=1}^{n} X_{i}^2 - n^2 \bar{X}^2 \right] \\ &= \frac{1}{n-1} \left[ \sum_{i=1}^{n} X_{i}^2 - \frac{1}{n} \left( \sum_{i=1}^{n} X_i \right)^2 \right] \\ &= s_n^{2} \end{align*}
as desired.
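We can also verify this identity numerically; the brute-force pairwise average below matches R's `var()` up to floating-point error (a quick check, not part of the derivation).

```r
# Numerical check: the U-statistic with kernel (x_i - x_j)^2 / 2 equals the
# unbiased sample variance computed by var().
set.seed(1)
x <- rnorm(30)
n <- length(x)

# average of (X_i - X_j)^2 / 2 over all ordered pairs i != j
# (the i = j terms are zero, so summing the full outer() matrix is harmless)
U_n <- sum(outer(x, x, function(a, b) (a - b)^2 / 2)) / (n * (n - 1))

U_n
var(x)
```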
It is well-known that $s_n^2$ is an unbiased estimator of the population variance such that,

\[ \mathbb{E}_F\left[s_n^2\right] = \sigma^2, \]

but what about the variance of $s_n^2$? For a sample size of $n$ and $a = 2$,

\[ \text{Var}_F\left(U_n\right) = \frac{2}{n(n-1)}\left[2(n-2)\,\sigma_1^2 + \sigma_2^2\right]. \]
To derive the first variance component $\sigma_1^2$, we start by taking the expectation of our kernel conditional on $X_1$,

\begin{align*} \phi_1(X_1) &= \mathbb{E}_F \left[ \frac{(X_1 - X_2)^2}{2} \,\middle|\, X_1 \right] \\ &= \mathbb{E}_F \left[ \frac{(X_2 - x_1)^2}{2} \,\middle|\, X_1 \right] \\ &= \mathbb{E}_F \left[ \frac{(X_2 - \mu + \mu - x_1)^2}{2} \,\middle|\, X_1 \right] \\ &= \mathbb{E}_F \left[ \frac{(X_2 - \mu)^2 + 2 (X_2 - \mu)(\mu - x_1) + (\mu - x_1)^2}{2}\right] \\ &= \frac{\sigma^2}{2} + \frac{(x_1 - \mu)^2}{2}. \end{align*}
Now, our first variance component $\sigma_1^2$ is just equal to the variance of $\phi_1(X_1)$ and since $\sigma^2/2$ is just a constant, we have

\begin{align*} \sigma_{1}^{2} &= \text{Var}_F~\phi_1(X_1) \\ &= \frac{1}{4}\text{Var}_F \left[ (X_1 - \mu)^2\right] \\ &= \frac{1}{4} \left( \mathbb{E}_F \left[ (X_1 - \mu)^4 \right] - \left( \mathbb{E}_{F} \left[ (X_1 - \mu)^2 \right] \right)^2 \right) \\ &= \frac{\mu_4 - \sigma^4}{4} \end{align*}

where $\mu_4 = \mathbb{E}_F\left[(X - \mu)^4\right]$ is the fourth central moment.
Next, recognizing that $\text{Var}_F(Z) = \mathbb{E}_F\left[Z^2\right] - \left(\mathbb{E}_F\left[Z\right]\right)^2$ and recycling our “add zero” trick yields an expression for our second variance component $\sigma_2^2$,

\begin{align*} \sigma_{2}^{2} &= \text{Var}_{F}~ \phi(X_1, X_2) \\ &= \text{Var}_{F} \left[ \frac{(X_1 - X_2)^2}{2} \right] \\ &= \mathbb{E}_F \left[ \frac{(X_1 - \mu + \mu - X_2)^4}{4} \right] - \left( \mathbb{E}_F \left[ \frac{(X_1 - X_2)^2}{2} \right] \right)^2. \end{align*}
We know that the kernel is an unbiased estimator of $\sigma^2$ by definition so that,

\[ \mathbb{E}_F \left[ \frac{(X_1 - X_2)^2}{2} \right] = \sigma^2 \quad \Rightarrow \quad \left( \mathbb{E}_F \left[ \frac{(X_1 - X_2)^2}{2} \right] \right)^2 = \sigma^4. \]
To simplify the remaining expectation, recall that,

\[ (a + b)^4 = a^4 + 4 a^3 b + 6 a^2 b^2 + 4 a b^3 + b^4 \]

and let $a = (X_1 - \mu)$ and $b = (\mu - X_2)$. Then,

\begin{align*} \mathbb{E}_F \left[ \frac{(X_1 - \mu + \mu - X_2)^4}{4} \right] &= \frac{1}{4} \mathbb{E}_F \Big[ (X_1 - \mu)^4 + 4 (X_1 - \mu)^3 (\mu - X_2) + 6 (X_1 - \mu)^2 (\mu - X_2)^2 \\ &\qquad\qquad + 4 (X_1 - \mu)(\mu - X_2)^3 + (\mu - X_2)^4 \Big] \\ &= \frac{1}{4} \left( \mu_4 + 0 + 6 \sigma^2 \sigma^2 + 0 + \mu_4 \right) \\ &= \frac{\mu_4 + 3 \sigma^4}{2} \end{align*}

where the cross-product terms vanish because $X_1$ and $X_2$ are independent and $\mathbb{E}_F\left[X_i - \mu\right] = 0$.
Substituting this back into our expression for $\sigma_2^2$, we have

\[ \sigma_2^2 = \frac{\mu_4 + 3 \sigma^4}{2} - \sigma^4 = \frac{\mu_4 + \sigma^4}{2}. \]
Finally, plugging our two variance components into our expression for $\text{Var}_F\left(U_n\right)$,

\begin{align*} \text{Var}_F\left(U_n\right) &= \frac{2}{n(n-1)} \left[ 2(n-2) \left( \frac{\mu_4 - \sigma^4}{4} \right) + \frac{\mu_4 + \sigma^4}{2} \right] \\ &= \frac{1}{n(n-1)} \left[ (n-2)\left(\mu_4 - \sigma^4\right) + \mu_4 + \sigma^4 \right] \\ &= \frac{1}{n(n-1)} \left[ (n-1)\,\mu_4 - (n-3)\,\sigma^4 \right] \\ &= \frac{\mu_4}{n} - \frac{(n-3)\,\sigma^4}{n(n-1)}. \end{align*}
Then, our asymptotic result for $U_n$ tells us,

\[ \sqrt{n}\left(s_n^2 - \sigma^2\right) \overset{d}{\rightarrow} N\left(0,~ 4\sigma_1^2\right) = N\left(0,~ \mu_4 - \sigma^4\right). \]
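As a sanity check, here is a small simulation in R using Exponential(1) data, for which $\sigma^2 = 1$ and $\mu_4 = 9$ (standard properties of the exponential distribution, stated here without derivation).

```r
# Simulation check of Var(s_n^2): compare the empirical variance of s_n^2 to
# the exact U-statistic variance and the first-order asymptotic variance.
set.seed(1)
n    <- 50
reps <- 1e5
s2   <- replicate(reps, var(rexp(n)))   # s_n^2 for Exponential(1) samples

mu4    <- 9   # fourth central moment of Exponential(1)
sigma4 <- 1   # sigma^4 for Exponential(1)

var(s2)                                     # empirical variance of s_n^2
mu4 / n - (n - 3) * sigma4 / (n * (n - 1))  # exact Var(s_n^2)
(mu4 - sigma4) / n                          # asymptotic variance (mu_4 - sigma^4) / n
```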
Kendall’s Tau
Consider $n$ bivariate, continuous observations of the form

\[ (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n) \overset{iid}{\sim} F_{XY}. \]
A pair of observations, $(X_i, Y_i)$ and $(X_j, Y_j)$, is considered “concordant” if

\[ X_i < X_j \text{ and } Y_i < Y_j, \qquad \text{or} \qquad X_i > X_j \text{ and } Y_i > Y_j, \]

and “discordant” otherwise.
The probability that two observations are concordant is then,

\[ c = P(X_i < X_j,~ Y_i < Y_j) + P(X_i > X_j,~ Y_i > Y_j) \]

and the probability that two observations are discordant is then,

\[ 1 - c = P(X_i < X_j,~ Y_i > Y_j) + P(X_i > X_j,~ Y_i < Y_j). \]
Kendall’s Tau, denoted $\tau$, is the probability that a pair is concordant minus the probability that it is discordant, or the difference between $c$ and $1 - c$, such that,

\begin{align*} \tau &= c - (1-c) \\ &= 2c - 1 \\ &= 2 \left[ P(X_i < X_j,~ Y_i < Y_j) + P(X_i > X_j,~ Y_i > Y_j) \right] - 1. \end{align*}
$\tau$ ranges between $-1$ and $1$ and is used as a measure of the strength of monotone increasing/decreasing relationships, with $\tau = 0$ suggesting that $X$ and $Y$ are independent and $\tau = 1$ suggesting a perfect monotonic increasing relationship between $X$ and $Y$.
Based on our definition of $\tau$, the form of the symmetric kernel is immediately obvious,

\[ \phi\left((x_i, y_i), (x_j, y_j)\right) = 2\, \mathbb{I}(x_i < x_j)\, \mathbb{I}(y_i < y_j) + 2\, \mathbb{I}(x_i > x_j)\, \mathbb{I}(y_i > y_j) - 1 \]

where $\mathbb{I}(\cdot)$ is an indicator function taking the value $1$ when its argument is true and $0$ otherwise.
Note that

\[ \mathbb{I}(x_i > x_j) = 1 - \mathbb{I}(x_i < x_j) \]

and

\[ \mathbb{I}(y_i > y_j) = 1 - \mathbb{I}(y_i < y_j) \]

for continuous data with no ties, so that our kernel may be re-expressed as,

\begin{align*} \phi((x_i, y_i), (x_j, y_j)) &= 2 \mathbb{I}(x_i < x_j) \mathbb{I}(y_i < y_j) + 2[1-\mathbb{I}(x_i < x_j)][1- \mathbb{I}(y_i < y_j)] - 1 \\ &= 4 \mathbb{I}(x_i < x_j) \mathbb{I}(y_i < y_j) - 2 \mathbb{I}(x_i < x_j) - 2\mathbb{I}(y_i < y_j) + 1 \\ &= [2\mathbb{I}(x_i < x_j) - 1][2\mathbb{I}(y_i < y_j)-1] \\ &= [1-2\mathbb{I}(x_j < x_i)][1-2\mathbb{I}(y_j < y_i)]. \end{align*}
This will come in handy later.
Now that we have identified our kernel function, we can construct our U-statistic,

\[ U_n = {n \choose 2}^{-1} \mathop{\sum \sum}_{1 \leq i < j \leq n} \phi\left((X_i, Y_i), (X_j, Y_j)\right). \]
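For untied (continuous) data, this U-statistic is exactly Kendall's $\hat{\tau}$ as returned by R's `cor()`; a quick numerical check:

```r
# Check: the pairwise kernel average equals cor(..., method = "kendall")
# when there are no ties.
set.seed(1)
n <- 25
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

phi <- function(i, j) {
  2 * ((x[i] < x[j]) & (y[i] < y[j])) +
  2 * ((x[i] > x[j]) & (y[i] > y[j])) - 1   # +1 if concordant, -1 if discordant
}

pairs <- combn(n, 2)
U_n   <- mean(apply(pairs, 2, function(ij) phi(ij[1], ij[2])))

U_n
cor(x, y, method = "kendall")
```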
It is obvious that $\mathbb{E}\left[U_n\right] = \tau$. Once again, $a = 2$ and the variance of $U_n$ is given by,

\[ \text{Var}\left(U_n\right) = \frac{2}{n(n-1)}\left[2(n-2)\,\sigma_1^2 + \sigma_2^2\right]. \]
For the purposes of demonstration and to simplify derivation of the variance components, suppose we are operating under the null hypothesis that $X$ and $Y$ are independent, or equivalently

\[ H_0: F_{XY}(x, y) = F_X(x)\, F_Y(y). \]
To find our first variance component $\sigma_1^2$, we must find the expectation of our kernel conditional on $(X_1, Y_1)$,

\begin{align*} \phi_1((X_1, Y_1)) &= \mathbb{E} \left[ \phi((X_1, Y_1), (X_2, Y_2)) \,\middle|\, (X_1, Y_1) \right] \\ &= \mathbb{E} \Big[ [1-2\mathbb{I}(X_2 < x_1)][1-2\mathbb{I}(Y_2 < y_1)] \Big]. \end{align*}

If $X$ and $Y$ are independent, then $\mathbb{I}(X_2 < x_1)$ and $\mathbb{I}(Y_2 < y_1)$ are independent and,

\begin{align*} \phi_1((X_1, Y_1)) &= \mathbb{E} \big[ 1-2\mathbb{I}(X_2 < x_1) \big]~ \mathbb{E} \big[ 1-2\mathbb{I}(Y_2 < y_1) \big] \\ &= \big[ 1 - 2 F_X(x_1) \big] \big[ 1 - 2 F_Y(y_1) \big]. \end{align*}
Then, the first variance component is given by,

\[ \sigma_1^2 = \text{Var}\big[\phi_1((X_1, Y_1))\big] = \text{Var}\Big[\big(1 - 2 F_X(X_1)\big)\big(1 - 2 F_Y(Y_1)\big)\Big] \]

where, under $H_0$, $X_1$ and $Y_1$ are independent random variables distributed according to $F_X$ and $F_Y$, respectively.
If $X \sim F_X$ then $F_X(X) \sim \text{Uniform}(0, 1)$ by the probability integral transform. Thus, if we let $A = 1 - 2 F_X(X_1)$ and $B = 1 - 2 F_Y(Y_1)$, then $A$ and $B$ are both distributed according to $\text{Uniform}(-1, 1)$.
Since $A$ and $B$ are independent, applying the identity $\text{Var}(AB) = \text{Var}(A)\text{Var}(B) + \text{Var}(A)\left[\mathbb{E}(B)\right]^2 + \text{Var}(B)\left[\mathbb{E}(A)\right]^2$ for independent random variables yields,

\[ \sigma_1^2 = \text{Var}(A)\,\text{Var}(B) + \text{Var}(A)\left[\mathbb{E}(B)\right]^2 + \text{Var}(B)\left[\mathbb{E}(A)\right]^2. \]
Recall that if $Z \sim \text{Uniform}(a, b)$,

\[ \mathbb{E}(Z) = \frac{a + b}{2} \qquad \text{and} \qquad \text{Var}(Z) = \frac{(b - a)^2}{12}. \]

For $a = -1$ and $b = 1$, we have

\[ \mathbb{E}(A) = 0 \]

and

\[ \text{Var}(A) = \frac{\left(1 - (-1)\right)^2}{12} = \frac{1}{3}. \]

The same is true for $B$.
Plugging our results back into our equation for $\sigma_1^2$ yields,

\[ \sigma_1^2 = \frac{1}{3} \cdot \frac{1}{3} + \frac{1}{3}\,(0)^2 + \frac{1}{3}\,(0)^2 = \frac{1}{9}. \]
Next, $\sigma_2^2 = \text{Var}~\phi((X_1, Y_1), (X_2, Y_2))$ and,

\[ \text{Var}~\phi((X_1, Y_1), (X_2, Y_2)) = \mathbb{E}\left[\phi((X_1, Y_1), (X_2, Y_2))^2\right] - \left(\mathbb{E}\left[\phi((X_1, Y_1), (X_2, Y_2))\right]\right)^2. \]

By definition, $\mathbb{E}\left[\phi((X_1, Y_1), (X_2, Y_2))\right] = \tau$ so that,

\[ \sigma_2^2 = \mathbb{E}\left[\phi((X_1, Y_1), (X_2, Y_2))^2\right] - \tau^2. \]
Note that since $X_1$ and $X_2$ are identically distributed and continuous, either $X_1 < X_2$ or $X_2 < X_1$, so that

\[ \mathbb{I}(X_1 < X_2) \sim \text{Bernoulli}\left(\frac{1}{2}\right). \]
Then we can use the properties of the Bernoulli distribution to derive the properties of $2\,\mathbb{I}(X_1 < X_2) - 1$ we need. That is,

\[ \mathbb{E}\left[2\,\mathbb{I}(X_1 < X_2) - 1\right] = 2\left(\frac{1}{2}\right) - 1 = 0, \]

\[ \text{Var}\left[2\,\mathbb{I}(X_1 < X_2) - 1\right] = 4~\text{Var}\left[\mathbb{I}(X_1 < X_2)\right] = 4\left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = 1, \]

and

\[ \mathbb{E}\left[\left(2\,\mathbb{I}(X_1 < X_2) - 1\right)^2\right] = \text{Var}\left[2\,\mathbb{I}(X_1 < X_2) - 1\right] + \left(\mathbb{E}\left[2\,\mathbb{I}(X_1 < X_2) - 1\right]\right)^2 = 1. \]
Finally, recalling from our re-expression of the kernel that $\phi((X_1, Y_1), (X_2, Y_2)) = [2\,\mathbb{I}(X_1 < X_2) - 1][2\,\mathbb{I}(Y_1 < Y_2) - 1]$, we have

\[ \mathbb{E}\left[\phi((X_1, Y_1), (X_2, Y_2))^2\right] = \mathbb{E}\left[\left(2\,\mathbb{I}(X_1 < X_2) - 1\right)^2\right] \mathbb{E}\left[\left(2\,\mathbb{I}(Y_1 < Y_2) - 1\right)^2\right]. \]

The same arguments hold for $2\,\mathbb{I}(Y_1 < Y_2) - 1$ and we obtain,

\[ \sigma_2^2 = (1)(1) - \tau^2 = 1 - \tau^2. \]

However, since $\tau = 0$ under the null hypothesis, $\sigma_2^2 = 1$.
Now that we have determined the value of $\sigma_1^2$ and $\sigma_2^2$ under the null hypothesis that $X$ and $Y$ are independent, we can plug these components into our formula for $\text{Var}\left(U_n\right)$, giving us

\begin{align*} \text{Var}\left(U_n\right) &= \frac{2}{n(n-1)}\left[2(n-2)\left(\frac{1}{9}\right) + 1\right] \\ &= \frac{2(2n+5)}{9n(n-1)}. \end{align*}
Our asymptotic result for $U_n$ tells us that, under the null hypothesis of independence,

\[ \sqrt{n}\left(U_n - \tau\right) = \sqrt{n}\,U_n \overset{d}{\rightarrow} N\left(0,~ \frac{4}{9}\right). \]
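A small simulation under independence illustrates how closely the exact and asymptotic null variances describe $\hat{\tau}$ in moderate samples (illustration only; the choice of $n$ and the normal marginals are arbitrary).

```r
# Simulation check: null variance of Kendall's tau-hat under independence.
set.seed(1)
n    <- 30
reps <- 1e4
tau_hat <- replicate(reps, cor(rnorm(n), rnorm(n), method = "kendall"))

var(tau_hat)                           # empirical variance
2 * (2 * n + 5) / (9 * n * (n - 1))    # exact null variance of U_n
4 / (9 * n)                            # asymptotic variance 4 * sigma_1^2 / n
```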
Examples of two-sample U-statistics
Mean comparison
Suppose we have two independent random samples of size $m$ and size $n$,

\[ X_1, \ldots, X_m \overset{iid}{\sim} F \]

and

\[ Y_1, \ldots, Y_n \overset{iid}{\sim} G. \]
We wish to compare the means of the two groups. The obvious choice for our kernel is,

\[ \phi(x; y) = x - y \]

so that $\mathbb{E}\left[\phi(X_1; Y_1)\right] = \mu_x - \mu_y$ and our corresponding U-statistic is,

\[ U = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left(X_i - Y_j\right) = \bar{X} - \bar{Y}. \]
Based on our previous derivation of the distribution of two-sample U-statistics, we have

\[ \text{Var}(U) \approx \frac{\sigma_{1,0}^2}{m} + \frac{\sigma_{0,1}^2}{n}. \]
For the first variance component, we need to take the expectation of $\phi(X_1; Y_1)$ conditional on a single $X_1 = x_1$ such that,

\[ \phi_{1,0}(x_1) = \mathbb{E}_G\left[\phi(x_1; Y_1)\right] = x_1 - \mu_y. \]

Similarly, for the second variance component, we need to condition on a single $Y_1 = y_1$ such that,

\[ \phi_{0,1}(y_1) = \mathbb{E}_F\left[\phi(X_1; y_1)\right] = \mu_x - y_1. \]
Since $\mu_x$ and $\mu_y$ are just constants, it is easy to see that,

\[ \sigma_{1,0}^2 = \text{Var}_F\left[\phi_{1,0}(X_1)\right] = \text{Var}_F\left[X_1 - \mu_y\right] = \sigma_x^2 \]

and,

\[ \sigma_{0,1}^2 = \text{Var}_G\left[\phi_{0,1}(Y_1)\right] = \text{Var}_G\left[\mu_x - Y_1\right] = \sigma_y^2. \]
Finally, plugging these variance components into our formula for $\text{Var}(U)$, we obtain the variance we would expect for a comparison of two means,

\[ \text{Var}(U) = \text{Var}\left(\bar{X} - \bar{Y}\right) = \frac{\sigma_x^2}{m} + \frac{\sigma_y^2}{n}. \]
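This is just the familiar variance of a difference in sample means, and a short simulation confirms it (the sample sizes and normal distributions below are arbitrary choices for illustration).

```r
# Simulation check: Var(U) for the kernel phi(x; y) = x - y, i.e. Var(Xbar - Ybar).
set.seed(1)
m <- 40
n <- 60
U <- replicate(1e5, mean(rnorm(m, sd = 2)) - mean(rnorm(n, sd = 3)))

var(U)
2^2 / m + 3^2 / n   # sigma_x^2 / m + sigma_y^2 / n
```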
Wilcoxon Mann-Whitney rank-sum test
Suppose we have two independent random samples of size $m$ and size $n$,

\[ X_1, \ldots, X_m \overset{iid}{\sim} F \]

and

\[ Y_1, \ldots, Y_n \overset{iid}{\sim} G. \]
We assume that $F$ and $G$ are continuous so that no tied values are possible. Let $Q_1, \ldots, Q_m$ represent the full-sample ranks of the $X_i$ and $R_1, \ldots, R_n$ represent the full-sample ranks of the $Y_j$.
Then, the Wilcoxon Mann-Whitney (WMW) rank-sum statistic based on the second sample is,

\[ W = \sum_{j=1}^{n} R_j - \frac{n(n+1)}{2}, \]

which can be shown to be equivalent to the number of pairs $(X_i, Y_j)$ for which $X_i < Y_j$. That is, we can re-express the WMW statistic as,

\[ W = \sum_{i=1}^{m} \sum_{j=1}^{n} \mathbb{I}(X_i < Y_j). \]
If we divide $W$ by the total number of $mn$ pairs, we obtain

\[ U = \frac{W}{mn} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \mathbb{I}(X_i < Y_j) \]

which is exactly the form of a two-sample U-statistic with $a = b = 1$ and kernel

\[ \phi(x; y) = \mathbb{I}(x < y) \]

so that $\theta = \mathbb{E}\left[U\right] = P(X < Y)$. $P(X < Y)$ is commonly referred to as the probabilistic index.
For more information on the probabilistic index for two continuous outcomes, check out The probabilistic index for two normally distributed outcomes.
Our previous work tells us that

\[ \text{Var}(U) \approx \frac{\sigma_{1,0}^2}{m} + \frac{\sigma_{0,1}^2}{n}. \]
The first variance component $\sigma_{1,0}^2$ can be expressed as,

\[ \sigma_{1,0}^2 = \text{Cov}\left[\phi(X_1; Y_1),~ \phi(X_1; Y_2)\right] = \text{Cov}\left[\mathbb{I}(X_1 < Y_1),~ \mathbb{I}(X_1 < Y_2)\right]. \]
Recall that covariance can be expressed in terms of expectations as,

\[ \text{Cov}(A, B) = \mathbb{E}\left[AB\right] - \mathbb{E}\left[A\right]\mathbb{E}\left[B\right] \]

so that,

\[ \sigma_{1,0}^2 = \mathbb{E}\left[\mathbb{I}(X_1 < Y_1)~\mathbb{I}(X_1 < Y_2)\right] - \mathbb{E}\left[\mathbb{I}(X_1 < Y_1)\right]\mathbb{E}\left[\mathbb{I}(X_1 < Y_2)\right]. \]
By definition,

\[ \mathbb{E}\left[\mathbb{I}(X_1 < Y_1)\right] = \mathbb{E}\left[\mathbb{I}(X_1 < Y_2)\right] = P(X < Y). \]

Now, notice that

\[ \mathbb{I}(X_1 < Y_1)~\mathbb{I}(X_1 < Y_2) = \mathbb{I}(X_1 < Y_1,~ X_1 < Y_2) \]

so that,

\[ \sigma_{1,0}^2 = P(X_1 < Y_1,~ X_1 < Y_2) - \left[P(X < Y)\right]^2. \]
Following similar logic for $\sigma_{0,1}^2$, it should be clear that we have

\[ \sigma_{1,0}^2 = P(X_1 < Y_1,~ X_1 < Y_2) - \left[P(X < Y)\right]^2 \]

and

\[ \sigma_{0,1}^2 = P(X_1 < Y_1,~ X_2 < Y_1) - \left[P(X < Y)\right]^2. \]
Under the null hypothesis $H_0: F = G$, $X$ and $Y$ have the same (continuous) distribution so that either $X_1 < Y_1$ or $X_1 > Y_1$, with the two events equally likely by symmetry, implying $P(X < Y) = \frac{1}{2}$ under $H_0$.
Similarly, there are 6 equally likely orderings of $X_1$, $X_2$, and $Y_1$ under $H_0$: (1) $X_1 < X_2 < Y_1$, (2) $X_2 < X_1 < Y_1$, (3) $X_1 < Y_1 < X_2$, (4) $X_2 < Y_1 < X_1$, (5) $Y_1 < X_1 < X_2$, and (6) $Y_1 < X_2 < X_1$. Then,

\[ P(X_1 < Y_1,~ X_2 < Y_1) = \frac{2}{6} = \frac{1}{3}. \]
Noting that the same argument applied to $X_1$, $Y_1$, and $Y_2$ gives $P(X_1 < Y_1, X_1 < Y_2) = \frac{1}{3}$, plugging these values into our expressions for $\sigma_{1,0}^2$ and $\sigma_{0,1}^2$ gives us,

\[ \sigma_{1,0}^2 = \sigma_{0,1}^2 = \frac{1}{3} - \left(\frac{1}{2}\right)^2 = \frac{1}{12}. \]
Finally,

\[ \text{Var}(U) \approx \frac{1}{12m} + \frac{1}{12n} = \frac{m+n}{12mn}. \]
Consequently, since $W = mnU$, we have

\[ \text{Var}(W) = (mn)^2~\text{Var}(U) \approx \frac{mn(m+n)}{12}. \]
In summary, our multiple-sample U-statistic theory tells us that under the null hypothesis $H_0: F = G$,

\[ U \overset{\text{approx.}}{\sim} N\left(\frac{1}{2},~ \frac{m+n}{12mn}\right) \]

and

\[ W \overset{\text{approx.}}{\sim} N\left(\frac{mn}{2},~ \frac{mn(m+n)}{12}\right). \]
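To see the approximation in action, the following R sketch simulates $U$ under $H_0$ and compares its empirical mean and variance to the values above (again, just an illustration with arbitrary $m$, $n$, and a standard normal $F = G$).

```r
# Simulation check: U = (1 / (m * n)) * sum I(X_i < Y_j) under H0: F = G.
set.seed(1)
m    <- 30
n    <- 40
reps <- 1e4
U <- replicate(reps, {
  x <- rnorm(m)
  y <- rnorm(n)
  mean(outer(x, y, "<"))   # proportion of pairs with X_i < Y_j
})

mean(U)                    # should be close to 1/2
var(U)                     # compare to the asymptotic variance below
(m + n) / (12 * m * n)     # (m + n) / (12mn)
```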
Click here to download this blog post as an RMarkdown (.Rmd) file!
