8 Two-sample mean comparison test
8.1 What is it?
The two-sample mean comparison is based on the assumption that the data in each group or sample are independent and approximately follow a normal distribution.
It helps to assess whether the observed difference in means between the groups is statistically significant or if it could be due to random sampling variability.
8.2 Data
\(\left(x_{i,k}\right)_{i=1}^{n_k}\), where \(k=1,2\) indicates the sample
Assume that the data are realizations of two independent samples, each being independent and identically distributed:
- \(X_{i,1}, i=1,\cdots,n_1\)
- \(X_{i,2}, i=1,\cdots,n_2\)
Expectations: \(\mu_k=\mathbb{E}\left[X_{i,k}\right]\)
Population variances: \(\sigma_k^2=\mathbb{V}ar\left(X_{i,k}\right)\)
Statistics
- Sample means: \(\overline{X}_k=\dfrac{1}{n_k}\sum_{i=1}^{n_k}X_{i,k}\),
- Sample variances: \(S_k^2=\dfrac{1}{n_k}\sum_{i=1}^{n_k}\left(X_{i,k}-\overline{X}_k\right)^2\)
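As a minimal sketch, these statistics can be computed as follows (the data values are hypothetical, and the \(1/n_k\) variance convention above is kept, i.e. the biased form):

```python
from statistics import fmean

# Hypothetical measurements for two independent samples (illustrative values).
x1 = [5.1, 4.8, 5.3, 5.0, 4.9]
x2 = [4.6, 4.7, 4.5, 4.9]

# Sample means: Xbar_k = (1/n_k) * sum_i X_{i,k}
xbar1, xbar2 = fmean(x1), fmean(x2)

# Sample variances with the 1/n_k convention used above (biased form).
s2_1 = sum((x - xbar1) ** 2 for x in x1) / len(x1)
s2_2 = sum((x - xbar2) ** 2 for x in x2) / len(x2)
```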
8.3 Hypothesis
8.3.1 Null Hypothesis \(\mathcal{H}_0\)
- \(\mu_1 = \mu_2\)
8.3.2 Alternative Hypothesis \(\mathcal{H}_1\)
Two-tailed test: \(\mu_1\neq \mu_2\)
Left-tailed test: \(\mu_1< \mu_2\)
Right-tailed test: \(\mu_1> \mu_2\)
8.4 Test statistic
- The test statistic is constructed around \(\overline{X}_1-\overline{X}_2\)
8.4.1 Variance of \(\overline{X}_1-\overline{X}_2\)
Several cases are distinguished:
Known variances: \(S^2\left(\overline{X}_1-\overline{X}_2\right)=\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}\)
Unknown and equal variances, and large sample sizes: \(S^2\left(\overline{X}_1-\overline{X}_2\right)=\dfrac{n_1S_1^2+n_2S_2^2}{n_1+n_2}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)\)
Unknown and unequal variances, and large sample sizes: \(S^2\left(\overline{X}_1-\overline{X}_2\right)=\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}\)
Unknown and equal variances, Gaussian samples of small sizes: \(S^2\left(\overline{X}_1-\overline{X}_2\right)=\dfrac{n_1S_1^2+n_2S_2^2}{n_1+n_2-2}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)\)
Unknown and unequal variances, Gaussian samples of small sizes: \(S^2\left(\overline{X}_1-\overline{X}_2\right)=\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}\) and \(dof=\dfrac{S^4\left(\overline{X}_1-\overline{X}_2\right)}{\dfrac{S_1^4}{n_1^2(n_1-1)}+\dfrac{S_2^4}{n_2^2(n_2-1)}}\)
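The four estimated-variance cases can be gathered into a single helper; this is a sketch for illustration (the function name and its flags are hypothetical, not part of the text), returning the degrees of freedom only in the Welch-type case:

```python
def var_diff(s2_1, s2_2, n1, n2, equal_var, small_gaussian):
    """Estimated variance of Xbar1 - Xbar2, following the cases above.

    s2_1, s2_2 are the sample variances in the 1/n_k convention.
    Returns (variance, dof); dof is None unless variances are unknown,
    unequal, and the samples are small Gaussian samples.
    """
    if equal_var:
        # Pooled variance: divisor n1+n2 for large samples,
        # n1+n2-2 for small Gaussian samples.
        denom = (n1 + n2 - 2) if small_gaussian else (n1 + n2)
        pooled = (n1 * s2_1 + n2 * s2_2) / denom
        return pooled * (1 / n1 + 1 / n2), None
    v = s2_1 / n1 + s2_2 / n2
    if small_gaussian:
        # Welch-type degrees of freedom from the formula above.
        dof = v ** 2 / (s2_1 ** 2 / (n1 ** 2 * (n1 - 1))
                        + s2_2 ** 2 / (n2 ** 2 * (n2 - 1)))
        return v, dof
    return v, None
```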
8.4.2 Statistic
\(T=\dfrac{\overline{X}_1-\overline{X}_2}{\sqrt{S^2\left(\overline{X}_1-\overline{X}_2\right)}}\)
Asymptotic or exact distribution of the test statistic under \(\mathcal{H}_0\):
- Unknown and unequal variances, Gaussian samples of small sizes: \(\mathcal{T}_{dof}\) (Student's t distribution with \(dof\) degrees of freedom)
- Otherwise: \(\mathcal{N}\left(0, 1\right)\)
- In the following, we denote:
  - \(\mathcal{L}\) as the exact or asymptotic distribution of the test statistic
  - \(q_{\alpha}\) as the \(\alpha\)-level quantile of \(\mathcal{L}\)
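As an illustration, \(T\) and the Welch-type \(dof\) can be computed directly from these formulas; the data values are hypothetical and the \(1/n_k\) variance convention of Section 8.2 is kept:

```python
from statistics import fmean

# Hypothetical small samples with possibly unequal variances.
x1 = [5.1, 4.8, 5.3, 5.0, 4.9]
x2 = [4.6, 4.7, 4.5, 4.9]
n1, n2 = len(x1), len(x2)

xbar1, xbar2 = fmean(x1), fmean(x2)
# Sample variances with the 1/n_k convention of Section 8.2.
s2_1 = sum((x - xbar1) ** 2 for x in x1) / n1
s2_2 = sum((x - xbar2) ** 2 for x in x2) / n2

# S^2(Xbar1 - Xbar2) = S1^2/n1 + S2^2/n2, then T and the Welch dof.
v = s2_1 / n1 + s2_2 / n2
T = (xbar1 - xbar2) / v ** 0.5
dof = v ** 2 / (s2_1 ** 2 / (n1 ** 2 * (n1 - 1))
                + s2_2 ** 2 / (n2 ** 2 * (n2 - 1)))
```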
8.5 Critical region and P-value
8.5.1 Critical region
Two-tailed test: \(W=\left]-\infty, q_{\alpha/2}\right[\cup\left]q_{1-\alpha/2}, +\infty\right[\)
Left-tailed test: \(W=\left]-\infty, q_{\alpha}\right[\)
Right-tailed test: \(W=\left]q_{1-\alpha}, +\infty\right[\)
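For the large-sample case where \(\mathcal{L}=\mathcal{N}(0,1)\), the two-tailed critical region can be sketched with the standard library (the function name is illustrative):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()                      # L = N(0, 1), large-sample case
q_low = std_normal.inv_cdf(alpha / 2)          # q_{alpha/2}
q_high = std_normal.inv_cdf(1 - alpha / 2)     # q_{1-alpha/2}

def in_critical_region_two_tailed(t_obs: float) -> bool:
    """True iff t_obs lies in W = ]-inf, q_{alpha/2}[ U ]q_{1-alpha/2}, +inf[."""
    return t_obs < q_low or t_obs > q_high
```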
8.5.2 P-Value
Two-tailed test: \(pValue=2\mathbb{P}\left(T>|T_{obs}|\right)\)
Left-tailed test: \(pValue=\mathbb{P}\left(T<T_{obs}\right)\)
Right-tailed test: \(pValue=\mathbb{P}\left(T>T_{obs}\right)\)
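Under the asymptotic \(\mathcal{N}(0,1)\) law, the three p-values follow directly from the standard normal CDF (the observed value of \(T\) below is hypothetical):

```python
from statistics import NormalDist

Phi = NormalDist().cdf   # CDF of N(0, 1), the asymptotic law of T under H0
t_obs = 2.1              # hypothetical observed test statistic

p_two = 2 * (1 - Phi(abs(t_obs)))   # two-tailed: 2 P(T > |T_obs|)
p_left = Phi(t_obs)                 # left-tailed: P(T < T_obs)
p_right = 1 - Phi(t_obs)            # right-tailed: P(T > T_obs)
```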
8.6 Decision
8.6.1 Based on critical region
- Reject \(\mathcal{H}_0\) if and only if \(T_{obs}\in W\)
8.6.2 Based on \(pValue\)
- Reject \(\mathcal{H}_0\) if and only if \(pValue < \alpha\)
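Putting the steps together, here is an end-to-end sketch of the two-tailed test in the large-sample, unequal-variance case (normal approximation); the function name is illustrative and the code simply chains the formulas of this section:

```python
from statistics import NormalDist, fmean

def two_mean_test(x1, x2, alpha=0.05):
    """Two-tailed two-sample mean comparison, large-sample case.

    Uses the 1/n_k variance convention of Section 8.2 and the
    N(0, 1) approximation for the law of T under H0.
    Returns (t_obs, p_value, reject_h0).
    """
    n1, n2 = len(x1), len(x2)
    m1, m2 = fmean(x1), fmean(x2)
    s2_1 = sum((x - m1) ** 2 for x in x1) / n1
    s2_2 = sum((x - m2) ** 2 for x in x2) / n2
    # T = (Xbar1 - Xbar2) / sqrt(S1^2/n1 + S2^2/n2)
    t_obs = (m1 - m2) / (s2_1 / n1 + s2_2 / n2) ** 0.5
    # Two-tailed p-value: 2 P(T > |T_obs|); reject H0 iff p < alpha.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_obs)))
    return t_obs, p_value, p_value < alpha
```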