Two proportions comparison test

What is it?

  • A two proportions test, also known as a two-sample proportion test or a test of two proportions, is a statistical method for comparing the proportions observed in two independent samples and assessing whether the underlying population proportions differ significantly.

  • It applies to binary categorical data and is typically used to compare a success rate, or the frequency of a specific attribute or characteristic, between two groups.

Data

  • Sample \(k\in\left\{1, 2\right\}\): \(\left(x_{i,k}\right)_{i=1}^{n_k}\) where \(x_{i,k}\in\left\{0, 1\right\}\) is the observation of the Bernoulli variable \(X_{i,k}\) on individual \(i\) in sample \(k\).

  • Overall sample size: \(n=n_1+n_2\)

  • Observed proportions: \(p_k=\dfrac{1}{n_k}\sum_{i=1}^{n_k}X_{i,k}\)

  • Population proportions: let \(\pi_k=\mathbb{E}\left[X_{i,k}\right]\) be the probability of success in group \(k\)

Hypotheses

Null Hypothesis

  • \(\mathcal{H}_0\): \(\pi_1=\pi_2\)

Alternative Hypotheses \(\mathcal{H}_1\)

  • Two-tailed: \(\pi_1\neq \pi_2\)

  • Left-tailed: \(\pi_1 < \pi_2\)

  • Right-tailed: \(\pi_1 > \pi_2\)

Test Statistic

  • Common observed proportion: \(p=\frac{n_1p_1+n_2p_2}{n_1+n_2}\)

  • Estimated variance of \(p_1-p_2\) under \(\mathcal{H}_0\): \(S^2(p_1-p_2)=p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)\)

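  • Test statistic: \(T=\frac{p_1-p_2}{\sqrt{S^2(p_1-p_2)}} \overset{\mathcal{H}_0}{\rightarrow} \mathcal{N}(0, 1)\), that is, \(T\) is approximately standard normal under \(\mathcal{H}_0\) when \(n_1\) and \(n_2\) are large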
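As a minimal sketch of the three formulas above, the statistic can be computed from raw success counts using only the Python standard library. The function name `two_proportion_statistic` and its argument names are illustrative assumptions, not part of the original notes. The sketch assumes \(0<p<1\), so that the estimated variance is nonzero.

```python
import math


def two_proportion_statistic(successes_1, n_1, successes_2, n_2):
    """Return the pooled two-proportion statistic T (hypothetical helper)."""
    # Observed proportions p_1 and p_2.
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2

    # Common (pooled) observed proportion p = (n_1 p_1 + n_2 p_2) / (n_1 + n_2).
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    if not 0.0 < pooled < 1.0:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")

    # Estimated variance of p_1 - p_2 under the null hypothesis.
    variance = pooled * (1.0 - pooled) * (1.0 / n_1 + 1.0 / n_2)

    return (p_1 - p_2) / math.sqrt(variance)
```

For hypothetical counts of 60 successes out of 100 and 40 out of 100, this gives \(p=0.5\), \(S^2=0.005\) and \(T_{obs}=0.2/\sqrt{0.005}\approx 2.83\).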

Critical region and P-value

Critical Region

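  • \(q_{\beta}\): quantile of order \(\beta\) of \(\mathcal{N}(0, 1)\), i.e. \(\mathbb{P}\left(Z\leq q_{\beta}\right)=\beta\) for \(Z\sim\mathcal{N}(0, 1)\)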

  • Two-tailed: \(\left(-\infty,-q_{1-\alpha/2}\right)\cup\left(q_{1-\alpha/2},+\infty\right)\)

  • Left-tailed: \(\left(-\infty,q_{\alpha}\right)\)

  • Right-tailed: \(\left(q_{1-\alpha},+\infty\right)\)

P-value

  • Two-tailed: \(pValue=2\mathbb{P}\left(T>|T_{obs}|\right)\)

  • Left-tailed: \(pValue=\mathbb{P}\left(T<T_{obs}\right)\)

  • Right-tailed: \(pValue=\mathbb{P}\left(T>T_{obs}\right)\)
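A possible sketch of the three p-value formulas, treating \(T\) as exactly \(\mathcal{N}(0, 1)\) under \(\mathcal{H}_0\) (the large-sample approximation). The function name `p_value` and the string labels for the alternatives are hypothetical.

```python
from statistics import NormalDist


def p_value(t_obs, alternative="two-tailed"):
    """Return the p-value of the observed statistic under N(0, 1)."""
    cdf = NormalDist().cdf  # distribution function of N(0, 1)
    if alternative == "two-tailed":
        # 2 * P(T > |t_obs|)
        return 2.0 * (1.0 - cdf(abs(t_obs)))
    if alternative == "left-tailed":
        # P(T < t_obs)
        return cdf(t_obs)
    if alternative == "right-tailed":
        # P(T > t_obs)
        return 1.0 - cdf(t_obs)
    raise ValueError(f"unknown alternative: {alternative!r}")
```

For very large \(|T_{obs}|\), computing `1.0 - cdf(...)` loses precision; a production implementation would typically use a survival function instead.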

Decision

Decision based on Critical Region

  • Reject \(\mathcal{H}_0\) if and only if \(T_{obs}\in W\), where \(W\) is the critical region corresponding to the chosen alternative hypothesis

Decision based on P-value

  • Reject \(\mathcal{H}_0\) if and only if \(pValue<\alpha\)
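The two decision rules can be checked against each other in a short end-to-end sketch for the two-tailed case. Everything here is illustrative: the function name `two_tailed_two_proportion_test`, its return layout, and the success counts in the usage note are assumptions, not data from the original notes.

```python
import math
from statistics import NormalDist


def two_tailed_two_proportion_test(successes_1, n_1, successes_2, n_2, alpha=0.05):
    """Return (t_obs, p_value, reject_by_region, reject_by_p_value)."""
    std_normal = NormalDist()

    # Test statistic with the pooled variance estimate.
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    variance = pooled * (1.0 - pooled) * (1.0 / n_1 + 1.0 / n_2)
    t_obs = (p_1 - p_2) / math.sqrt(variance)

    # Decision based on the critical region: |t_obs| > q_{1 - alpha/2}.
    q = std_normal.inv_cdf(1.0 - alpha / 2.0)
    reject_by_region = abs(t_obs) > q

    # Decision based on the p-value: 2 * P(T > |t_obs|) < alpha.
    p_val = 2.0 * (1.0 - std_normal.cdf(abs(t_obs)))
    reject_by_p_value = p_val < alpha

    return t_obs, p_val, reject_by_region, reject_by_p_value
```

With hypothetical counts of 60/100 versus 40/100 and \(\alpha=0.05\), both rules reject \(\mathcal{H}_0\); with 50/100 versus 48/100, neither does. The two rules are equivalent by construction, so they should agree in every case.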