Statistical significance test (A/B testing)
A statistical analysis method used to determine whether the results observed in a test (such as an A/B test) are sufficiently reliable to be attributed to a real effect rather than to chance or random fluctuation. It is a fundamental tool for validating or rejecting a hypothesis, based on quantitative data.
🎯 Objective:
Assess whether the difference between two (or more) variants is statistically significant, i.e. unlikely to be due to chance. This supports well-founded decisions and minimizes false positives (wrongly concluding that a variation is better).
🔍 How it works:
The test is based on two hypotheses:
- Null hypothesis (H₀): there is no real difference between the variants tested.
- Alternative hypothesis (H₁): there is a significant difference.
A p-value is then calculated: the probability of obtaining a difference at least as great as that observed, if the null hypothesis were true.
→ If the p-value is below the significance level (usually 0.05), we reject the null hypothesis → the difference is considered statistically significant.
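To make this concrete, here is a minimal sketch of a two-proportion z-test on conversion rates using SciPy; the visitor and conversion counts are hypothetical, purely for illustration.

```python
# Minimal sketch: two-proportion z-test for an A/B test on conversion rates.
# The traffic and conversion numbers are hypothetical assumptions.
from scipy.stats import norm

# Hypothetical results: conversions and visitors for control (A) and variant (B)
conv_a, n_a = 480, 10_000   # control: 4.8% conversion rate
conv_b, n_b = 560, 10_000   # variant: 5.6% conversion rate

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pooled = (conv_a + conv_b) / (n_a + n_b)          # common rate under H0 (no difference)

# Standard error of the difference, assuming H0 is true
se = (p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b)) ** 0.5
z = (p_b - p_a) / se

# Two-sided p-value: probability of a difference at least this large if H0 were true
p_value = 2 * norm.sf(abs(z))

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Statistically significant at the 5% level -> reject H0")
else:
    print("Not significant -> cannot reject H0")
```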
📊 Associated indicators:
- P-value: measure of surprise; the lower it is, the stronger the evidence that the effect is real rather than due to chance.
- Confidence level (often 95%): the complement of the significance level; at 95%, the test accepts at most a 5% risk of declaring a difference that does not exist.
- Statistical power: ability of a test to detect a real effect if it exists, often set at 80% or more.
- Minimum Detectable Effect (MDE): the smallest difference the test is sized to detect; the smaller the expected effect, the more traffic is needed to reach a conclusion.
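The relationship between MDE, power and traffic can be sketched with the standard normal-approximation sample-size formula for comparing two proportions; the baseline rate and MDE below are hypothetical assumptions.

```python
# Rough sketch: per-variant sample size for a two-proportion test (normal approximation).
# Baseline rate and MDE are hypothetical.
from scipy.stats import norm

baseline = 0.05        # current conversion rate (assumed)
mde = 0.005            # minimum detectable effect: +0.5 percentage point, absolute
alpha = 0.05           # significance level (two-sided)
power = 0.80           # desired statistical power

p1, p2 = baseline, baseline + mde
z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
z_beta = norm.ppf(power)            # ~0.84 for power = 0.80

# Standard sample-size formula per group for two proportions
numerator = (z_alpha * (2 * ((p1 + p2) / 2) * (1 - (p1 + p2) / 2)) ** 0.5
             + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
n_per_variant = numerator / (p2 - p1) ** 2

print(f"Visitors needed per variant: {n_per_variant:.0f}")
# Halving the MDE roughly quadruples the required traffic.
```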
🧪 CRO application:
In an A/B or multivariate test, the statistical significance test is essential to:
- validate a variation as "winning" or not,
- avoid interpretation errors due to noise or tests stopped too early (p-hacking; see the simulation sketch after this list),
- ensure that results are generalizable, not simply linked to a particular period or segment.
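To illustrate the "stopped too early" point, here is a small simulation sketch in which both variants share the same true conversion rate, so every declared winner is by construction a false positive; the rate, checkpoints and sample sizes are arbitrary assumptions.

```python
# Simulation sketch: "peeking" (declaring a winner as soon as p < 0.05) inflates
# the false-positive rate compared with testing once at a fixed horizon.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
true_rate = 0.05          # identical for A and B: H0 is true by construction
n_total = 10_000          # visitors per variant per simulated test
checkpoints = range(1_000, n_total + 1, 1_000)   # peek every 1,000 visitors
n_sims = 1_000

def p_value(conv_a, conv_b, n):
    """Two-sided p-value of a two-proportion z-test with n visitors per variant."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = (p_pool * (1 - p_pool) * 2 / n) ** 0.5
    if se == 0:
        return 1.0
    z = (conv_b / n - conv_a / n) / se
    return 2 * norm.sf(abs(z))

false_pos_peeking = 0
false_pos_fixed = 0
for _ in range(n_sims):
    a = rng.random(n_total) < true_rate
    b = rng.random(n_total) < true_rate
    # Peeking: stop at the first checkpoint where p < 0.05
    if any(p_value(a[:n].sum(), b[:n].sum(), n) < 0.05 for n in checkpoints):
        false_pos_peeking += 1
    # Fixed horizon: a single test at the planned sample size
    if p_value(a.sum(), b.sum(), n_total) < 0.05:
        false_pos_fixed += 1

print(f"False-positive rate with peeking:     {false_pos_peeking / n_sims:.1%}")
print(f"False-positive rate at fixed horizon: {false_pos_fixed / n_sims:.1%}")
```

With repeated peeks, the apparent false-positive rate climbs well above the nominal 5%, which is why a test should run to its planned sample size (or use a sequential method designed for interim looks).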