What inputs does the A/B Test Significance Calculator need?

It takes 5 inputs: controlVisitors, controlConversions, variantVisitors, variantConversions, and alpha (a percent, default 5). It returns 4 outputs: zScore, pValue, liftPercent, and isSignificant.

What formula does the A/B Test Significance Calculator use?

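The exact computation is: p_pooled = (xA + xB) / (nA + nB); SE = sqrt(p_pooled * (1 - p_pooled) * (1/nA + 1/nB)); z = (pB - pA) / SE; p_value = 2 * (1 - Φ(|z|)). Here A is the control and B the variant, n is the visitor count, x is the conversion count, pA = xA / nA, pB = xB / nB, and Φ is the standard normal CDF.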

Can I verify the A/B Test Significance Calculator with a worked example?

Yes. With A: 5,000 visitors / 250 conversions and B: 5,000 visitors / 300 conversions at alpha = 5%, the tool returns rateA = 5%, rateB = 6%, z ≈ 2.19, p ≈ 0.028, lift = 20%, significant at α = 5%.

Where does the A/B Test Significance Calculator get its benchmark data?

Reference data is sourced from: Agresti & Coull (1998), "Approximate Is Better than 'Exact' for Interval Estimation of Binomial Proportions", The American Statistician 52(2).

What can the A/B Test Significance Calculator not tell me?

Known limitations: Peeking inflates the false-positive rate. Fix the sample size up front or use a sequential-testing method (mSPRT, Bayesian bandit) instead. The folk claim that "90% of A/B tests are inconclusive" has no peer-reviewed source; we do not cite it. Power your experiments to detect a lift you would actually act on. The two-proportion z-test is unreliable for very small counts. For small n, use Fisher's exact test.

Methodology: A/B Test Significance Calculator

1. Scope

Runs a two-proportion z-test on binary conversion data and reports p-value, confidence interval, and observed lift. It is not a Bayesian engine and does not correct for peeking, multiple comparisons, or sequential analysis.

2. Inputs and outputs

Inputs

controlVisitors (number)
controlConversions (number)
variantVisitors (number)
variantConversions (number)
alpha (percent, default: 5): Significance threshold.

Outputs

zScore: Two-proportion z statistic.
pValue: Two-sided p-value.
liftPercent: (variantRate − controlRate) / controlRate, expressed as a percentage.
isSignificant: True iff pValue < alpha / 100 (alpha is entered as a percent).
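The inputs and the lift output listed above can be sketched as a TypeScript shape plus two small helpers. This is a minimal illustration, not the engine's actual code: the interface name `AbTestInputs` and the functions `validateInputs` and `liftPercentOf` are hypothetical, and only the field names come from the list above.

```typescript
// Hypothetical input shape mirroring the field names listed above;
// the real engine's types may differ.
interface AbTestInputs {
  controlVisitors: number;
  controlConversions: number;
  variantVisitors: number;
  variantConversions: number;
  alpha?: number; // percent, assumed to default to 5
}

// Throws if the counts cannot describe a valid experiment arm.
function validateInputs(input: AbTestInputs): void {
  const arms: Array<[string, number, number]> = [
    ["control", input.controlVisitors, input.controlConversions],
    ["variant", input.variantVisitors, input.variantConversions],
  ];
  for (const [name, visitors, conversions] of arms) {
    if (!Number.isInteger(visitors) || visitors <= 0) {
      throw new RangeError(name + ": visitors must be a positive integer");
    }
    if (!Number.isInteger(conversions) || conversions < 0 || conversions > visitors) {
      throw new RangeError(name + ": conversions must be an integer in [0, visitors]");
    }
  }
  const alpha = input.alpha ?? 5;
  if (!(alpha > 0 && alpha < 100)) {
    throw new RangeError("alpha must be a percent strictly between 0 and 100");
  }
}

// Relative lift of the variant over the control, as a percentage.
function liftPercentOf(input: AbTestInputs): number {
  const controlRate = input.controlConversions / input.controlVisitors;
  const variantRate = input.variantConversions / input.variantVisitors;
  return ((variantRate - controlRate) / controlRate) * 100;
}
```

Note that the lift is undefined when the control has zero conversions (the division yields Infinity or NaN), so a caller would need to handle that case separately.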

Engine source: src/lib/ab-test-significance-calculator/engine.ts

3. Formula / scoring logic

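pA       = xA / nA
pB       = xB / nB
p_pooled = (xA + xB) / (nA + nB)
SE       = sqrt(p_pooled * (1 - p_pooled) * (1/nA + 1/nB))
z        = (pB - pA) / SE
p_value  = 2 * (1 - Φ(|z|))

Here A is the control and B the variant, n is the visitor count, x is the conversion count, and Φ is the standard normal CDF.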
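The formula above can be sketched in TypeScript as follows. This is an illustrative sketch, not the contents of engine.ts: the function names are hypothetical, and Φ is approximated with the Abramowitz-Stegun formula 7.1.26 for erf, which is an assumption about how the normal CDF might be computed.

```typescript
// erf via Abramowitz-Stegun 7.1.26 (absolute error below about 1.5e-7).
function erfApprox(y: number): number {
  const sign = y < 0 ? -1 : 1;
  const a = Math.abs(y);
  const t = 1 / (1 + 0.3275911 * a);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-a * a));
}

// Standard normal CDF, Φ(x) = (1 + erf(x / sqrt(2))) / 2.
function normalCdf(x: number): number {
  return 0.5 * (1 + erfApprox(x / Math.SQRT2));
}

interface ZTestResult {
  zScore: number;
  pValue: number;
  isSignificant: boolean;
}

// Pooled two-proportion z-test. xA/nA are control conversions/visitors,
// xB/nB are variant conversions/visitors, alphaPercent is e.g. 5 for 5%.
function twoProportionZTest(
  xA: number, nA: number, xB: number, nB: number, alphaPercent = 5
): ZTestResult {
  const pA = xA / nA;
  const pB = xB / nB;
  const pPooled = (xA + xB) / (nA + nB);
  const se = Math.sqrt(pPooled * (1 - pPooled) * (1 / nA + 1 / nB));
  const zScore = (pB - pA) / se;
  const pValue = 2 * (1 - normalCdf(Math.abs(zScore)));
  return { zScore, pValue, isSignificant: pValue < alphaPercent / 100 };
}
```

Called as `twoProportionZTest(250, 5000, 285, 5000)`, this sketch gives z ≈ 1.555 and p ≈ 0.120, in line with the engine output shown in the worked example on this page.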

4. Assumptions

Samples are independent and randomly assigned.
Visitor counts are large enough for the normal approximation (rule of thumb: np ≥ 10 and n(1−p) ≥ 10 in both arms).
Two-sided test at a fixed alpha entered up front — no sequential-testing correction.
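The rule of thumb for the normal approximation can be checked before running the test. The sketch below assumes p is estimated by the observed rate in each arm, so np is the conversion count and n(1−p) is the non-conversion count; the function names are hypothetical.

```typescript
// Checks np >= minCount and n(1 - p) >= minCount for one arm, with p taken
// as the observed rate (an assumption), so np equals the conversion count
// and n(1 - p) equals the non-conversion count.
function armIsLargeEnough(visitors: number, conversions: number, minCount = 10): boolean {
  return conversions >= minCount && visitors - conversions >= minCount;
}

// True only when both arms satisfy the rule of thumb.
function normalApproximationOk(
  controlVisitors: number,
  controlConversions: number,
  variantVisitors: number,
  variantConversions: number
): boolean {
  return (
    armIsLargeEnough(controlVisitors, controlConversions) &&
    armIsLargeEnough(variantVisitors, variantConversions)
  );
}
```

When this check fails, the z-test result should be treated with caution and an exact method considered instead.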

5. Data sources

Agresti & Coull (1998), "Approximate Is Better than 'Exact' for Interval Estimation of Binomial Proportions", The American Statistician 52(2).

6. Known limitations

Peeking inflates the false-positive rate. Fix the sample size up front or use a sequential-testing method (mSPRT, Bayesian bandit) instead.
The folk claim that "90% of A/B tests are inconclusive" has no peer-reviewed source; we do not cite it. Power your experiments to detect a lift you would actually act on.
The two-proportion z-test is unreliable for very small counts. For small n, use Fisher's exact test.
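For the small-count case mentioned in the last limitation, a two-sided Fisher's exact test can be sketched as below. This is not part of the calculator described here; it is a minimal illustration with hypothetical function names. It uses one common two-sided convention, summing the probabilities of all tables no more likely than the observed one.

```typescript
// log(n!) by direct summation; adequate for the small counts where
// Fisher's exact test is the suggested fallback.
function logFactorial(n: number): number {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

function logChoose(n: number, k: number): number {
  return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
}

// Two-sided Fisher's exact p-value for xA conversions out of nA visitors
// versus xB conversions out of nB visitors.
function fisherExactTwoSided(xA: number, nA: number, xB: number, nB: number): number {
  const totalConversions = xA + xB;
  const total = nA + nB;
  // Hypergeometric probability that arm A holds k of the conversions,
  // with all row and column totals held fixed.
  const prob = (k: number): number =>
    Math.exp(
      logChoose(nA, k) +
        logChoose(nB, totalConversions - k) -
        logChoose(total, totalConversions)
    );
  const kMin = Math.max(0, totalConversions - nB);
  const kMax = Math.min(nA, totalConversions);
  const observed = prob(xA);
  let pValue = 0;
  for (let k = kMin; k <= kMax; k++) {
    const p = prob(k);
    // Small relative tolerance so ties are not lost to rounding.
    if (p <= observed * (1 + 1e-7)) pValue += p;
  }
  return Math.min(1, pValue);
}
```

For example, `fisherExactTwoSided(3, 4, 1, 4)` sums the table probabilities 1/70, 16/70, 16/70, and 1/70, giving 34/70 ≈ 0.486.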

7. Reproducibility

Input
A: 5,000 visitors / 250 conversions; B: 5,000 visitors / 300 conversions; alpha = 5%.

Expected output
rateA = 5%, rateB = 6%, z ≈ 2.19, p ≈ 0.028, lift = 20%, significant at α = 5%.

8. Change log

2026-04-24 methodology page first published.

Worked example

Run live against the same engine this site ships (/engines/ab-test-significance-calculator.js). The inputs and outputs below are recomputed on every build and independently re-verified in CI — they are never hand-authored.

Input

tool: ab_test_significance
visitors_a: 5000
conversions_a: 250
visitors_b: 5000
conversions_b: 285
confidence_level: 95

Output

rateA: 0.05
rateB: 0.057
relativeLift: 14
zScore: 1.5554
pValue: 0.1199
conclusion: Not Significant
confidenceLevel: 95
requiredSampleSize: 16224
powerMessage: Not significant yet. Continue the test or increase traffic to reach a reliable conclusion.

Frequently asked questions

What does the A/B Test Significance Calculator calculate?: Runs a two-proportion z-test on binary conversion data and reports p-value, confidence interval, and observed lift. It is not a Bayesian engine and does not correct for peeking, multiple comparisons, or sequential analysis.