Experiment Velocity

Metricuno

May 17, 2026

Experiment velocity, the number of tests concluded per month, is the strongest predictor of CRO program ROI. See benchmarks, the compounding formula, and how to raise it.

Quick answer

Experiment velocity is the number of tests your team ships per month. It compounds with win rate and average lift to determine almost all of your CRO program's annual return.

Definition

Experimentation

Experiment Velocity

The number of controlled experiments (A/B, multivariate, or split-URL tests) an organization ships per period, usually counted per month.

Experiment velocity measures throughput, not effort: a test only counts once it has reached its pre-registered sample size and produced a decision (win, loss, or inconclusive). Pages launched, hypotheses written, and tests still warming up don't count.

It matters because CRO returns compound across three multipliers — velocity, win rate, and average lift per winner. Doubling velocity from 2 to 4 monthly tests typically does more for annual revenue than chasing a higher win rate, because more shots on goal produce more winners and more learning per quarter. It's the single metric most predictive of program ROI.

Also known as

test velocity

testing cadence

experimentation throughput

Most online stores in the €1M–€15M revenue band run between 1 and 3 tests per month. The teams pulling away from that pack — usually 6 to 10 tests per month — aren't smarter; they've removed the bottlenecks that turn a hypothesis into a shipped variant.

Those bottlenecks are almost always the same four: dev queue for variant code, ambiguous traffic allocation between concurrent tests, slow time-to-significance on low-traffic pages, and a hypothesis backlog written from gut feel rather than real funnel drop-off. Fix those and velocity roughly doubles within a quarter.

Formula

Annual Revenue Lift ≈ Baseline Revenue × (1 + Velocity × Win Rate × Avg Lift)^12 − Baseline Revenue

Variables

Velocity (Experiment Velocity): Tests concluded per month

Win Rate: Share of concluded tests that produce a statistically significant winner

Avg Lift (Average Lift per Winner): Mean conversion-rate improvement of winning variants, expressed as a decimal

Baseline Revenue (Monthly Baseline Revenue): Revenue from the surface being tested before any wins are deployed

Worked example

A Shopify apparel store doing €500,000/month in product-page revenue ships 4 tests per month, with a 25% win rate and 5% average lift per winner.

Velocity: 4 tests/month

Win Rate: 25%

Avg Lift: 5%

Baseline monthly revenue: €500,000

→ Roughly €398,000 in incremental annual revenue: the effective monthly improvement rate is 4 × 0.25 × 0.05 = 5%, and 1.05^12 ≈ 1.796, so €500,000 × 0.796 ≈ €398,000.

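Halving velocity to 2 tests/month with the same win rate and lift drops the effective monthly rate to 2.5% and the incremental return to roughly €172,000 (1.025^12 ≈ 1.345), less than half the original figure because the monthly gains compound. The velocity multiplier matters more than the lift multiplier at typical DTC ranges.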
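The arithmetic in the worked example can be checked with a few lines of Python. This is a minimal sketch that evaluates the formula exactly as written above, with the monthly baseline plugged in; the function name `compounded_lift` and its parameter names are illustrative choices, not part of any tool.

```python
def compounded_lift(baseline_monthly, velocity, win_rate, avg_lift, months=12):
    """Evaluate Baseline x (1 + Velocity x Win Rate x Avg Lift)^months - Baseline."""
    # Effective improvement rate per month, e.g. 4 * 0.25 * 0.05 = 0.05
    monthly_rate = velocity * win_rate * avg_lift
    return baseline_monthly * ((1 + monthly_rate) ** months - 1)


# Inputs from the worked example: EUR 500,000/month, 25% win rate, 5% average lift
four_tests = compounded_lift(500_000, velocity=4, win_rate=0.25, avg_lift=0.05)
two_tests = compounded_lift(500_000, velocity=2, win_rate=0.25, avg_lift=0.05)

print(f"4 tests/month: €{four_tests:,.0f}")  # about €397,900
print(f"2 tests/month: €{two_tests:,.0f}")   # about €172,400
```

Because the three multipliers are combined into one monthly rate before compounding, halving velocity cuts the result by more than half.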

Use the benchmarks below to locate your program. The figures reflect tests that actually concluded with a decision — not launched, not paused, not still gathering data. Counting launches inflates velocity and disguises a backlog of inconclusive tests.

Benchmark

Monthly experiment velocity benchmarks for online retail by revenue tier

Annual revenue	Lagging	Median	Top quartile
€1M – €3M	0–1 tests/mo	1–2 tests/mo	3–4 tests/mo
€3M – €7M	1–2 tests/mo	2–4 tests/mo	5–7 tests/mo
€7M – €15M	2–3 tests/mo	4–6 tests/mo	8–12 tests/mo
€15M+	3–5 tests/mo	6–10 tests/mo	15+ tests/mo

If you're below the median for your tier, the fix is rarely "more hypotheses." It's removing the dev dependency on variant code, prioritising pages with enough traffic to reach significance in under two weeks, and grounding the backlog in observed funnel drop-off rather than opinion. Velocity is an operational metric — treat it like one and review it weekly alongside your broader experimentation strategy.

Frequently asked questions

How is experiment velocity calculated?

Count the tests that reached their predetermined sample size and produced a decision (winner, loser, or inconclusive) within the period, usually a month. Tests still running or stopped early don't count. Most teams average the trailing three months to smooth out launch lulls.
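As a rough illustration of that counting rule, the sketch below tallies concluded tests per month from a small test log and averages the trailing three months. The record fields (`concluded_on`, `reached_sample_size`, `decision`) and the sample data are hypothetical; a real testing tool will export something differently shaped.

```python
from collections import Counter
from datetime import date

DECISIONS = {"win", "loss", "inconclusive"}


def monthly_velocity(tests):
    """Count tests per (year, month) that hit sample size and recorded a decision."""
    counts = Counter()
    for t in tests:
        if t["reached_sample_size"] and t["decision"] in DECISIONS:
            d = t["concluded_on"]
            counts[(d.year, d.month)] += 1
    return counts


def trailing_average(counts, months):
    """Average concluded tests over the given (year, month) keys; empty months count as 0."""
    return sum(counts.get(m, 0) for m in months) / len(months)


# Hypothetical log: one test stopped early, one still running
test_log = [
    {"concluded_on": date(2026, 2, 10), "reached_sample_size": True, "decision": "win"},
    {"concluded_on": date(2026, 2, 24), "reached_sample_size": True, "decision": "inconclusive"},
    {"concluded_on": date(2026, 3, 15), "reached_sample_size": False, "decision": "loss"},
    {"concluded_on": date(2026, 4, 5), "reached_sample_size": True, "decision": "loss"},
    {"concluded_on": None, "reached_sample_size": False, "decision": None},
]

counts = monthly_velocity(test_log)
avg = trailing_average(counts, [(2026, 2), (2026, 3), (2026, 4)])
print(f"Trailing three-month velocity: {avg:.1f} tests/month")
```

Passing the months explicitly keeps a month with zero concluded tests in the average rather than silently dropping it.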

What's a good experiment velocity for a Shopify store doing €5M/year?

For the €3M–€7M revenue tier, the median is 2–4 concluded tests per month; top-quartile programs ship 5–7. Below 2 per month you're leaving compounding returns on the table; above 7 per month you'll need real traffic discipline to avoid concurrent-test interference.

Is experiment velocity more important than win rate?

At typical online-retail win rates (15–30%), yes. Doubling velocity from 2 to 4 monthly tests adds more annual revenue than pushing win rate from 25% to 35%, because more shots on goal also generate more learnings that improve future hypotheses.
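The comparison can be put through the compounding formula from earlier in the article. The sketch below assumes the worked example's €500,000 monthly baseline and 5% average lift, neither of which this answer states, and it captures only the direct multiplier effect, not the extra learnings.

```python
def compounded_lift(baseline_monthly, velocity, win_rate, avg_lift, months=12):
    """Baseline x (1 + Velocity x Win Rate x Avg Lift)^months - Baseline."""
    rate = velocity * win_rate * avg_lift
    return baseline_monthly * ((1 + rate) ** months - 1)


BASELINE = 500_000  # assumed monthly baseline, borrowed from the worked example
AVG_LIFT = 0.05     # assumed average lift per winner

# (velocity in tests/month, win rate) for each scenario
scenarios = {
    "current": (2, 0.25),
    "double_velocity": (4, 0.25),
    "higher_win_rate": (2, 0.35),
}

results = {
    name: compounded_lift(BASELINE, velocity, win_rate, AVG_LIFT)
    for name, (velocity, win_rate) in scenarios.items()
}

for name, value in results.items():
    print(f"{name}: €{value:,.0f}")
```

Under these assumptions doubling velocity comes out ahead, since it doubles the monthly rate while moving win rate from 25% to 35% raises it by a factor of 1.4.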

How do I increase experiment velocity without burning out my team?

The leverage points are tooling and prioritisation, not headcount. Use a visual editor for variant code (which eliminates the dev queue), focus on high-traffic surfaces that conclude in under two weeks, and kill your bottom-quartile hypothesis backlog so the team only builds tests with a credible mechanism.

Can you run too many tests at once?

Yes. Running concurrent tests on overlapping surfaces creates interaction effects that bias results. The practical ceiling on a single conversion funnel is around 3–4 simultaneous tests, with non-overlapping audiences or pages. Above that, segment by traffic source or route.

Does experiment velocity apply to low-traffic stores?

Yes, but you'll measure it differently. Stores under ~50,000 monthly sessions usually can't conclude more than 1–2 tests per month at standard 95% confidence. Either test bigger changes (10%+ expected lift), test higher in the funnel where traffic is larger, or accept slower cadence.

How does experiment velocity fit into a broader experimentation strategy?

Velocity is the operational output; experimentation strategy decides what to test and why. A high-velocity program with a weak strategy ships lots of inconclusive tests on low-impact surfaces. A great strategy with low velocity is a slideshow. You need both, measured separately.

What's the difference between launched tests and concluded tests?

A launched test is live; a concluded test has a decision recorded against it. Only the latter counts toward velocity. Teams that report on launches tend to accumulate a long tail of underpowered, never-decided tests that look like activity but produce no learnings or wins.

How long until increased velocity shows up in revenue?

Winners typically deploy 1–2 weeks after a test concludes, then compound monthly. Expect a visible revenue signal 60–90 days after raising velocity, and a clear year-over-year delta after 6 months of sustained higher cadence.

Should agencies report experiment velocity to clients?

Yes, alongside win rate and average lift, never alone. Velocity reported in isolation incentivises shipping low-quality tests to hit a number. The honest scorecard is velocity × win rate × avg lift, which maps directly to revenue impact and is harder to game.
