How to use PIE Framework

Metricuno

May 17, 2026

The PIE framework scores CRO test ideas on Potential, Importance, and Ease. Learn how it works, when to use it over ICE, and how to score consistently.

Quick answer

PIE scores CRO test ideas on Potential, Importance, and Ease — swapping ICE's "confidence" for "importance" so teams argue about page strategy, not gut feel.

Definition

Experiment Prioritization

PIE Framework

A CRO prioritization model that scores test ideas on Potential, Importance, and Ease — popularised by WiderFunnel.

PIE is a scoring framework for ranking conversion-rate optimization experiments. Each idea gets a 1–10 score on three dimensions — Potential (how much lift is realistically available on this page), Importance (how strategically valuable that page is to the business), and Ease (how cheap it is to build, ship, and analyse). The three scores are averaged to produce a single priority number.

The framework was developed by WiderFunnel founder Chris Goward and described in his book You Should Test That! Its closest cousin is ICE, but PIE replaces ICE's subjective "Confidence" score with "Importance", a strategic lens that forces the team to argue about where effort matters most, not just where they feel lucky.

Also known as

Potential Importance Ease

PIE score

Goward PIE

If your test backlog has 60 ideas and your dev sprint has room for three, you need a way to choose that survives scrutiny from the head of e-commerce, the brand lead, and the developer who has to build it. PIE is one of the most durable answers to that problem.

It sits inside the broader practice of experiment prioritization, alongside ICE and PXL. PIE earns its place when your team disagrees less about whether an idea will work and more about whether the page is worth touching at all.

How the three scores work

Potential answers: how much room to improve is left on this page? A product detail page converting at 1.2% with a noisy layout has high potential. A checkout already at 78% completion does not — there's a ceiling, and you're close to it.

Importance answers: how much business value flows through this surface? A homepage that 80% of paid traffic lands on is high-importance even if its conversion rate is already decent. A help-centre article seen by 200 visitors a month is not.

Ease answers: how cheap is the test, end-to-end? That includes design, dev, QA across themes, mobile parity, the runtime needed to reach significance, and the analyst hours to read the result. A copy swap on a Shopify section scores 9. A full checkout re-architecture scores 2.

Score Ease honestly, including runtime

Teams routinely score Ease on build cost alone and forget about runtime. A page with 4,000 weekly sessions can need six weeks or more to detect a 10% relative lift, and considerably longer when the baseline conversion rate is low. "Easy to build, slow to read" is still hard. Bake required test duration into the Ease score and your roadmap stops over-promising.
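To put a rough number on runtime before scoring Ease, a standard two-proportion sample-size approximation is enough. The sketch below is an assumption-laden estimate, not a replacement for your testing tool's calculator: it assumes a two-sided test at 5% significance and 80% power, an even traffic split, and hypothetical baseline conversion rates. The function names are illustrative.

```python
import math
from statistics import NormalDist


def sessions_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate sessions needed per arm for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def weeks_to_read(baseline, relative_lift, weekly_sessions, arms=2):
    """Whole weeks of traffic needed when sessions are split evenly across arms."""
    needed = sessions_per_arm(baseline, relative_lift) * arms
    return math.ceil(needed / weekly_sessions)


# Hypothetical baselines: same traffic, same target lift, very different runtimes.
for baseline in (0.02, 0.05, 0.12):
    weeks = weeks_to_read(baseline, relative_lift=0.10, weekly_sessions=4_000)
    print(f"baseline {baseline:.0%}: about {weeks} weeks")
```

The lower the baseline conversion rate, the longer the same 10% relative lift takes to detect, which is why a low-converting page can deserve a lower Ease score than its build cost suggests.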

Scoring a real backlog

The mechanic is simple: every team member scores each idea 1–10 on the three dimensions, scores get averaged across reviewers, and the final PIE score is (P + I + E) / 3. Rank descending. The top of the list is your next sprint.
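The arithmetic above can be sketched in a few lines. This is a minimal illustration, not WiderFunnel's tooling: the `Rating` class, the idea names, and every score are made up for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rating:
    reviewer: str
    potential: int   # 1-10
    importance: int  # 1-10
    ease: int        # 1-10


def pie_score(ratings):
    """Average each dimension across reviewers, then average the three dimensions."""
    p = mean(r.potential for r in ratings)
    i = mean(r.importance for r in ratings)
    e = mean(r.ease for r in ratings)
    return round((p + i + e) / 3, 2)


# Hypothetical backlog with two reviewers per idea.
backlog = {
    "PDP gallery redesign": [Rating("analyst", 8, 7, 5), Rating("designer", 7, 9, 6)],
    "Help-centre article CTA": [Rating("analyst", 6, 2, 9), Rating("designer", 5, 3, 9)],
    "Cart free-shipping banner": [Rating("analyst", 6, 8, 9), Rating("designer", 7, 8, 8)],
}

# Rank descending: the top of the list is the next sprint candidate.
ranked = sorted(backlog, key=lambda idea: pie_score(backlog[idea]), reverse=True)
for idea in ranked:
    print(f"{pie_score(backlog[idea]):5.2f}  {idea}")
```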

What matters more than the arithmetic is the calibration conversation. The first time a team scores together, you'll see a 6-point spread on Importance for the same page — that gap is the actual signal. It surfaces a disagreement about strategy that was hiding behind a disagreement about tests.
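One way to make that gap visible is to compute the spread per dimension before averaging anything. The sketch below assumes a discussion threshold of 4 points, which is an arbitrary choice for illustration; the reviewer names and scores are hypothetical.

```python
DIMENSIONS = ("potential", "importance", "ease")


def score_spread(ratings):
    """Return max-minus-min per dimension across all reviewers of one idea."""
    return {
        dim: max(r[dim] for r in ratings.values()) - min(r[dim] for r in ratings.values())
        for dim in DIMENSIONS
    }


def needs_discussion(ratings, threshold=4):
    """List the dimensions where reviewers disagree by at least `threshold` points."""
    return [dim for dim, gap in score_spread(ratings).items() if gap >= threshold]


# Hypothetical private scores for a single idea, revealed together.
homepage_hero = {
    "head_of_brand": {"potential": 6, "importance": 9, "ease": 7},
    "analyst": {"potential": 5, "importance": 4, "ease": 6},
    "developer": {"potential": 6, "importance": 7, "ease": 5},
}

print(score_spread(homepage_hero))
print(needs_discussion(homepage_hero))
```

Flagging the dimension rather than the idea keeps the meeting focused: here the disagreement is about how much the page matters, not about how hard the test is to build.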

Chart: Sample PIE scores across a 6-idea backlog (apparel Shopify store)

Notice the spread. The top idea isn't twice as good as the bottom one in any single dimension, but combined across all three, the gap is decisive. That's PIE doing its job — turning a tied argument into a ranked queue.

PIE vs ICE: when to use which

ICE (Impact, Confidence, Ease) is the more common framework, especially in growth teams that came up through Sean Ellis-style methodology. Its "Confidence" score asks how likely a specific variant is to win. PIE's "Importance" asks something different: how much should we care about this page at all?

Use ICE when your team agrees on the roadmap but argues about which variants to ship. Use PIE when you're earlier in the process — when the question is which surfaces deserve research effort in the first place, before any specific variant has been designed.

Benchmark

PIE vs ICE: structural differences

Dimension	PIE	ICE
First score	Potential (lift available)	Impact (lift expected)
Second score	Importance (page strategic value)	Confidence (belief variant wins)
Third score	Ease (cost to ship + read)	Ease (cost to ship)
Best for	Choosing which pages to research	Choosing between drafted variants
Failure mode	Importance scores cluster at 7-8	Confidence inflates with seniority
Typical user	CRO consultants, agencies	In-house growth teams

Neither framework is statistically rigorous — both are forcing functions for conversation. The value isn't in the number; it's in the discussion that produces the number. A team that scores together every two weeks calibrates faster than one chasing a perfect formula.

Common scoring mistakes

The most common failure mode is conflating Potential with Importance. A low-traffic page with terrible UX has high Potential (lots of room to fix) but low Importance (few people see it). If you collapse the two, you'll prioritise fixing obscure pages over moving the homepage.

The second mistake is letting one loud voice dominate scoring. Have each scorer fill in their numbers privately, then reveal — the spread itself is the conversation. If the head of brand scores Importance as 9 and the analyst scores it as 4, that's a real strategic gap worth resolving before any test runs.

PIE is not a substitute for funnel data

Scoring Potential on intuition alone will land you on whichever page looked ugliest in the last design review. Anchor Potential in real drop-off rates — exit rate, scroll depth, micro-conversion gaps. If you've imported historical GA4 data, use it: the funnel tells you where the lift actually is.
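As a starting point for anchoring Potential in data, step-to-step drop-off can be computed from funnel counts. The step labels and session counts below are invented for illustration and are not tied to any particular analytics export; a large drop-off is a prompt for research, not proof of available lift.

```python
# Hypothetical funnel: (step label, sessions reaching that step).
funnel = [
    ("product_view", 12_000),
    ("add_to_cart", 1_800),
    ("begin_checkout", 900),
    ("purchase", 630),
]


def step_dropoff(funnel):
    """Return (transition, drop-off rate) pairs, largest drop-off first."""
    rows = []
    for (step, entered), (next_step, continued) in zip(funnel, funnel[1:]):
        rows.append((f"{step} -> {next_step}", round(1 - continued / entered, 3)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


for transition, rate in step_dropoff(funnel):
    print(f"{rate:6.1%}  {transition}")
```

Some transitions lose most visitors on any store, so compare each rate against your own history or a peer benchmark before translating it into a Potential score.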

Frequently asked questions

Who created the PIE framework?

Chris Goward, founder of the CRO agency WiderFunnel, formalised PIE in his 2012 book You Should Test That! It grew out of agency practice: a way to defend test prioritisation choices to clients with conflicting opinions about their own sites.

What's the difference between PIE and ICE?

PIE scores Potential, Importance, and Ease. ICE scores Impact, Confidence, and Ease. The functional difference is the middle letter: PIE's Importance is about the page's strategic value, while ICE's Confidence is about belief in a specific variant winning. PIE works earlier in the process; ICE works once variants are drafted.

Should we average scores or sum them?

Either works mathematically: the ranking is identical. Averaging on a 1-10 scale (result 1-10) is more intuitive to read than summing (result 3-30). WiderFunnel's original presentation uses the average. Pick one and stay consistent so historical scores remain comparable.
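A quick check of that claim, using made-up scores: dividing every total by the same constant cannot change the order, so the two rankings match.

```python
# Hypothetical (Potential, Importance, Ease) scores for three ideas.
ideas = {"A": (8, 7, 5), "B": (6, 8, 9), "C": (5, 3, 9)}

by_sum = sorted(ideas, key=lambda name: sum(ideas[name]), reverse=True)
by_avg = sorted(ideas, key=lambda name: sum(ideas[name]) / 3, reverse=True)

print(by_sum)
print(by_avg)
```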

How many people should score each idea?

Three to five is the sweet spot: enough perspective to surface disagreement, few enough to make the meeting tractable. Include at least one person from analytics, one from design or product, and one from a commercial function. Solo scoring by the CRO lead defeats the purpose.

How often should we re-score the backlog?

Re-score the top 10-15 ideas every two weeks alongside sprint planning. The full backlog can be re-scored quarterly. Scores drift as you learn: a Potential of 8 from January looks different after you've shipped two losing tests on the same page.

Does PIE work for non-Shopify stores?

Yes. The framework is platform-agnostic. The only platform-sensitive dimension is Ease, since a copy change on Shopify is cheaper than the equivalent on a custom WooCommerce theme. Bake your platform's real cost-to-ship into the Ease score and the rest of the model translates directly.

Can PIE replace a research process?

No. PIE prioritises ideas you've already generated; it doesn't generate them. Pair it with qualitative research (session replay, surveys, support tickets) and quantitative funnel analysis. The framework ranks the output of research; it isn't a substitute for doing the research.

What's a good threshold for shipping a PIE-scored test?

Most teams ship ideas scoring 7.0 or higher and shelve anything below 5.0. The 5.0-7.0 band is the interesting one: those ideas usually need either more research (to raise Potential) or a smaller scope (to raise Ease) before they're ready.
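Those cut-offs translate into a simple triage rule. The sketch below treats 7.0 and 5.0 as adjustable defaults rather than fixed constants, and the label names are illustrative.

```python
def triage(score, ship_at=7.0, shelve_below=5.0):
    """Bucket a PIE score: ship, shelve, or refine (more research or smaller scope)."""
    if score >= ship_at:
        return "ship"
    if score < shelve_below:
        return "shelve"
    return "refine"


for score in (7.67, 7.0, 5.67, 4.9):
    print(score, triage(score))
```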

How do I score Importance for a brand-new page?

Use projected traffic from comparable pages and the page's role in the funnel. A new landing page tied to a paid campaign with €30k/month spend is high-importance regardless of current sessions. Score the future state, not the current one, but be ready to re-score after two weeks of real data.

Is PIE compatible with multivariate or bandit tests?

PIE prioritises test ideas, not test types. Once an idea is at the top of the queue, the design choice between A/B, multivariate, and bandit comes down to traffic volume and the number of variants you want to compare. PIE tells you what to test; statistics tells you how.
