Opportunity Scoring

Opportunity Scoring is a survey-based research method that ranks customer needs by importance and satisfaction — surfacing under-served jobs that should feed your A/B test backlog.
Opportunity Scoring is a research technique — popularised by Tony Ulwick's Outcome-Driven Innovation and adapted by Sean Ellis for growth teams — that asks customers to rate how important a given outcome is and how satisfied they currently are with it. Outcomes that score high on importance and low on satisfaction are the under-served jobs worth fixing.
It sits upstream of experiment prioritization frameworks like ICE or PIE. Those rank test ideas you already have; Opportunity Scoring generates the ideas in the first place by pointing at where the gap between expectation and reality is widest.
Most CRO backlogs fail in the same way: a long list of clever test ideas with no anchor in customer reality. Teams argue about which variant to ship next, when the real problem is that no one has asked which job the visitor came to do and how badly the store is doing that job today.
Opportunity Scoring closes that gap. You survey recent buyers and abandoners, ask them to rate 10-20 specific outcomes on importance and satisfaction (both on 1-10 scales), then compute an opportunity score for each. High-scoring outcomes become hypotheses for the test backlog. Because they are rooted in stated customer demand, those hypotheses tend to win more often than ideas seeded by gut feel.
Opportunity = Importance + max(Importance − Satisfaction, 0)
Importance (importance score): average rating (1-10) of how important the outcome is to the customer.
Satisfaction (satisfaction score): average rating (1-10) of how well the customer's current solution delivers on that outcome.
Opportunity (opportunity score): composite score on a nominal 0-20 scale; with 1-10 ratings the reachable range is 1 to 19. Scores above 12 signal under-served outcomes worth testing against.
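The scoring step itself is small enough to sketch in a few lines of Python. The snippet below is a minimal illustration, not a reference implementation: the `responses` data, the second outcome statement, and the function names are hypothetical, and it assumes each respondent supplies one (importance, satisfaction) pair per outcome on 1-10 scales.

```python
from statistics import mean

def opportunity_score(importance, satisfaction):
    """Importance + max(Importance - Satisfaction, 0)."""
    return importance + max(importance - satisfaction, 0.0)

# Hypothetical raw data: outcome -> list of (importance, satisfaction) ratings.
responses = {
    "know how a garment will fit before buying": [(9, 5), (10, 4), (9, 5)],
    "find the returns policy quickly": [(6, 8), (7, 7), (5, 9)],
}

def score_outcomes(responses):
    """Average both ratings per outcome, score them, and rank highest first."""
    rows = []
    for outcome, ratings in responses.items():
        avg_importance = mean(r[0] for r in ratings)
        avg_satisfaction = mean(r[1] for r in ratings)
        score = opportunity_score(avg_importance, avg_satisfaction)
        rows.append((outcome, round(score, 2)))
    return sorted(rows, key=lambda row: row[1], reverse=True)

ranked = score_outcomes(responses)
for outcome, score in ranked:
    print(f"{score:5.2f}  {outcome}")
```

Averaging first and scoring second mirrors the definitions above, where importance and satisfaction are average ratings; scoring each respondent and then averaging would give a different number whenever some respondents are over-served.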
A Shopify apparel brand surveys 240 recent shoppers about the outcome 'know how a garment will actually fit before buying'.
Average importance: 9.1
Average satisfaction: 4.8
→ Opportunity score = 9.1 + max(9.1 − 4.8, 0) = 9.1 + 4.3 = 13.4
A score above 12 means this outcome is under-served. The team turns it into three concrete test hypotheses — adding model-height-and-size captions, surfacing user fit reviews on the PDP, and offering a size-recommendation quiz — and queues them for the next experimentation cycle.
The max() in the formula matters: when satisfaction is already at or above importance, the gap term collapses to zero and you're left with the bare importance score. That stops the framework from rewarding outcomes where you're already doing fine — you only get the bonus when there's a genuine gap to close.
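To make the clamp concrete, here is a short Python sketch that scores the apparel example and then an over-served outcome. The second pair of averages (6.0 and 8.5) is made up for illustration.

```python
def opportunity_score(importance, satisfaction):
    # The gap term is clamped at zero, so it can add to the score but never subtract.
    gap = max(importance - satisfaction, 0.0)
    return importance + gap

# The apparel example: importance 9.1, satisfaction 4.8.
fit_score = round(opportunity_score(9.1, 4.8), 1)

# A hypothetical over-served outcome: satisfaction exceeds importance.
over_served = opportunity_score(6.0, 8.5)
unclamped = 6.0 + (6.0 - 8.5)  # what the score would be without max()

print(fit_score, over_served, unclamped)
```

With the clamp the over-served outcome keeps its bare importance of 6.0; without it the score would drop to 3.5, pushing a moderately important outcome below ones customers care about less.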
How to read opportunity scores
| Score range | Interpretation | What to do |
|---|---|---|
| 15 - 20 | Severely under-served | Top-of-backlog hypotheses; high expected lift. |
| 12 - 15 | Under-served | Strong candidates for the next test cycle. |
| 10 - 12 | Appropriately served | Monitor; not worth a dedicated experiment yet. |
| < 10 | Over-served or low-importance | Deprioritise — fixing this won't move the needle. |
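The table can be applied mechanically once scores are computed. The sketch below assumes each band includes its lower bound (so exactly 15 counts as severely under-served and exactly 12 as under-served), a convention the table does not spell out; the function name is hypothetical.

```python
def interpret(score):
    """Map an opportunity score to the interpretation bands in the table."""
    if score >= 15:
        return "Severely under-served"
    if score >= 12:
        return "Under-served"
    if score >= 10:
        return "Appropriately served"
    return "Over-served or low-importance"

print(interpret(13.4))  # the garment-fit example
```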
One word of caution: the score is only as good as the outcome statements you put in front of customers. Phrase them in the customer's language and tie them to a job to be done — 'feel confident the size will fit' beats 'PDP information density' every time. Aim for at least 100-200 responses per segment before you trust the rankings.
Opportunity Scoring FAQ
How is Opportunity Scoring different from ICE or PIE?
ICE and PIE are experiment prioritization frameworks: they rank test ideas you already have by impact, confidence, and ease. Opportunity Scoring is a research method that produces the ideas in the first place by identifying under-served customer outcomes. The two are complementary: Opportunity Scoring feeds the backlog, ICE orders it.
Where does the Opportunity Scoring formula come from?
It originates from Tony Ulwick's Outcome-Driven Innovation work in the early 2000s and was popularised in growth circles by Sean Ellis. The Importance + max(Importance − Satisfaction, 0) structure rewards high-importance outcomes, adds a bonus only where satisfaction falls short of importance, and applies no penalty when satisfaction already meets or exceeds it.
How many survey responses do I need?
Aim for at least 100-200 responses per meaningful segment (new vs returning buyer, mobile vs desktop). Below 100 the segment averages get noisy and you risk chasing artefacts. If you only have one undifferentiated audience, 200-300 total is a safe minimum.
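A quick guard against thin segments is to count responses per segment before ranking anything. This is a minimal sketch: the segment labels and counts are invented, and the 100-response floor is simply the lower bound quoted above.

```python
from collections import Counter

MIN_PER_SEGMENT = 100  # lower bound from the guidance above

def thin_segments(segment_labels, minimum=MIN_PER_SEGMENT):
    """Return the segments (and their counts) that fall below the minimum."""
    counts = Counter(segment_labels)
    return {segment: n for segment, n in counts.items() if n < minimum}

# Hypothetical survey: one segment label per respondent.
labels = ["new"] * 140 + ["returning"] * 62
thin = thin_segments(labels)
print(thin)
```

Any segment this returns is better merged into a broader group or reported with a caveat than ranked on its own.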
Should I survey buyers or abandoners?
Both, and separately. Buyers tell you which outcomes your current funnel already delivers on; abandoners tell you which gaps are large enough to lose the sale. The most actionable opportunities usually show up as low-satisfaction outcomes among abandoners with high importance among buyers.
How should I write the outcome statements?
Phrase them as customer jobs, not site features. 'Know how the garment will fit before I buy' is an outcome; 'see size chart' is a feature. Aim for 10-20 outcome statements covering discovery, evaluation, checkout, and post-purchase. Pull candidate outcomes from session recordings, support tickets, and review mining.
Can I run Opportunity Scoring with the survey tools I already use?
Yes. A simple on-site survey or post-purchase email with two rating scales per outcome works. Typeform, Hotjar surveys, and Klaviyo all handle it. The scoring itself is a spreadsheet calculation, so no specialised tool is required to compute the result.
How often should I run an Opportunity Scoring survey?
Once or twice a year for the overall store, or after a major redesign, catalogue expansion, or pricing change. Outcomes shift slowly, so running it monthly produces noise rather than signal. Use shorter pulse surveys in between to track whether satisfaction on your top opportunities is moving.
What if almost every outcome gets a high score?
That usually means your importance scale is anchored too high: respondents are rating everything 9 or 10. Tighten the outcome list, reword for specificity, or force-rank by asking customers to allocate 100 points across outcomes. The point is to discriminate, not to confirm everything matters.
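One way to spot this ceiling effect before trusting the rankings is to measure how many importance ratings sit at 9 or 10 for each outcome. The sketch below is a rough diagnostic under stated assumptions: the 80% threshold is an arbitrary illustrative cut-off, and the function names and sample ratings are hypothetical.

```python
def ceiling_share(importance_ratings):
    """Fraction of importance ratings that are 9 or 10."""
    at_ceiling = sum(1 for rating in importance_ratings if rating >= 9)
    return at_ceiling / len(importance_ratings)

def importance_is_compressed(ratings_by_outcome, share_threshold=0.8):
    """True when every outcome has at least `share_threshold` of ratings at 9-10."""
    return all(
        ceiling_share(ratings) >= share_threshold
        for ratings in ratings_by_outcome.values()
    )

# Hypothetical importance ratings per outcome.
compressed = importance_is_compressed({
    "outcome A": [9, 10, 10, 9, 10],
    "outcome B": [10, 9, 9, 10, 8],
})
spread_out = importance_is_compressed({
    "outcome A": [9, 10, 10, 9, 10],
    "outcome C": [5, 6, 7, 4, 9],
})
print(compressed, spread_out)
```

If the check returns True, the survey is not discriminating between outcomes and the remedies above (fewer, sharper statements or a forced allocation) apply.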
How does Opportunity Scoring fit into experiment prioritization?
Opportunity Scoring sits at the discovery stage of experiment prioritization. Top-scoring outcomes generate hypotheses, those hypotheses get scored with ICE or PIE for sequencing, then the winners enter your test calendar. Skipping the discovery step is why so many backlogs collapse into recency bias and HiPPO requests.
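The hand-off from discovery to sequencing can be sketched as a filter followed by a sort. Everything in this example is an assumption made for illustration: the impact, confidence, and ease ratings are invented, the third hypothesis and its score are hypothetical, the 12-point cut-off is the under-served threshold used earlier, and averaging the three ICE ratings is one common convention rather than the only one.

```python
# (hypothesis, opportunity score of the parent outcome, impact, confidence, ease)
hypotheses = [
    ("size-recommendation quiz", 13.4, 8, 6, 3),
    ("model height-and-size captions", 13.4, 6, 7, 9),
    ("reorder footer links", 8.2, 3, 5, 9),
]

def ice(impact, confidence, ease):
    """ICE as the plain average of three 1-10 ratings (an assumed convention)."""
    return (impact + confidence + ease) / 3

# Discovery: keep only hypotheses tied to under-served outcomes (score above 12).
candidates = [h for h in hypotheses if h[1] > 12]

# Sequencing: order the survivors by ICE, highest first.
queue = sorted(candidates, key=lambda h: ice(h[2], h[3], h[4]), reverse=True)

for name, opp, impact, confidence, ease in queue:
    print(f"{ice(impact, confidence, ease):.2f}  {name}")
```

The footer-link idea scores well on ease but never reaches the queue, because its parent outcome is not under-served.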
Does Opportunity Scoring work for low-traffic stores?
Yes, and arguably better, because low-traffic stores rarely reach statistical significance on A/B tests anyway. Using Opportunity Scoring to pick a small number of high-conviction changes, then shipping them as non-tested improvements, is a sensible strategy under roughly 50k monthly sessions.