How to use Device Analysis

Metricuno

May 17, 2026


How to split A/B test results by desktop, mobile, and tablet: when device analysis matters, what benchmarks to expect, and how to act on the findings.

Quick answer

A practical guide to splitting A/B test results by device — why mobile and desktop populations behave differently, and how to read the split without fooling yourself with noise.

Definition

Experiment Analysis

Device Analysis

Splitting A/B test results by desktop, mobile, and tablet to see how each device population responds to a variant.

Device analysis is the practice of breaking an experiment's results into desktop, mobile, and tablet segments instead of reading the aggregate lift. The three populations differ in intent, screen ergonomics, payment friction, and baseline conversion rate, so a winning aggregate result often hides a losing mobile experience — or vice versa.

It is the most consequential split in most online-retail tests. Mobile typically drives 60-75% of sessions but converts at half the rate of desktop, which means a small mobile regression can quietly cost more revenue than a strong desktop win recovers. Reading the device split is how you catch that before you ship.

Also known as

device segmentation

device-level analysis

device split

When you run an A/B test on a Shopify or WooCommerce store, the testing tool typically reports one aggregate conversion rate per variant. That number is a weighted average across whoever happened to visit during the test, usually something like a 70/30 mobile-desktop mix, and a weighted average can move for the wrong reasons.

A variant that lifts desktop 8% and drops mobile 3% can look like a +1.5% winner in aggregate. Ship it, and you lose money on the majority of your traffic. Device analysis is how you separate those signals before the decision is final.
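The arithmetic behind that masking effect can be sketched in a few lines. This is a minimal illustration, not the article's own calculation: the function name is hypothetical, and the inputs assume a 70/30 session mix with 1.8% and 3.6% baseline conversion rates as example values.

```python
def aggregate_lift(segments):
    """Relative lift in the blended conversion rate.

    segments maps a device name to (session_share, baseline_cr, relative_lift).
    """
    control = sum(share * cr for share, cr, _ in segments.values())
    variant = sum(share * cr * (1 + lift) for share, cr, lift in segments.values())
    return variant / control - 1


# Assumed inputs: 70/30 session mix, 1.8% mobile and 3.6% desktop baselines,
# a 3% relative drop on mobile and an 8% relative lift on desktop.
mix_70_30 = {
    "mobile": (0.70, 0.018, -0.03),
    "desktop": (0.30, 0.036, 0.08),
}
print(f"{aggregate_lift(mix_70_30):+.1%}")  # prints +2.1%
```

Under these assumptions the blended result is about +2.1%; shifting the mix to 75/25 gives about +1.4%. Either way the aggregate reads as a small win while the larger traffic segment is losing.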

Why device populations behave differently

The first reason is intent. Mobile traffic skews toward discovery and social-referred browsing — short sessions, comparison shopping, abandoned carts that get rescued on desktop later. Desktop traffic skews toward checkout completion, repeat purchasing, and higher-AOV orders. The two devices are not the same funnel sampled twice.

The second is ergonomics. A product image that reads well on a 1440px screen becomes a tap target on a 390px screen. A sticky add-to-cart bar that frees desktop real estate covers half the mobile viewport. Layout changes are rarely neutral across devices — they trade one population's experience for another's.

The third is payment. Apple Pay, Google Pay, and Shop Pay collapse mobile checkout to two taps; on desktop the same flow can require manual card entry. Anything that touches checkout — express-pay placement, form fields, shipping selector — will produce a different lift on mobile than on desktop, almost by default.

Aggregate winners can be mobile losers

Roughly one in four of the tests we see shipped as aggregate 'winners' is actually flat or negative on mobile. Because mobile is usually the larger traffic slice, the revenue impact of a missed mobile regression often exceeds the recorded desktop win.

How to set up a clean device split

Pre-register the split before the test starts. Decide at the planning stage that you will read desktop and mobile separately and what each one needs to do to pass — for example, mobile must be non-inferior (no worse than -1% with 95% confidence) and desktop must show a positive directional lift. Pre-registering keeps you honest later.
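A pre-registered mobile criterion like the one above could be checked roughly as follows. This is a sketch under stated assumptions: the -1% margin is read as a relative change, the interval uses a normal approximation, and the function names and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def relative_lift_lower_bound(ctrl_orders, ctrl_sessions, var_orders, var_sessions,
                              confidence=0.95):
    """One-sided lower bound on relative lift (normal approximation).

    Simplification: the bound on the absolute difference is divided by the
    observed control rate, ignoring the uncertainty in that rate.
    """
    p_c = ctrl_orders / ctrl_sessions
    p_v = var_orders / var_sessions
    se = sqrt(p_c * (1 - p_c) / ctrl_sessions + p_v * (1 - p_v) / var_sessions)
    z = NormalDist().inv_cdf(confidence)
    return ((p_v - p_c) - z * se) / p_c


def mobile_passes(lower_bound, margin=-0.01):
    """Non-inferiority: the lower bound must sit above the pre-registered margin."""
    return lower_bound > margin


# Made-up counts: 1.80% control vs 1.85% variant, 100,000 sessions per arm.
bound = relative_lift_lower_bound(1800, 100_000, 1850, 100_000)
print(f"lower bound {bound:+.1%}, passes: {mobile_passes(bound)}")
```

With these made-up counts the bound lands near -2.7%, so mobile does not clear a -1% margin even though its point estimate is positive. A margin that tight needs a lot of traffic, which is worth knowing at the planning stage rather than at readout.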

Power each segment, not just the aggregate. If your mobile baseline is 1.8% and your desktop baseline is 3.6%, the two segments need different sample sizes to detect the same relative lift. Most tests are sized for the aggregate, which underpowers the smaller segment (usually desktop) and produces results that look 'inconclusive' when the segment simply has not seen enough traffic.
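To see how far apart the per-segment requirements are, here is a rough sizing sketch using the textbook two-proportion formula. The 5% two-sided significance level, 80% power, and 10% target lift are assumptions chosen for illustration, and `sessions_per_arm` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def sessions_per_arm(baseline_cr, relative_lift, alpha=0.05, power=0.80):
    """Approximate sessions needed per arm to detect a relative lift."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Baselines from the example above; the 10% target lift is an assumption.
mobile_n = sessions_per_arm(0.018, 0.10)
desktop_n = sessions_per_arm(0.036, 0.10)
print(f"mobile: {mobile_n:,} per arm, desktop: {desktop_n:,} per arm")
```

Under these assumptions mobile needs roughly twice as many sessions per arm as desktop. Whether mobile or desktop is the bottleneck then depends on how each requirement compares with that segment's actual traffic share.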

Chart

Typical conversion-rate baseline by device (apparel & beauty stores)

Tablet is the awkward third category. It rarely has enough traffic to power its own read, and on most stores it behaves closer to desktop than to mobile. A defensible default is to roll tablet into desktop for the read and flag it separately only if it represents more than 5% of sessions.
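The tablet roll-up and the per-device read can be combined in one small routine. This is only a sketch: the row layout, the function name, and the sample counts are invented for illustration, and it assumes every segment has both a "control" and a "variant" row.

```python
from collections import defaultdict

TABLET_SHARE_THRESHOLD = 0.05  # the 5% rule of thumb described above


def read_by_device(rows):
    """rows: iterable of (device, variant, sessions, orders) tuples.

    Returns {segment: relative lift of 'variant' over 'control'}.
    """
    rows = list(rows)
    total = sum(sessions for _, _, sessions, _ in rows)
    tablet = sum(sessions for device, _, sessions, _ in rows if device == "tablet")
    merge_tablet = tablet / total <= TABLET_SHARE_THRESHOLD

    counts = defaultdict(lambda: [0, 0])  # (segment, variant) -> [sessions, orders]
    for device, variant, sessions, orders in rows:
        segment = "desktop" if device == "tablet" and merge_tablet else device
        counts[(segment, variant)][0] += sessions
        counts[(segment, variant)][1] += orders

    lifts = {}
    for segment in {seg for seg, _ in counts}:
        c_sessions, c_orders = counts[(segment, "control")]
        v_sessions, v_orders = counts[(segment, "variant")]
        lifts[segment] = (v_orders / v_sessions) / (c_orders / c_sessions) - 1
    return lifts


# Made-up test data: tablet is 4% of sessions, so it is folded into desktop.
rows = [
    ("mobile", "control", 34_000, 612), ("mobile", "variant", 34_000, 594),
    ("desktop", "control", 14_000, 504), ("desktop", "variant", 14_000, 544),
    ("tablet", "control", 2_000, 48), ("tablet", "variant", 2_000, 50),
]
for segment, lift in sorted(read_by_device(rows).items()):
    print(f"{segment}: {lift:+.1%}")
```

The point estimates alone are not a verdict; each segment still needs its own significance or non-inferiority check against the criteria set before the test.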

What the numbers usually look like

The table below shows typical baselines across the splits worth tracking on a mid-sized Shopify store. The two patterns to internalise: mobile traffic share is 2-3x desktop, but desktop revenue share is often comparable or higher because of the AOV and conversion-rate gap.

Use these as orientation, not targets. Your own store's numbers will shift with vertical, paid-traffic mix, and how aggressively you promote express-pay options. The point is the relative shape: mobile is the volume engine, desktop is the conversion engine.

Benchmark

Typical device-level metrics for €1M-€15M Shopify stores

Metric	Mobile	Tablet	Desktop
Session share	68%	4%	28%
Conversion rate	1.8%	2.4%	3.6%
Average order value	€62	€71	€84
Revenue share	52%	4%	44%
Add-to-cart rate	8.2%	9.0%	11.5%
Checkout completion	62%	70%	78%

Notice the checkout-completion gap. A mobile-versus-desktop conversion delta usually opens up at checkout, not on the product page — which is where most teams over-invest test effort. If your device split keeps showing mobile drag, the highest-leverage place to test is the checkout flow itself, not the PDP.
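One way to make the "volume engine versus conversion engine" shape concrete is revenue per session, which is simply conversion rate times average order value. The snippet below is an illustrative calculation using the conversion-rate and AOV rows of the benchmark table above; the variable names are arbitrary.

```python
# Conversion rate and average order value copied from the benchmark table.
benchmarks = {
    "mobile": {"cr": 0.018, "aov": 62},
    "tablet": {"cr": 0.024, "aov": 71},
    "desktop": {"cr": 0.036, "aov": 84},
}

# Revenue per session = conversion rate x average order value.
revenue_per_session = {
    device: m["cr"] * m["aov"] for device, m in benchmarks.items()
}

for device, rps in revenue_per_session.items():
    print(f"{device}: €{rps:.2f} per session")
```

On these benchmark values a desktop session is worth about 2.7 times a mobile session, which is why a smaller desktop audience can still carry a comparable share of revenue.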

Acting on what the split tells you

A clean device-segmented experiment analysis produces one of four verdicts: wins on both, wins on one and flat on the other, wins on one and loses on the other, or flat-to-negative on both. Each has a different shipping rule, and getting the rule right is what separates compounding test programs from ones that stall.

Wins-on-both is rare and obvious: ship it. Wins-on-one-flat-on-other is usually shippable if the win is on your larger revenue segment, but worth a quick follow-up test on the flat segment. Wins-and-loses is the hard one: ship a device-targeted version if your platform supports it, or kill the change and re-test with a mobile-specific hypothesis. Flat-to-negative on both is a no-ship: keep control and log the result.
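The shipping rules can be written down as a small decision function, which is a handy way to pre-register them. This is a sketch rather than a prescribed rulebook: the function name and return strings are hypothetical, the no-ship outcome for flat-to-negative on both is an assumption, and so is the device-only fallback when the win is on the smaller revenue segment.

```python
def shipping_decision(desktop, mobile, larger_revenue_segment="mobile"):
    """Map two segment reads ('win', 'flat', or 'loss') to a shipping rule."""
    reads = {"desktop": desktop, "mobile": mobile}
    winners = [device for device, read in reads.items() if read == "win"]
    losers = [device for device, read in reads.items() if read == "loss"]

    if len(winners) == 2:
        return "ship to all devices"
    if winners and losers:
        return (f"ship {winners[0]}-only if supported, "
                f"otherwise kill and re-test for {losers[0]}")
    if winners:
        other = "mobile" if winners[0] == "desktop" else "desktop"
        if winners[0] == larger_revenue_segment:
            return f"ship, then run a follow-up test on {other}"
        # Assumption: a win on the smaller segment ships to that segment only.
        return f"ship {winners[0]}-only, then run a follow-up test on {other}"
    # Assumption: flat-to-negative on both means keep control.
    return "do not ship"


print(shipping_decision("win", "loss"))
```

Agreeing on a table like this before the test starts removes most of the room for post-hoc rationalisation when the read comes in.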

Device-targeted shipping is allowed

If a variant wins on desktop and loses on mobile, the right move is often to ship desktop-only and leave mobile on control while you design a mobile-native variant. This is not cherry-picking; it is honouring the segmented result. Most CRO platforms support device targeting, and Shopify and WooCommerce storefronts can usually render device-conditionally through theme code.

Frequently asked

Device analysis FAQ

When should I split a test by device versus reading the aggregate?

Always read the split, then read the aggregate as a sanity check. The aggregate is a weighted average across populations that behave differently, so it can hide a regression on the larger segment. Device analysis takes minutes to add and routinely changes the shipping decision.

Do I need to power each device segment separately?

Yes, if you want to make device-level shipping decisions. Calculate sample size for each segment using its own baseline conversion rate and traffic share. Most tests sized only on the aggregate end up under-powered on the smaller device segment.

What if my tablet traffic is too small to read?

Roll tablet into desktop for the analysis and note it in the test write-up. Tablet behaves closer to desktop than to mobile on most stores, and below 5% of sessions you will not get a statistically meaningful read on it within a normal test window.

How is device analysis different from broader experiment analysis?

Experiment analysis is the umbrella: significance, lift, segment reads, novelty effects, the full picture. Device analysis is one specific segment cut within it, and usually the highest-value one for online-retail tests because mobile and desktop populations diverge so much.

Can I ship a variant to desktop only if it loses on mobile?

Yes. Device-conditional shipping is standard practice when the segmented result is clearly split. The alternative, shipping universally and accepting the mobile regression, is worse, because mobile is usually the larger revenue segment.

Does iOS versus Android matter as a further split?

Rarely for layout tests, occasionally for payment tests (Apple Pay versus Google Pay availability changes checkout behaviour). Most stores do not have enough traffic on either OS alone to power a confident read, so default to a unified mobile segment unless your hypothesis is payment-specific.

How long should a device-segmented test run?

Long enough for the segment that fills slowest to hit its required sample size. That can be mobile, because its lower conversion rate demands more sessions, or desktop, because its traffic share is smaller. In practice this is 2-4 weeks for most stores, and you should run for full weekly cycles to absorb weekday-weekend mix shifts.

What is the most common mistake teams make with device analysis?

Reading the split only after the aggregate shows a win, to confirm the decision. That introduces confirmation bias. Pre-register that you will read the split regardless of the aggregate result, and define each segment's pass criteria before the test starts.

Should I trust a 15% lift on desktop if mobile is flat?

Probably yes, if the desktop segment is properly powered and the mobile read is genuinely flat (not negative with wide confidence intervals). Ship desktop-only and use the mobile flat result as the brief for the next mobile-specific test.

Does device analysis matter for traffic-quality tests like paid acquisition?

Yes, and arguably more. Paid mobile traffic from Meta or TikTok behaves very differently from paid desktop traffic from Google Search, so landing-page tests on those audiences almost always need a device read. Skipping it conflates channel-quality differences with creative performance.