Which ecommerce brands survive the AI shopping journey?
Single-turn citation rate isn't the same as business impact: a brand can be invisible in the first answer but show up later in the shopping conversation and still drive a sale. To close that gap, we built a four-stage shopping-journey study: discovery, filter, compare, purchase. For each of 40 ecommerce brands across 12 categories, we measured whether it survived all the way to the buy-now recommendation. Then we used the results to tune our scoring algorithm against an empirical commercial outcome instead of bare citation rate.
Key findings
Average survival rate is 36.9%. Across all 40 brands, each tested at 4 journey stages on 3 AI platforms, the typical brand was cited in 36.9% of those (stage × platform) opportunities. 27.5% of brands (11 of 40) survived every stage, meaning at least one citation at each stage.
Citation rate decays as the user narrows. Discovery-stage queries pulled the most citations; purchase-intent queries pulled the fewest. Plenty of brands surface for "what are the best running shoes" but don't survive narrowing to "recommend running shoes for flat feet under $150" — exactly the kind of gap that separates presence from purchase impact.
The pillars that predict citation in informational queries don't predict survival in shopping queries — several of them point the wrong way. Pages high on Topic Authority, Answerability, and Multimedia tend to be recommended less in AI shopping flows than cleaner utility pages. We shipped a new Recommendation Readiness Score calibrated against this dataset for ecommerce-detected pages.
Citation rate decays through the funnel
Each stage asks a more specific question than the last. The narrower the query, the fewer brands AI cites — and the gap between discovery and purchase intent is where most brands lose the recommendation.
Strength decays even faster than presence
The "did you get cited?" line decays through the funnel — but the "how strongly are you being recommended?" line decays even faster. Brands that appear at discovery often get downgraded to neutral mentions or dropped entirely by the purchase stage.
What kind of mention each stage actually produces
Citations fall on a spectrum: actively recommended, neutrally listed alongside competitors, or not mentioned at all. As queries narrow toward purchase intent, the "actively recommended" share shrinks and the "not mentioned" share grows. That's where brands lose recommendation strength.
Negative mentions were rare across all stages — AI assistants tend to omit brands rather than disparage them.
Which pillars predict survival?
We tested all 12 GEO pillars against per-brand survival rate. The directional pattern is the headline finding.
Pillars where more was worse
Pages high on Topic Authority, Answerability, and Multimedia tended to be recommended less in AI shopping flows. Structure and E-E-A-T pointed the same direction with weaker effect.
Pillars where direction was flat or positive
Readability and Question Coverage showed effectively no relationship with survival. AI Accessibility was the only pillar with a positive direction, but with most brands already scoring near max, the signal was thin.
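One way to check a direction like this is a rank correlation between a score and per-brand survival. The sketch below is an assumption about method, since the article does not say which statistic was used, and per-pillar scores are not published here, so the overall GEO score from the brand table stands in for a pillar. The five brands are the ones named in the discussion and were picked for illustration only, so the number it produces is not the study's result.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# (brand, overall GEO score, survival %) copied from the brand table.
# Overall GEO is a stand-in for a pillar score, which the article does not list.
sample = [
    ("chewy.com", 14, 75.0),
    ("hoka.com", 25, 66.7),
    ("helixsleep.com", 20, 50.0),
    ("allbirds.com", 112, 0.0),
    ("casper.com", 117, 0.0),
]
rho = spearman([row[1] for row in sample], [row[2] for row in sample])
print(f"Spearman rho on the illustrative sample: {rho:.3f}")
```

A negative value means higher scores go with lower survival in the sample. With only a handful of brands the estimate is very noisy, which is the same caution the study applies to individual brand results.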
A discovery: the signals invert in shopping contexts
The most interesting thing this study uncovered isn't a tuning tweak — it's that the content signals AI rewards in informational queries do the opposite work in shopping queries. The same pillars that predict citation for "what is X" predict against recommendation for "recommend me X."
In informational AI search, high-authority and high-answerability pages get cited. Long-form, expert-reviewed content is the natural source for "what is high blood pressure" or "what is CRM software." Our earlier research confirmed this and built the Citation Readiness Score around it.
In AI shopping flows, that pattern flips. The 40 brands we tested showed that pages high on Topic Authority, Answerability, and Multimedia were recommended less often, not more, as users moved from discovery through to purchase intent. Heavily-SEO-optimized DTC landing pages — dense answer copy, expert bylines, video carousels, FAQ accordions — tended to lose to cleaner utility-style product pages from category-fit brands.
The pattern was consistent enough across 12 categories and 4 journey stages to be a finding, not a quirk of any single brand. Whatever AI shopping assistants are doing, they're doing something different from what AI research assistants are doing.
Product note: we've added a Recommendation Readiness Score to our scan results that captures this finding — it surfaces automatically when a page is detected as transactional, alongside the existing Citation Readiness Score for informational content. The weights live behind the API; this page presents the discovery, not the implementation.
Why the inverse direction?
The brands that survived the journey were not the high-GEO brands. Casper, Allbirds, Athletic Greens, and Glossier (GEO 117, 112, 108, and 82), all running heavily SEO-optimized DTC landing pages, scored zero across all stages. Meanwhile Chewy, Hoka, and Helix (GEO 14, 25, and 20) dominated their categories.
Our best read of why: the high-GEO pages in this set are textbook conversion-optimized landing pages — long hero, social proof bands, FAQ accordions, expert-reviewer bylines, the works. AI shopping flows appear to skip past those in favor of mentioning the brand by name and (sometimes) linking to simpler product or category pages. Heavily-answerable pages also tend to read like SEO bait, which the platforms may be down-ranking in shopping contexts.
The honest caveats: brand recognition is a huge confound we can't measure directly — Chewy survives partly because Chewy is the obvious answer for pet supplies, not because of anything on chewy.com. Some of our high-GEO brands have drifted from their original positioning (Allbirds no longer makes serious running shoes), so the AI is correctly not recommending them — that's category-fit, not page quality. And with 40 brands, individual results have meaningful uncertainty; read the overall pattern, not any single brand's number.
Despite those caveats, three independent rounds of our research (the original citation study, the E-E-A-T follow-up, and this ecommerce study) all point at the same thing: a high GEO Authority/E-E-A-T pillar score is not a reliable predictor of AI citation or recommendation. For ecommerce specifically, it appears to anti-correlate. That's a strong enough pattern to bake into the algorithm even with the caveats.
Survival by category
| Category | N brands | Avg survival | Reached purchase | Avg GEO |
|---|---|---|---|---|
| coffee subscription | 3 | 58.3% | 66.7% | 75.7 |
| meal kits | 4 | 56.3% | 50% | 78.8 |
| razors | 3 | 52.8% | 33.3% | 71.3 |
| pet supplies | 2 | 50% | 100% | 64.5 |
| minimalist skincare | 4 | 43.8% | 75% | 88 |
| bedsheets | 3 | 38.9% | 66.7% | 68.7 |
| running shoes | 4 | 33.3% | 25% | 75.8 |
| sustainable apparel | 4 | 33.3% | 75% | 80.5 |
| memory foam mattresses | 5 | 23.3% | 0% | 79.6 |
| dtc furniture | 4 | 20.8% | 25% | 60.3 |
| wellness supplements | 3 | 19.4% | 33.3% | 95 |
| eyewear | 1 | 0% | 0% | 63 |
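The category table can be derived from the per-brand table. The sketch below shows one plausible roll-up, assuming "Reached purchase" means the share of brands in a category with a nonzero purchase-stage citation rate. That definition is inferred from the numbers, not stated in the article, and the function name is hypothetical. The rows are copied from the brand table for two categories.

```python
from collections import defaultdict


def category_rollup(rows):
    """rows: (brand, category, survival_pct, purchase_stage_pct) tuples."""
    groups = defaultdict(list)
    for _brand, category, survival, purchase in rows:
        groups[category].append((survival, purchase))

    summary = {}
    for category, values in groups.items():
        n = len(values)
        reached = sum(1 for _survival, purchase in values if purchase > 0)
        summary[category] = {
            "n_brands": n,
            "avg_survival": round(sum(s for s, _p in values) / n, 1),
            # Assumed definition: share of brands cited at least once at purchase.
            "reached_purchase": round(100 * reached / n, 1),
        }
    return summary


rows = [
    ("atlascoffeeclub.com", "coffee subscription", 100.0, 100.0),
    ("drinktrade.com", "coffee subscription", 75.0, 66.67),
    ("bluebottlecoffee.com", "coffee subscription", 0.0, 0.0),
    ("chewy.com", "pet supplies", 75.0, 100.0),
    ("bark.co", "pet supplies", 25.0, 100.0),
]
summary = category_rollup(rows)
print(summary["coffee subscription"])
print(summary["pet supplies"])
```

For these two categories the roll-up reproduces the table rows (58.3% and 66.7% for coffee subscription, 50% and 100% for pet supplies), which is why the inferred definition seems reasonable.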
Beyond binary: recommendation strength
Being mentioned isn't the same as being recommended — and being recommended isn't the same as being the top pick. We went back to the 475 AI responses we stored during this study and ran a follow-up classifier on each one to score how strongly each brand was actually being recommended, not just whether its name appeared.
Each AI response is now tagged with: how the brand was mentioned (actively recommended, neutrally listed, or negative), where in the answer it appeared, whether it landed in a top-3 list, and whether the AI included a buy/comparison link. Those dimensions combine into a 0-100 Recommendation Strength Score — a much closer proxy for "would-this-drive-a-sale" than the raw cited/not-cited signal.
What this lets us see that survival rate didn't: brands with similar survival rates can have very different strength scores. A brand named neutrally in a long list of competitors gets the same "cited" credit as a brand actively recommended at position #1 — but only one of those is going to drive a purchase. The strength score separates them.
The score is still an inferred proxy, not measured purchase data. We can't directly observe whether an AI recommendation translated to a sale. But it's the closest we can get using only the response data we have.
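To make the idea of combining those dimensions concrete, here is a minimal sketch of a 0-100 score built from mention type, position, top-3 placement, and link presence. Every weight, field name, and rule below is hypothetical: the article does not publish the formula, so this shows only the general shape of such a score, not the one used in the study.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    kind: str        # "recommended", "neutral", "negative", or "absent"
    position: float  # 0.0 = start of the answer, 1.0 = end
    in_top3: bool    # brand appeared in a top-3 ranked list
    has_link: bool   # answer included a buy/comparison link


# Hypothetical weights, chosen only so the maximum adds up to 100.
KIND_POINTS = {"recommended": 50, "neutral": 20}
POSITION_POINTS = 20
TOP3_POINTS = 20
LINK_POINTS = 10


def strength(mention):
    """Illustrative 0-100 strength score for one AI response."""
    if mention.kind not in KIND_POINTS:
        return 0.0  # absent or negative mentions earn nothing in this sketch
    score = KIND_POINTS[mention.kind]
    score += POSITION_POINTS * (1.0 - mention.position)  # earlier is better
    score += TOP3_POINTS if mention.in_top3 else 0
    score += LINK_POINTS if mention.has_link else 0
    return float(score)


top_pick = Mention("recommended", 0.0, True, True)
listed = Mention("neutral", 0.5, False, False)
print(strength(top_pick), strength(listed))
```

Both example responses would count as "cited" in the survival rate, but under these made-up weights they score very differently, which is the distinction the strength score is meant to capture.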
Survival rate vs recommendation strength, per brand
Each dot is one brand. The diagonal is the perfect-agreement line: brands above it are recommended more strongly than their survival rate alone would suggest; brands below are cited a lot but rarely as a top pick.
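The above/below reading can also be done directly on the brand table. The sketch below compares the two columns as if they shared a 0-100 scale, which is what the diagonal implies; the `tolerance` band and the function name are hypothetical conveniences, not part of the study.

```python
def diagonal_side(survival_pct, strength_score, tolerance=5.0):
    """Classify a brand relative to the survival = strength diagonal."""
    gap = strength_score - survival_pct
    if gap > tolerance:
        return "above"  # recommended more strongly than survival suggests
    if gap < -tolerance:
        return "below"  # cited often, but weakly
    return "near"


# (survival %, strength) pairs copied from the brand table.
brands = {
    "harrys.com": (41.7, 50.3),
    "gillette.com": (33.3, 7.4),
    "ritual.com": (25.0, 25.6),
}
sides = {name: diagonal_side(*pair) for name, pair in brands.items()}
print(sides)
```

On these rows Harry's lands above the line and Gillette well below it, even though their survival rates are close.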
Recommendation behaviour by category
Survival (how often a brand was cited) and strength (how strongly when cited) ranked by category. Coffee subscriptions came in highest on both; supplements and apparel verticals were the toughest categories for any single brand to dominate.
Distribution & platform agreement
Two views of how concentrated AI shopping recommendations are: how strength scores distribute across the brand set, and how often all three platforms reached the same verdict.
Brands by strength bucket
Most brands cluster at the low end. No brand in our study reached a strength score above 80 — strong recommendation is rare even for well-known brands.
Cross-platform agreement
Of 158 (brand × stage) pairs, the three platforms reached the same verdict (yes or no) in [value missing] of cases. Platform-by-platform optimization is rarely necessary.
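An agreement figure of this kind can be computed by counting the (brand × stage) pairs where all platforms gave the same yes/no verdict. The sketch below assumes that definition; the function name and input layout are hypothetical. The example uses the dollarshaveclub.com row from the brand table (3, 2, 3, and 2 of 3 platforms citing); which platform dissented is not reported, so the position of each `False` is arbitrary.

```python
def agreement_rate(verdicts):
    """verdicts: {(brand, stage): [cited_on_platform, ...]} with one bool per platform."""
    if not verdicts:
        raise ValueError("no (brand, stage) pairs to compare")
    unanimous = sum(1 for flags in verdicts.values() if len(set(flags)) == 1)
    return unanimous / len(verdicts)


example = {
    ("dollarshaveclub.com", "discovery"): [True, True, True],
    ("dollarshaveclub.com", "filter"): [True, True, False],
    ("dollarshaveclub.com", "compare"): [True, True, True],
    ("dollarshaveclub.com", "purchase"): [True, True, False],
}
rate = agreement_rate(example)
print(f"{rate:.0%} of pairs were unanimous")
```

Unanimous "no" counts as agreement too, so brands that no platform cites raise the rate as much as brands that every platform cites.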
All 40 brands
Strength is the 0–100 follow-up classifier score; Top-3% is the fraction of checks where the brand landed in a top-3 ranked list.
| Brand | Category | GEO | Discovery | Filter | Compare | Purchase | Survival | Strength | Top-3% |
|---|---|---|---|---|---|---|---|---|---|
| atlascoffeeclub.com | coffee subscription | 77 | 100% | 100% | 100% | 100% | 100% | 77.1 | 90% |
| dollarshaveclub.com | razors | 74 | 100% | 66.67% | 100% | 66.67% | 83.3% | 52.3 | 58.3% |
| cerave.com | minimalist skincare | 100 | 100% | 33.33% | 100% | 66.67% | 75% | 54.7 | 58.3% |
| chewy.com | pet supplies | 14 | 100% | 33.33% | 66.67% | 100% | 75% | 48.3 | 41.7% |
| drinktrade.com | coffee subscription | 93 | 100% | 33.33% | 100% | 66.67% | 75% | 63.3 | 58.3% |
| hellofresh.com | meal kits | 61 | 66.67% | 100% | 33.33% | 100% | 75% | 60.3 | 58.3% |
| homechef.com | meal kits | 106 | 100% | 33.33% | 66.67% | 66.67% | 66.7% | 33 | 25% |
| hoka.com | running shoes | 25 | 100% | 33.33% | 66.67% | 66.67% | 66.7% | 36.8 | 41.7% |
| saucony.com | running shoes | 97 | 66.67% | 100% | 100% | 0% | 66.7% | 34.1 | 25% |
| burrow.com | dtc furniture | 55 | 100% | 33.33% | 100% | 0% | 58.3% | 42.9 | 33.3% |
| brooklinen.com | bedsheets | 59 | 0% | 100% | 100% | 33.33% | 58.3% | 42.3 | 33.3% |
| bollandbranch.com | bedsheets | 62 | 66.67% | 33.33% | 100% | 33.33% | 58.3% | 36.3 | 16.7% |
| helixsleep.com | memory foam mattresses | 20 | 100% | 33.33% | 66.67% | 0% | 50% | 40.3 | 50% |
| everlane.com | sustainable apparel | 77 | 0% | 0% | 100% | 100% | 50% | 26.7 | 25% |
| cetaphil.com | minimalist skincare | 83 | 100% | 0% | 66.67% | 33.33% | 50% | 31.3 | 33.3% |
| theordinary.com | minimalist skincare | 87 | 33.33% | 33.33% | 100% | 33.33% | 50% | 27 | 8.3% |
| sunbasket.com | meal kits | 84 | 100% | 33.33% | 66.67% | 0% | 50% | 17.8 | 8.3% |
| saatva.com | memory foam mattresses | 91 | 100% | 33.33% | 33.33% | 0% | 41.7% | 19.7 | 8.3% |
| patagonia.com | sustainable apparel | 80 | 100% | 0% | 33.33% | 33.33% | 41.7% | 19.9 | 25% |
| thereformation.com | sustainable apparel | 78 | 33.33% | 66.67% | 33.33% | 33.33% | 41.7% | 13.2 | 8.3% |
| harrys.com | razors | 53 | 66.67% | 33.33% | 66.67% | 0% | 41.7% | 50.3 | 50% |
| blueapron.com | meal kits | 64 | 100% | 0% | 33.33% | 0% | 33.3% | 19.1 | 16.7% |
| gillette.com | razors | 87 | 100% | 0% | 33.33% | 0% | 33.3% | 7.4 | 0% |
| humnutrition.com | wellness supplements | 83 | 0% | 100% | 0% | 33.33% | 33.3% | 23.9 | 25% |
| joybird.com | dtc furniture | 44 | 0% | 0% | 66.67% | 33.33% | 25% | 10.6 | 8.3% |
| purple.com | memory foam mattresses | 101 | 0% | 100% | 0% | 0% | 25% | 1.9 | 0% |
| bark.co | pet supplies | 115 | 0% | 0% | 0% | 100% | 25% | 22 | 25% |
| ritual.com | wellness supplements | 94 | 0% | 0% | 100% | 0% | 25% | 25.6 | 41.7% |
| parachutehome.com | bedsheets | 85 | 0% | 0% | 0% | 0% | 0% | 21.6 | 16.7% |
| nectarsleep.com | memory foam mattresses | 69 | 0% | 0% | 0% | 0% | 0% | 65.8 | 66.7% |
| liingoeyewear.com | eyewear | 63 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| on.com | running shoes | 69 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| cuyana.com | sustainable apparel | 87 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| bluebottlecoffee.com | coffee subscription | 57 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| article.com | dtc furniture | 63 | 0% | 0% | 0% | 0% | 0% | 20.3 | 16.7% |
| athleticgreens.com | wellness supplements | 108 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| floydhome.com | dtc furniture | 79 | 0% | 0% | 0% | 0% | 0% | 22.5 | 25% |
| allbirds.com | running shoes | 112 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| casper.com | memory foam mattresses | 117 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
| glossier.com | minimalist skincare | 82 | 0% | 0% | 0% | 0% | 0% | 0 | 0% |
Methodology
We picked 12 product verticals where AI assistants are plausibly part of the buying journey (running shoes, memory-foam mattresses, eyewear, minimalist skincare, sustainable apparel, coffee subscriptions, meal kits, pet supplies, DTC furniture, wellness supplements, razors, bedsheets). For each, we selected well-known brands competing in that category, 40 brands in total: typically 3-5 per category, with two for pet supplies and one for eyewear.
For each category, we wrote four queries representing a typical shopping journey: discovery ("what are the best running shoes"), filter ("for flat feet and overpronation"), compare ("compare the top running shoes for stability and cushioning"), and purchase intent ("recommend the top 3 with links"). Every brand in a category was tested against the same four queries.
Each brand × stage was checked across ChatGPT, Perplexity, and Claude — 475 citation checks total. We compute per-stage citation rate and aggregate across stages to a per-brand survival rate. We then look at how each of the 12 GEO pillars relates to per-brand survival to find the strongest predictors.
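The aggregation described above can be sketched as follows. This is a minimal illustration, assuming survival rate is the share of all (stage × platform) checks in which the brand was cited; the function and field names are hypothetical. The example counts come from two rows of the brand table, and because the article does not report which platform cited which brand, the platform assignment in `expand_counts` is arbitrary.

```python
from collections import defaultdict

STAGES = ("discovery", "filter", "compare", "purchase")
PLATFORMS = ("chatgpt", "perplexity", "claude")


def survival_summary(checks):
    """checks: iterable of (brand, stage, platform, cited) tuples."""
    per_stage = defaultdict(lambda: [0, 0])  # (brand, stage) -> [cited, total]
    for brand, stage, _platform, cited in checks:
        cell = per_stage[(brand, stage)]
        cell[0] += int(cited)
        cell[1] += 1

    brands = {}
    for (brand, stage), (hit, total) in per_stage.items():
        entry = brands.setdefault(brand, {"stage_rate": {}, "cited": 0, "checks": 0})
        entry["stage_rate"][stage] = hit / total
        entry["cited"] += hit
        entry["checks"] += total

    for entry in brands.values():
        entry["survival_rate"] = entry["cited"] / entry["checks"]
        entry["survived_every_stage"] = all(
            entry["stage_rate"].get(stage, 0) > 0 for stage in STAGES
        )
    return brands


def expand_counts(brand, cited_counts):
    """Turn 'n of 3 platforms cited' per stage into individual check rows."""
    rows = []
    for stage, n in zip(STAGES, cited_counts):
        for i, platform in enumerate(PLATFORMS):
            rows.append((brand, stage, platform, i < n))  # arbitrary platform order
    return rows


checks = expand_counts("dollarshaveclub.com", (3, 2, 3, 2))
checks += expand_counts("saucony.com", (2, 3, 3, 0))
result = survival_summary(checks)
for brand, entry in result.items():
    print(brand, f"{entry['survival_rate']:.1%}", entry["survived_every_stage"])
```

With three checks per stage, pooling all checks gives the same number as averaging the four stage rates. The two would diverge if some checks were missing for a stage, so which one to use is a design choice worth stating.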
The Recommendation Readiness Score weights are derived from those relationships. The score uses the same pillar scores the GEO score uses — only the weighting and direction are new.
Limitations.
Concept queries don't capture conversational state. Each stage is an independent API call: we approximate a multi-turn conversation, not the real thing.
Survival ≠ purchase. Being recommended at the buy-now stage is the closest proxy we can build, but actual purchase outcomes require attribution data we don't have.
Brand recognition dominates. A small DTC brand with a great page may lose to a well-known competitor on every stage. Survival isn't purely a function of page quality.
Related research
Cross-study synthesis
Six findings that held across all of our research, with practical implications for what to optimize.
Read the synthesis →
3-phase GEO citation study
61 sites across 17 industries. The starting point for our AI-citation research.
Read the original citation study →
E-E-A-T & content type
Controlled 2×2 testing how E-E-A-T signals behave across content types.
Read the follow-up →