How ChatGPT Decides Who to Recommend
ChatGPT retrieves far more pages than it ever shows you. A 2026 study found that 85% of pages ChatGPT discovers never appear in the final answer. Here's exactly how the selection happens — and how to be in the 15% that makes it.
TL;DR
ChatGPT citation is a three-stage funnel: Discovery (does it find you?), Expansion (do you survive follow-up research?), Selection (does your content earn a spot in the final answer?). Most brands fail at Stage 2. The fix: content that front-loads the answer, follows H1→H2→H3 structure, and lives on pages that rank in Google's top 20.
The Study That Changed How We Think About AI Citations
In 2026, AirOps analyzed 548,534 pages retrieved across 15,000 ChatGPT prompts. The headline finding: 85% of pages ChatGPT discovers during its research process never appear in the final answer.
That means that even if you rank on Google and ChatGPT finds your page, it still has roughly an 85% chance, on average, of being cut before the user ever sees your name.
Understanding why that happens — and how to beat the odds — is the entire game of AEO.
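To put the headline numbers on a per-prompt footing, here is a quick back-of-envelope calculation. It assumes the 85% drop rate applies evenly across all 15,000 prompts, which the figures quoted above don't confirm, so treat the results as rough averages rather than study findings.

```python
# Back-of-envelope arithmetic using the figures quoted above.
# Assumption: the 85% drop rate applies evenly across every prompt.
PAGES_RETRIEVED = 548_534
PROMPTS = 15_000
DROP_RATE = 0.85


def funnel_averages(pages=PAGES_RETRIEVED, prompts=PROMPTS, drop_rate=DROP_RATE):
    """Return (pages retrieved per prompt, pages cited per prompt) as averages."""
    retrieved_per_prompt = pages / prompts
    cited_per_prompt = retrieved_per_prompt * (1 - drop_rate)
    return retrieved_per_prompt, cited_per_prompt


retrieved, cited = funnel_averages()
print(f"Retrieved per prompt: {retrieved:.1f}")
print(f"Cited per prompt: {cited:.1f}")
```

Under that assumption, ChatGPT pulls in about 37 pages per prompt and cites 5 or 6 of them.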
The Three-Stage Citation Funnel
ChatGPT's citation process isn't one decision. It's a funnel with three distinct stages, and you can be eliminated at any of them.
Stage 1: Discovery — Does ChatGPT Find You?
When a user submits a prompt, ChatGPT (with browsing enabled) performs one or more Google searches to find relevant pages. This is where the retrieval pool is built.
What determines if you're discovered:
- Google ranking: 50% of ChatGPT citations come from the #1 Google result. Pages outside the top 20 are cited 3.5× less often. If you don't rank on page one, you often don't even get retrieved.
- Query match: ChatGPT searches using reformulated versions of the user's prompt. Your content needs to match the buyer language, not just your internal terminology.
- Domain authority: 74% of citations go to sites with Domain Authority under 80 — so you don't need to beat Salesforce, but you do need a credible site.
Implication: Traditional SEO still matters for AEO. Not because Google rank translates directly to AI rank, but because Google rank is how you get into ChatGPT's retrieval pool in the first place. AEO without SEO is building on quicksand.
Stage 2: Expansion — Do You Survive Follow-Up Research?
This is where most brands get eliminated — and most people don't know this stage exists.
After the initial retrieval, 89.6% of prompts trigger two or more follow-up searches. ChatGPT is actively researching the topic from multiple angles before it decides what to include in the final answer.
What follow-up searches look like in practice:
- Initial query: "best AEO tools for agencies"
- Follow-up 1: "[specific brand] reviews"
- Follow-up 2: "[specific brand] pricing 2026"
- Follow-up 3: "alternatives to [competitor]"
If you appear in the initial retrieval but have nothing for the follow-up queries, you get dropped. ChatGPT has found three more pages about your competitor and nothing new about you.
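One way to find your own Stage 2 gaps is to generate the likely follow-up queries for your brand and check by hand whether you have a page that answers each one. The sketch below only fills in the three example patterns listed above. The function name and the sample brand names are hypothetical, and the follow-ups ChatGPT actually runs will vary by prompt.

```python
def follow_up_queries(brand: str, competitor: str, year: int) -> list[str]:
    """Build follow-up queries modeled on the example patterns above."""
    return [
        f"{brand} reviews",
        f"{brand} pricing {year}",
        f"alternatives to {competitor}",
    ]


# Hypothetical brand and competitor names, for illustration only.
for query in follow_up_queries("ExampleBrand", "ExampleRival", 2026):
    print(query)
```

If you can't point to a retrievable page for one of these queries, that is a likely place where you drop out of the funnel.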
How to survive Stage 2:
- Publish comparison pages ("[Your brand] vs [competitor]")
- Make your pricing visible and crawlable
- Earn third-party reviews that ChatGPT can retrieve as validation
- Build FAQ content that answers the natural follow-ups to your category queries
Stage 3: Selection — Does Your Content Earn a Spot?
By Stage 3, ChatGPT has a pool of relevant pages. Now it decides which 15% to actually cite.
Two content signals dominate selection:
1. Answer position on the page
44.2% of citations come from content in the first 30% of the page. ChatGPT weights earlier content more heavily — it's a proximity-to-the-top signal, not just a relevance signal.
The fix is simple: answer first, explain second. If your page is about "best AEO tools for law firms", your recommendation should be in paragraph one, not paragraph seven after 400 words of preamble.
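A rough way to audit this is to measure how far down the page your key answer first appears. The sketch below uses character offset in the plain text as a stand-in for "position on the page". That is an assumption, since the figures above don't say how position was measured, and the function names are illustrative.

```python
from typing import Optional


def answer_position(page_text: str, answer_phrase: str) -> Optional[float]:
    """Return where the phrase first appears as a fraction of the page (0.0 = top).

    Returns None if the page is empty or the phrase is not found.
    """
    if not page_text:
        return None
    index = page_text.lower().find(answer_phrase.lower())
    if index == -1:
        return None
    return index / len(page_text)


def is_front_loaded(page_text: str, answer_phrase: str, threshold: float = 0.30) -> bool:
    """True if the phrase first appears within the top `threshold` share of the page."""
    position = answer_position(page_text, answer_phrase)
    return position is not None and position <= threshold


# Hypothetical page text: the same answer placed at the top vs. at the bottom.
filler = "Background detail. " * 50
print(is_front_loaded("Our top pick is ExampleTool. " + filler, "top pick"))
print(is_front_loaded(filler + "Our top pick is ExampleTool.", "top pick"))
```

The first page passes and the second fails, even though both contain exactly the same recommendation.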
2. Heading hierarchy
68.7% of cited pages follow a logical H1→H2→H3 structure. This isn't about aesthetics — it's how ChatGPT parses the semantic structure of a page to understand what each section covers.
Pages that use headings inconsistently (jumping from H2 straight to H4, or using bold text instead of real headings) are harder for the model to parse and get cited less often.
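You can check heading order automatically. Here is a minimal sketch using Python's standard-library HTML parser. It flags skipped levels such as H2 to H4, and it also flags pages without exactly one H1, which is our own added assumption rather than something the cited figures specify. It can't detect bold text standing in for headings.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is all we need to match.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Return a list of heading-structure problems found in the HTML."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    issues = []
    # Assumption: a well-structured page has exactly one H1.
    if levels.count(1) != 1:
        issues.append(f"expected exactly one H1, found {levels.count(1)}")
    # Going deeper by more than one level at a time counts as a skip.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"skipped level: H{previous} -> H{current}")
    return issues


print(heading_issues("<h1>Guide</h1><h2>Setup</h2><h4>Details</h4>"))
```

Moving back up the hierarchy (an H3 followed by a new H2) is normal and is not flagged.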
Why Your Google #1 Ranking Doesn't Guarantee AI Visibility
This is the most common misconception we see. A brand ranks #1 on Google, assumes it's visible in AI answers, runs a manual test, and finds ChatGPT recommending its #7-ranked competitor.
Here's why that happens:
- Google rank gets you discovered. It doesn't get you cited. The page that ranks #1 might be your category page — optimized for a broad keyword, not for answering the specific follow-up questions ChatGPT asks.
- AI models weight third-party validation more than your own pages. Your homepage saying "We're the #1 choice for X" counts for less than a G2 review page saying the same thing.
- Your competitor may have more answer-dense content. A competitor ranking #7 on Google but with a detailed FAQ, comparison pages, and 200 G2 reviews often beats a #1-ranked brand with a thin homepage and no third-party coverage.
What ChatGPT Weighs Beyond the Three Stages
Once content makes it into the selection pool, ChatGPT also evaluates:
Entity Clarity
Is your brand a clearly defined entity? Consistent NAP (Name, Address, Phone) across the web, schema markup on your site, and clear category classification all help ChatGPT build a confident model of who you are. Brands with entity ambiguity (similar names, unclear positioning) get recommended less confidently — or skipped entirely.
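For the schema markup piece, a common approach is a schema.org Organization block in JSON-LD that carries the same name, address, and phone you use everywhere else. The sketch below builds one. Every value shown is a placeholder, and the helper function is illustrative rather than part of any library.

```python
import json


def organization_jsonld(name, url, telephone, street, city, country, same_as):
    """Return a schema.org Organization description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressCountry": country,
        },
        # Profiles on other sites that refer to the same entity.
        "sameAs": same_as,
    }
    return json.dumps(data, indent=2)


# Placeholder values only; substitute your real, consistent NAP details.
print(organization_jsonld(
    name="Example Co",
    url="https://example.com",
    telephone="+1-555-0100",
    street="1 Example Street",
    city="Exampleville",
    country="US",
    same_as=["https://www.linkedin.com/company/example-co"],
))
```

The output goes inside a `<script type="application/ld+json">` tag on your homepage. Generating it from one source of truth helps keep the details identical across pages.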
Sentiment in Third-Party Sources
ChatGPT doesn't just check if you're mentioned — it incorporates the tone of mentions. A review page that says "good for small teams, limited enterprise features" teaches ChatGPT to append that qualification when recommending you to enterprise buyers. This is why managing your narrative in external sources matters.
Content Freshness
For queries with "2026" or "latest" in them, ChatGPT weights recently published or updated content more heavily. A blog post from 2023 with accurate information can lose to a 2026 article with the same content, simply because it's newer.
Specificity Over Vagueness
Specific claims beat vague ones. "We've helped 200+ law firms improve their AI visibility by 40% in 90 days" is more citable than "We help professional services firms grow their online presence." Numbers, named outcomes, and named customers give ChatGPT concrete content to quote.
The Practical Playbook
Based on how the three-stage funnel works, here's the highest-ROI sequence:
Weeks 1-2: Foundation (Stage 1)
- Audit your Google rankings for buyer-intent queries
- Add Organization schema markup to your homepage
- Deploy llms.txt (a direct brief for AI models)
- Check that robots.txt allows GPTBot, PerplexityBot, and CCBot
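The robots.txt check in the list above is easy to script with Python's standard `urllib.robotparser`. This sketch parses robots.txt content you paste in, so it needs no network access. The sample rules and the URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The three crawlers named in the checklist above.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "CCBot"]


def crawler_access(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict[str, bool]:
    """Report whether each user agent may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Made-up robots.txt that blocks GPTBot but allows everyone else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(crawler_access(SAMPLE_ROBOTS, "https://example.com/pricing"))
```

With these sample rules, GPTBot is reported as blocked while PerplexityBot and CCBot are allowed. To test a live site, swap `parse()` for the parser's `set_url()` and `read()` methods.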
Weeks 3-4: Expansion Coverage (Stage 2)
- Publish 1-2 comparison pages ("[Your brand] vs [competitor]")
- Make pricing visible and specific on a dedicated page
- Build a FAQ section with 15+ buyer questions
- Ask 5 customers to leave reviews on G2 or Trustpilot
Weeks 5-6: Content Structure (Stage 3)
- Audit your top 5 pages: does the answer appear in the first 30%?
- Verify H1→H2→H3 hierarchy on all key pages
- Add a summary box at the top of long-form content
- Update dates on all core content ("Last updated: June 2026")
Ongoing: Measurement
- Test 10-20 buyer prompts on ChatGPT weekly
- Track mention rate, position, and sentiment over time
- Note which competitor gets cited when you don't — then close that gap
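If you log each weekly test by hand, a few lines of code turn the log into the numbers worth tracking. Here is a minimal sketch, assuming you record for each prompt whether you were mentioned, your position in the answer, and who was cited when you weren't. The field names are hypothetical, and sentiment is left out because it needs human judgment.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptResult:
    prompt: str
    mentioned: bool
    position: Optional[int] = None       # 1 = first brand named; None if absent
    cited_instead: Optional[str] = None  # competitor named when you were not


def summarize(results: list[PromptResult]) -> dict:
    """Compute mention rate, average position, and the most-cited competitor."""
    total = len(results)
    hits = [r for r in results if r.mentioned]
    positions = [r.position for r in hits if r.position is not None]
    misses = Counter(
        r.cited_instead for r in results if not r.mentioned and r.cited_instead
    )
    return {
        "mention_rate": len(hits) / total if total else 0.0,
        "avg_position": sum(positions) / len(positions) if positions else None,
        "top_competitor": misses.most_common(1)[0][0] if misses else None,
    }


# Hypothetical log from one week of manual testing.
week = [
    PromptResult("best AEO tools for agencies", True, position=1),
    PromptResult("AEO tools pricing", True, position=3),
    PromptResult("AEO tools for law firms", False, cited_instead="RivalA"),
    PromptResult("alternatives to RivalB", False, cited_instead="RivalA"),
]
print(summarize(week))
```

Comparing these summaries week over week shows whether a change helped, and `top_competitor` tells you whose pages to study first.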
See Where You're Getting Dropped
Run a free Decyde audit and get your AI Visibility Score across ChatGPT, Perplexity, Gemini, and Google AI Overviews — plus a ranked action plan showing exactly what to fix first.
Frequently Asked Questions
Q: Does ChatGPT without browsing (GPT-4o base) use the same process?
A: No. The base model without browsing relies entirely on training data and can't retrieve pages in real time. The three-stage funnel above applies to ChatGPT with browsing (Plus/Pro users) and to retrieval-augmented versions. For base model citations, the only lever is ensuring your brand is well-represented in training data: press coverage, G2 reviews, Wikipedia, and high-traffic pages that get indexed by Common Crawl.
Q: How long does it take to see improvements after making changes?
A: For live-retrieval models (ChatGPT Plus with browsing, Perplexity), changes to your content can be reflected in 1-4 weeks. For base model training data, the cycle is much longer — months to years depending on when the next training run incorporates new data.
Q: Can I force ChatGPT to cite me?
A: No. But you can make it significantly more likely. The brands that consistently get cited are those with strong Google rankings for buyer queries, dense third-party coverage, and content structured to answer questions directly. There's no shortcut — but the fundamentals are clear.
Q: Does domain authority matter more than content quality?
A: Both matter, but 74% of citations go to sites with DA under 80. This means content quality and structure often win against higher-DA sites with thin content. A focused, well-structured page on a mid-authority domain frequently beats a generic page on a high-authority domain.