What is llms.txt? The B2B Brand Guide to AI Crawling
robots.txt tells Google what to crawl. llms.txt tells AI models what to read, how to describe you, and which pages matter. Most B2B brands don't have one. Here's what it is and how to deploy it in 10 minutes.
TL;DR
llms.txt is a plain-text file you place at yourdomain.com/llms.txt that gives AI models a curated summary of your brand, services, and key pages. It's the AI-era equivalent of a sitemap — not required, but a meaningful signal that separates brands AI can describe accurately from brands it guesses about.
The Problem llms.txt Solves
When ChatGPT, Perplexity, or Gemini is asked "what does [your brand] do?" or "best [category] for [use case]", the model constructs an answer from whatever it can find: your homepage, press coverage, G2 reviews, Reddit threads, and any third-party articles that mention you.
The result is often accurate enough to be believable, wrong enough to cost you deals. Wrong pricing. Outdated positioning. Missing use cases. Competitor confusion.
llms.txt gives you a seat at the table. Instead of AI reverse-engineering your brand from scattered signals, you hand it a curated brief.
What Exactly Is llms.txt?
llms.txt is a plain-text file hosted at the root of your domain. It was proposed in September 2024 by Jeremy Howard (co-founder of Answer.AI and fast.ai) and has since been adopted by hundreds of developer tools, SaaS companies, and B2B brands.
The format is simple Markdown. The proposal suggests a light structure (an H1 title, a short blockquote summary, then H2 sections containing link lists) rather than a rigid schema. The goal is human-readable, AI-parseable content that tells models:
- What your company does
- Who it's for
- What your key products or services are
- Which pages on your site have the most useful information
- How you differ from competitors
Think of it as a one-page brief for an AI that's about to answer questions about you.
How Is It Different from robots.txt?
robots.txt is a directive — it tells crawlers what they're allowed or not allowed to access.
llms.txt is a guide — it tells AI models what to understand about you, not just what to crawl.
They serve different purposes and you need both.
How Is It Different from a Sitemap?
A sitemap lists every URL on your site. llms.txt is curated — it highlights the pages that matter most for AI understanding, with context about why each one is important.
How AI Models Actually Use llms.txt
Not every AI model reads llms.txt on every query. Here's what actually happens:
Training Data
If your llms.txt file exists when a model's training data is collected, its content can be incorporated into the model's weights — directly shaping how the model "understands" your brand, not just how it cites you.
Live Retrieval (RAG)
Perplexity and ChatGPT with browsing enabled fetch pages while answering queries. A well-structured llms.txt is an efficient page for a model to retrieve: it delivers more signal per token than most homepages.
Jina, Firecrawl, and AI Scraper Tools
Many AI agents and data pipelines use tools like Jina Reader to crawl pages before feeding content to an LLM. llms.txt is specifically designed for this — easy to parse, dense with useful information.
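To make "easy to parse" concrete, here is a minimal sketch of how a pipeline might split an llms.txt file into a title, a summary, and per-section links. It assumes the Markdown conventions used in the template later in this guide (an H1, a blockquote, H2 sections, and `- [Title](url): note` bullets). The function name `parse_llms_txt` is hypothetical and does not describe how any particular tool works.

```python
import re

# Matches "[Title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?")


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and links per section."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif line.startswith("- ") and current is not None:
            match = LINK_RE.search(line)
            if match:
                title, url, note = match.groups()
                doc["sections"][current].append(
                    {"title": title, "url": url, "note": note or ""}
                )
    return doc
```

Bullets that are not Markdown links (for example `Homepage: https://... ` written as plain text) are skipped by this sketch, which is one reason to prefer the link syntax in your Key Pages section.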
What to Include in Your llms.txt
Required Sections
- Company overview — one paragraph, plain language, no marketing fluff
- What you do — specific products or services, not vague categories
- Who it's for — customer segment, company size, use case
- Key pages — with brief descriptions of what each page covers
Optional But Valuable
- Differentiators — 3-5 specific ways you differ from alternatives
- Pricing summary — even rough ranges help AI models recommend you accurately
- Not suitable for — honest scope limits build trust with AI models
- Recent updates — signals freshness to live-retrieval models
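The required sections can be checked mechanically before you publish. The sketch below is one way to do it, assuming your headings follow the names used in this guide's template ("Products", "Who It's For", "Key Pages"). Both the heading names and the function `check_llms_txt` are illustrative assumptions, not part of any standard.

```python
# Heading names are assumptions taken from this guide's template.
REQUIRED_HEADINGS = ("Products", "Who It's For", "Key Pages")


def check_llms_txt(text):
    """Return a list of human-readable problems; an empty list means it passed."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not any(line.startswith("# ") for line in lines):
        problems.append("missing H1 company name")
    if not any(line.startswith("> ") for line in lines):
        problems.append("missing one-sentence summary (blockquote)")
    headings = [line[3:].strip().lower() for line in lines if line.startswith("## ")]
    for name in REQUIRED_HEADINGS:
        # Prefix match so "Products / Services" satisfies "Products".
        if not any(heading.startswith(name.lower()) for heading in headings):
            problems.append(f"missing section: {name}")
    return problems
```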
A Ready-to-Deploy Template
# [Your Company Name]

> [One-sentence description of what you do and who you do it for]

[2-3 sentences expanding on the above. What problem do you solve? What makes your approach different? What outcome do customers get?]

## Products / Services

- **[Product/Service Name]**: [One-sentence description]
- **[Product/Service Name]**: [One-sentence description]
- **[Product/Service Name]**: [One-sentence description]

## Who It's For

- [Customer type 1] — [why it fits them]
- [Customer type 2] — [why it fits them]
- [Company size / geography / vertical if relevant]

## Key Pages

- [Homepage](https://yourdomain.com): Overview of all products and pricing
- [About](https://yourdomain.com/about): Company background, team, mission
- [Pricing](https://yourdomain.com/pricing): Current plans and pricing
- [Blog](https://yourdomain.com/blog): Guides and research on [your topic]

## How We're Different from [Top Competitor]

[2-3 specific, factual differentiators. Avoid superlatives.]

## Pricing

[Rough pricing range or tiers. E.g. "Plans from $X/month. Free trial available."]

## Contact

- Website: https://yourdomain.com
- Email: hello@yourdomain.com
- Founded: [Year] | Based in: [Location]
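If you keep brand facts in one place (a CMS export, a config file), you can render the template instead of hand-editing it. The following is a minimal sketch covering three of the template's sections. The dictionary keys, the function name `render_llms_txt`, and the "Acme Analytics" data are all made up for illustration.

```python
def render_llms_txt(brand):
    """Render a small llms.txt from a dict of brand facts."""
    lines = [f"# {brand['name']}", "", f"> {brand['summary']}", ""]
    lines += ["## Products / Services", ""]
    lines += [f"- **{name}**: {desc}" for name, desc in brand["products"]]
    lines += ["", "## Key Pages", ""]
    lines += [f"- [{title}]({url}): {note}" for title, url, note in brand["pages"]]
    return "\n".join(lines) + "\n"


# Hypothetical example data.
example = render_llms_txt(
    {
        "name": "Acme Analytics",
        "summary": "Acme Analytics builds reporting dashboards for finance teams.",
        "products": [("Dashboards", "Self-serve reporting for finance teams")],
        "pages": [("Pricing", "https://example.com/pricing", "Current plans")],
    }
)
print(example)
```

Generating the file from a single source of truth also makes it harder for pricing or product names to drift out of sync with the rest of your site.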
A Real-World Example (Decyde's Own llms.txt)
# Decyde

> Decyde monitors how B2B brands appear in ChatGPT, Perplexity, Gemini, and Google AI Overviews — and tells you exactly what to do to rank higher in AI recommendations.

Decyde is an AI visibility intelligence platform for B2B brands, agencies, and professional services firms. It tracks brand mentions across all major AI platforms weekly, benchmarks against competitors, maps citation sources, and delivers a prioritised weekly action plan.

## Products

- **AI Visibility Audit**: One-time scan across ChatGPT, Perplexity, Gemini, and Google AI Overviews. Produces an AI Visibility Score, citation map, competitor benchmark, and Fix It Files.
- **Decyde Pro**: Weekly monitoring subscription. Continuous tracking, weekly AEO action playbook, social listening, and Telegram alerts on score drops.
- **Fix It Files**: Auto-generated JSON-LD schema and llms.txt file specific to your brand, ready to deploy.

## Who It's For

- B2B SaaS companies monitoring competitor presence in AI answers
- Marketing agencies tracking client AI visibility
- Professional services firms (law, consulting, accounting) invisible in ChatGPT despite strong Google rankings

## Key Pages

- Homepage: https://decyde.co.uk — Free audit and product overview
- Blog: https://decyde.co.uk/blog — AEO guides and research
- Pricing: https://decyde.co.uk/#pricing — Pro plan details

## Pricing

- Pro: $99/month. Cancel anytime. Free audit with no credit card required.

## Contact

- Website: https://decyde.co.uk
- Email: hello@decyde.co.uk
How to Deploy Your llms.txt
Step 1: Create the File
Write your llms.txt using the template above. Keep it under 2,000 words — dense but scannable.
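A quick way to enforce the length guideline is a whitespace-based word count. This is a rough sketch: it counts Markdown symbols and URLs as words, so treat the number as an approximation. The function names are hypothetical.

```python
def word_count(text):
    """Count whitespace-separated tokens; Markdown markers and URLs count too."""
    return len(text.split())


def within_budget(text, limit=2000):
    """True when the draft stays under the 2,000-word guideline."""
    return word_count(text) < limit
```

For example, `within_budget(open("llms.txt", encoding="utf-8").read())` checks a local draft, assuming the file sits in the current directory.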
Step 2: Host It at the Root
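The file must be accessible at https://yourdomain.com/llms.txt. For Next.js, place it in your /public directory. For WordPress, upload it to the site root via FTP or a file manager plugin. For Webflow, uploaded assets are usually served from a CDN URL rather than your domain root, so confirm after publishing that the file actually resolves at /llms.txt.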
Step 3: Verify It's Live
Visit https://yourdomain.com/llms.txt in your browser. You should see plain text — not a 404 or a download prompt.
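The browser check can also be scripted, which helps if you manage several domains. The sketch below uses only Python's standard library. Treating `text/plain` and `text/markdown` as the acceptable content types is an assumption on our part, and `diagnose` and `check_url` are hypothetical names.

```python
import urllib.error
import urllib.request

# Assumption: these are the content types we consider "plain text".
ACCEPTED_TYPES = ("text/plain", "text/markdown")


def diagnose(status, content_type, content_disposition=""):
    """Return a list of problems for one HTTP response; empty means it looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    if "attachment" in content_disposition.lower():
        problems.append("served as a download (Content-Disposition: attachment)")
    return problems


def check_url(url):
    """Fetch the URL and run diagnose() on the response headers."""
    request = urllib.request.Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            status, headers = response.status, response.headers
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise, but still carry a status code and headers.
        status, headers = err.code, err.headers
    return diagnose(
        status,
        headers.get("Content-Type", ""),
        headers.get("Content-Disposition", ""),
    )
```

Calling `check_url("https://yourdomain.com/llms.txt")` returns an empty list when the file is live and served as text. An HTML content type on a 200 response often means the site is returning a styled "not found" page instead of the file.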
Step 4: Allow AI Crawlers
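Check that your robots.txt doesn't block AI crawlers from your site. User-agent tokens to check include GPTBot, PerplexityBot, CCBot, Google-Extended, and anthropic-ai. Vendors add and rename tokens over time (Anthropic's current crawler identifies as ClaudeBot, for example), so confirm the list against each vendor's documentation.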
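You can test a robots.txt file against these user agents with Python's built-in `urllib.robotparser`. The sketch below takes the robots.txt content as a string, so it runs without network access. The agent list is the one named in this step, and `blocked_agents` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this guide; extend the list as vendors change them.
AI_USER_AGENTS = ["GPTBot", "PerplexityBot", "CCBot", "Google-Extended", "anthropic-ai"]


def blocked_agents(robots_txt, url, agents=AI_USER_AGENTS):
    """Return the agents that robots.txt forbids from fetching the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Example: a robots.txt that blocks GPTBot but allows everyone else.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_agents(robots, "https://yourdomain.com/llms.txt"))
```

An empty list means none of the listed agents are blocked from the file. Python's parser may resolve edge cases (such as overlapping Allow and Disallow rules) differently from a given crawler, so treat the result as a first check rather than a guarantee.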
Step 5: Update It Quarterly
A stale llms.txt is worse than none — live-retrieval models will pick up outdated pricing or discontinued products. Add a "Last updated" line at the top and review it every quarter.
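A dated line makes the quarterly review easy to automate. The sketch below assumes the line is written as `Last updated: YYYY-MM-DD`; that exact format, the 90-day threshold, and the function names are our assumptions, so adjust them to your own convention.

```python
import re
from datetime import date

# Assumed format: a line such as "Last updated: 2025-01-15".
LAST_UPDATED_RE = re.compile(
    r"^Last updated:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE | re.IGNORECASE
)


def days_since_update(text, today):
    """Days between the 'Last updated' date and today, or None if no date is found."""
    match = LAST_UPDATED_RE.search(text)
    if match is None:
        return None
    return (today - date.fromisoformat(match.group(1))).days


def is_stale(text, today, max_age_days=90):
    """Treat a missing date as stale, since freshness cannot be confirmed."""
    age = days_since_update(text, today)
    return age is None or age > max_age_days
```

Running `is_stale(text, date.today())` in a scheduled job or CI check gives you a reminder when the review is overdue.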
Common Mistakes
Mistake #1: Marketing copy instead of factual description
AI models respond better to factual, specific language than to marketing language. "We help companies dominate AI search" tells a model nothing. "Weekly citation tracking across ChatGPT, Perplexity, and Gemini" tells it exactly what you do.
Mistake #2: No key pages section
The key pages section is the highest-value part of llms.txt for live-retrieval models. If you skip it, models can't navigate to the right content on your site.
Mistake #3: Setting it and forgetting it
If you change pricing, launch a new product, or rebrand, update your llms.txt the same day. Out-of-date llms.txt files actively mislead AI models.
Mistake #4: Blocking AI crawlers in robots.txt
A surprising number of sites block GPTBot and PerplexityBot — often inherited from old robots.txt templates. Check yours before deploying llms.txt.
Does llms.txt Actually Work?
Direct attribution is hard — there's no "llms.txt Traffic" report in Google Analytics. But three signals suggest it does:
- Accuracy improvement: After deploying llms.txt, brands report AI models describing their product more accurately — correct pricing, right use cases, fewer hallucinations.
- Perplexity citations: Perplexity actively crawls llms.txt files. Brands with a well-structured llms.txt see it appear as a cited source in relevant answers.
- Training data inclusion: Common Crawl, whose web archives feed the training data of many LLMs, crawls publicly accessible files, and llms.txt can be among them. A well-written llms.txt is a plausible path into future training runs.
The honest take: llms.txt is a low-effort, low-risk signal, provided you keep it current. It takes about 30 minutes to write, costs nothing to deploy, and can improve how accurately AI models describe your brand. The question isn't whether to do it; it's why you haven't already.
Beyond llms.txt: The Full AEO Stack
llms.txt is one piece. The full stack for AI visibility:
- llms.txt — brand brief for AI models
- JSON-LD schema — structured data that defines your entity in machine-readable format
- FAQ content — answers the buyer questions AI models are asked
- Third-party citations — G2 reviews, press mentions, comparison articles
- Weekly monitoring — tracking whether changes are working
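For the JSON-LD item in the stack above, here is a minimal sketch that builds a schema.org `Organization` object with Python's `json` module. The property selection (`name`, `url`, `description`, `sameAs`) is a small illustrative subset, and the function name and sample values are hypothetical.

```python
import json


def organization_jsonld(name, url, description, same_as=()):
    """Build a schema.org Organization description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }
    if same_as:
        # Profile URLs on other sites that refer to the same organization.
        data["sameAs"] = list(same_as)
    return json.dumps(data, indent=2)


print(
    organization_jsonld(
        "Acme Analytics",
        "https://example.com",
        "Reporting dashboards for finance teams.",
    )
)
```

The resulting JSON is typically embedded in a page inside a `script` tag with the type `application/ld+json`. Keeping its description consistent with the summary line in your llms.txt avoids sending models two different versions of what you do.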
If you want to see your current AI Visibility Score and get a pre-built llms.txt specific to your brand, run a free audit below.
Get Your llms.txt Generated Automatically
Run a free Decyde audit and we'll generate a ready-to-deploy llms.txt and JSON-LD schema file specific to your brand. Takes 60 seconds. No credit card required.