Last Updated: June 2026 · 7 min read

What is llms.txt? The B2B Brand Guide to AI Crawling

robots.txt tells Google what to crawl. llms.txt tells AI models what to read, how to describe you, and which pages matter. Most B2B brands don't have one. Here's what it is and how to deploy it in 10 minutes.

TL;DR

llms.txt is a plain-text file you place at yourdomain.com/llms.txt that gives AI models a curated summary of your brand, services, and key pages. Think of it as the AI-era counterpart to a sitemap: not required, but a meaningful signal that separates brands AI can describe accurately from brands it guesses about.

The Problem llms.txt Solves

When ChatGPT, Perplexity, or Gemini is asked "what does [your brand] do?" or "best [category] for [use case]", the model constructs an answer from whatever it can find: your homepage, press coverage, G2 reviews, Reddit threads, and any third-party articles that mention you.

The result is often accurate enough to be believable, wrong enough to cost you deals. Wrong pricing. Outdated positioning. Missing use cases. Competitor confusion.

llms.txt gives you a seat at the table. Instead of AI reverse-engineering your brand from scattered signals, you hand it a curated brief.

What Exactly Is llms.txt?

llms.txt is a plain-text file hosted at the root of your domain. It was proposed in September 2024 by Jeremy Howard (Answer.AI, co-founder of fast.ai) and has since been adopted by hundreds of developer tools, SaaS companies, and B2B brands.

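The format is simple Markdown. The proposal suggests a light structure (an H1 title, a short blockquote summary, then H2 sections that list links), but there's no rigid schema beyond that. The goal is human-readable, AI-parseable content that tells models: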

  • What your company does
  • Who it's for
  • What your key products or services are
  • Which pages on your site have the most useful information
  • How you differ from competitors

Think of it as a one-page brief for an AI that's about to answer questions about you.

How Is It Different from robots.txt?

robots.txt is a directive — it tells crawlers what they're allowed or not allowed to access.

llms.txt is a guide — it tells AI models what to understand about you, not just what to crawl.

They serve different purposes and you need both.

How Is It Different from a Sitemap?

A sitemap lists every URL on your site. llms.txt is curated — it highlights the pages that matter most for AI understanding, with context about why each one is important.

How AI Models Actually Use llms.txt

Not every AI model reads llms.txt on every query. Here's what actually happens:

Training Data

If your llms.txt file exists when a model's training data is collected, its content can be incorporated into the model's weights — directly shaping how the model "understands" your brand, not just how it cites you.

Live Retrieval (RAG)

Perplexity and ChatGPT with browsing enabled fetch pages live when answering queries. A well-structured llms.txt is a highly efficient page for a model to retrieve: it delivers more signal per token than most homepages.

Jina, Firecrawl, and AI Scraper Tools

Many AI agents and data pipelines use tools like Jina Reader to crawl pages before feeding content to an LLM. llms.txt is specifically designed for this — easy to parse, dense with useful information.
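To make "easy to parse" concrete, here is a minimal Python sketch of how a pipeline might read an llms.txt file. It assumes the file follows the template layout used later in this guide (an H1 title, a blockquote summary, and H2 sections containing `- [Title](url): notes` links). The function name `parse_llms_txt` is illustrative and not part of any tool named above.

```python
import re

# Matches a Markdown list item such as "- [Title](https://url): optional notes".
LINK_ITEM = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt document into a title, a summary, and per-section links."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            # Simplification: only the first blockquote line is kept as the summary.
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                result["sections"][current].append(match.groupdict())
    return result
```

Calling `parse_llms_txt(text)["sections"].get("Key Pages", [])` would then give a list of dictionaries with `title`, `url`, and `notes` keys, which is roughly the shape a retrieval pipeline needs to decide which pages to fetch next.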

What to Include in Your llms.txt

Required Sections

  1. Company overview — one paragraph, plain language, no marketing fluff
  2. What you do — specific products or services, not vague categories
  3. Who it's for — customer segment, company size, use case
  4. Key pages — with brief descriptions of what each page covers

Optional But Valuable

  • Differentiators — 3-5 specific ways you differ from alternatives
  • Pricing summary — even rough ranges help AI models recommend you accurately
  • Not suitable for — honest scope limits build trust with AI models
  • Recent updates — signals freshness to live-retrieval models
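The required sections above can be checked mechanically before you publish. The sketch below is one possible lint, assuming your file uses headings like the ones in this guide's template ("Products / Services", "Who It's For", "Key Pages"). The keyword mapping and the name `missing_sections` are assumptions for illustration, so adjust them to your own headings.

```python
# Maps each required section to keywords expected in an H2 heading.
# These keywords mirror this guide's template and are an assumption.
REQUIRED_HEADINGS = {
    "what you do": ("products", "services"),
    "who it's for": ("who it",),
    "key pages": ("key pages",),
}


def missing_sections(text: str) -> list[str]:
    """Return the required sections that the llms.txt text appears to lack."""
    lines = [line.strip() for line in text.splitlines()]
    headings = [line[3:].strip().lower() for line in lines if line.startswith("## ")]
    missing = []
    if not any(line.startswith("# ") for line in lines):
        missing.append("company name (H1)")
    if not any(line.startswith(">") for line in lines):
        missing.append("company overview (blockquote)")
    for label, keywords in REQUIRED_HEADINGS.items():
        if not any(keyword in heading for heading in headings for keyword in keywords):
            missing.append(label)
    return missing
```

An empty list means all four required parts were found. The check only looks at headings, so it cannot tell whether the content under them is any good.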

A Ready-to-Deploy Template

# [Your Company Name]

> [One-sentence description of what you do and who you do it for]

[2-3 sentences expanding on the above. What problem do you solve?
What makes your approach different? What outcome do customers get?]

## Products / Services

- **[Product/Service Name]**: [One-sentence description]
- **[Product/Service Name]**: [One-sentence description]
- **[Product/Service Name]**: [One-sentence description]

## Who It's For

- [Customer type 1] — [why it fits them]
- [Customer type 2] — [why it fits them]
- [Company size / geography / vertical if relevant]

## Key Pages

- [Homepage](https://yourdomain.com): Overview of all products and pricing
- [About](https://yourdomain.com/about): Company background, team, mission
- [Pricing](https://yourdomain.com/pricing): Current plans and pricing
- [Blog](https://yourdomain.com/blog): Guides and research on [your topic]

## How We're Different from [Top Competitor]

[2-3 specific, factual differentiators. Avoid superlatives.]

## Pricing

[Rough pricing range or tiers. E.g. "Plans from $X/month. Free trial available."]

## Contact

- Website: https://yourdomain.com
- Email: hello@yourdomain.com
- Founded: [Year] | Based in: [Location]
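If your product and page data already live somewhere structured, the template above can be generated rather than hand-edited. This is a minimal sketch covering only the title, summary, products, and key pages. The function name `render_llms_txt` and its argument shapes are hypothetical.

```python
def render_llms_txt(name, summary, products, pages):
    """Render a small llms.txt from plain Python data.

    products: list of (title, description) tuples
    pages:    list of (title, url, description) tuples
    """
    lines = [f"# {name}", "", f"> {summary}", "", "## Products / Services", ""]
    lines += [f"- **{title}**: {description}" for title, description in products]
    lines += ["", "## Key Pages", ""]
    lines += [f"- [{title}]({url}): {description}" for title, url, description in pages]
    return "\n".join(lines) + "\n"
```

Generating the file from the same source as your pricing page is one way to keep the two from drifting apart.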

A Real-World Example (Decyde's Own llms.txt)

# Decyde

> Decyde monitors how B2B brands appear in ChatGPT, Perplexity,
Gemini, and Google AI Overviews — and tells you exactly what
to do to rank higher in AI recommendations.

Decyde is an AI visibility intelligence platform for B2B brands,
agencies, and professional services firms. It tracks brand mentions
across all major AI platforms weekly, benchmarks against competitors,
maps citation sources, and delivers a prioritised weekly action plan.

## Products

- **AI Visibility Audit**: One-time scan across ChatGPT, Perplexity,
  Gemini, and Google AI Overviews. Produces an AI Visibility Score,
  citation map, competitor benchmark, and Fix It Files.
- **Decyde Pro**: Weekly monitoring subscription. Continuous tracking,
  weekly AEO action playbook, social listening, and Telegram alerts
  on score drops.
- **Fix It Files**: Auto-generated JSON-LD schema and llms.txt file
  specific to your brand, ready to deploy.

## Who It's For

- B2B SaaS companies monitoring competitor presence in AI answers
- Marketing agencies tracking client AI visibility
- Professional services firms (law, consulting, accounting) invisible
  in ChatGPT despite strong Google rankings

## Key Pages

- Homepage: https://decyde.co.uk — Free audit and product overview
- Blog: https://decyde.co.uk/blog — AEO guides and research
- Pricing: https://decyde.co.uk/#pricing — Pro plan details

## Pricing

- Pro: $99/month. Cancel anytime. Free audit with no credit card required.

## Contact

- Website: https://decyde.co.uk
- Email: hello@decyde.co.uk

How to Deploy Your llms.txt

Step 1: Create the File

Write your llms.txt using the template above. Keep it under 2,000 words — dense but scannable.

Step 2: Host It at the Root

The file must be accessible at https://yourdomain.com/llms.txt. For Next.js, place it in your /public directory. For WordPress, upload it via FTP or a file manager plugin. For Webflow, use the file manager.

Step 3: Verify It's Live

Visit https://yourdomain.com/llms.txt in your browser. You should see plain text — not a 404 or a download prompt.
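The same check can be scripted. The sketch below treats a response as "plain text in the browser" when it returns HTTP 200, has a `text/plain` content type, and is not sent as an attachment. Those three criteria are an assumption about what causes a 404 or a download prompt, and the function names are illustrative.

```python
import urllib.error
import urllib.request


def looks_like_plain_text(status: int, content_type: str, content_disposition: str = "") -> bool:
    """Return True when the response should display as text rather than 404 or download."""
    is_ok = status == 200
    is_text = content_type.lower().startswith("text/plain")
    is_download = content_disposition.lower().startswith("attachment")
    return is_ok and is_text and not is_download


def check_llms_txt(domain: str) -> bool:
    """Fetch https://<domain>/llms.txt and apply the check above (needs network access)."""
    url = f"https://{domain}/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return looks_like_plain_text(
                response.status,
                response.headers.get("Content-Type", ""),
                response.headers.get("Content-Disposition", ""),
            )
    except urllib.error.URLError:
        # Covers HTTP errors such as 404 as well as connection failures.
        return False
```

Running `check_llms_txt("yourdomain.com")` after each deploy gives a quick pass/fail without opening a browser.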

Step 4: Allow AI Crawlers

Check that your robots.txt doesn't block the crawlers that read llms.txt. Key user agents to allow: GPTBot, PerplexityBot, CCBot, Google-Extended, and anthropic-ai. Anthropic's current crawler identifies itself as ClaudeBot, so allow that as well.
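One way to run this check is with Python's standard-library `urllib.robotparser`. The sketch below takes the text of your robots.txt and reports which of the user agents listed above would be denied access to llms.txt. The helper name `blocked_agents` and the default URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

# User agents named in this guide; extend the list as crawlers change.
AI_USER_AGENTS = ["GPTBot", "PerplexityBot", "CCBot", "Google-Extended", "anthropic-ai"]


def blocked_agents(robots_txt: str, url: str = "https://yourdomain.com/llms.txt") -> list[str]:
    """Return the AI user agents that the given robots.txt text blocks from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_USER_AGENTS if not parser.can_fetch(agent, url)]
```

For example, a robots.txt containing `User-agent: GPTBot` followed by `Disallow: /` would cause `blocked_agents` to return `["GPTBot"]`, which is the kind of inherited rule worth removing.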

Step 5: Update It Quarterly

A stale llms.txt is worse than none — live-retrieval models will pick up outdated pricing or discontinued products. Add a "Last updated" line at the top and review it every quarter.
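A quarterly review is easy to forget, so it can help to automate the reminder. This sketch assumes the "Last updated" line uses the form `Last updated: YYYY-MM-DD`, which is a convention chosen here for illustration, and treats anything older than about one quarter (92 days) as stale.

```python
import re
from datetime import date

# Assumed line format: "Last updated: 2026-01-15" (case-insensitive).
LAST_UPDATED = re.compile(
    r"^Last updated:\s*(\d{4})-(\d{2})-(\d{2})\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def is_stale(text: str, today: date, max_age_days: int = 92) -> bool:
    """Return True when the file has no date line or the date is older than max_age_days."""
    match = LAST_UPDATED.search(text)
    if match is None:
        return True  # no date line: flag it so the file gets reviewed
    updated = date(*map(int, match.groups()))
    return (today - updated).days > max_age_days
```

Running `is_stale(text, date.today())` in a scheduled job or CI step turns the quarterly review into a failing check instead of a calendar note.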

Common Mistakes

Mistake #1: Marketing copy instead of factual description

AI models respond better to factual, specific language than to marketing language. "We help companies dominate AI search" tells a model nothing. "Weekly citation tracking across ChatGPT, Perplexity, and Gemini" tells it exactly what you do.

Mistake #2: No key pages section

The key pages section is the highest-value part of llms.txt for live-retrieval models. If you skip it, models can't navigate to the right content on your site.

Mistake #3: Setting it and forgetting it

If you change pricing, launch a new product, or rebrand, update your llms.txt the same day. Out-of-date llms.txt files actively mislead AI models.

Mistake #4: Blocking AI crawlers in robots.txt

A surprising number of sites block GPTBot and PerplexityBot — often inherited from old robots.txt templates. Check yours before deploying llms.txt.

Does llms.txt Actually Work?

Direct attribution is hard — there's no "llms.txt Traffic" report in Google Analytics. But three signals suggest it does:

  1. Accuracy improvement: After deploying llms.txt, brands report AI models describing their product more accurately — correct pricing, right use cases, fewer hallucinations.
  2. Perplexity citations: Perplexity fetches pages live, and llms.txt is one of the pages it can retrieve. Brands with a well-structured llms.txt report seeing it appear as a cited source in relevant answers.
  3. Training data inclusion: Web-scale datasets like Common Crawl, which is used to train many LLMs, can pick up llms.txt files alongside other public pages. A well-written llms.txt is a plausible path into future training runs.

The honest take: llms.txt is a low-effort, low-risk signal, provided you keep it current. It takes about 30 minutes to write, costs nothing to deploy, and can improve how accurately AI models describe your brand. The question isn't whether to do it; it's why you haven't already.

Beyond llms.txt: The Full AEO Stack

llms.txt is one piece. The full stack for AI visibility:

  • llms.txt — brand brief for AI models
  • JSON-LD schema — structured data that defines your entity in machine-readable format
  • FAQ content — answers the buyer questions AI models are asked
  • Third-party citations — G2 reviews, press mentions, comparison articles
  • Weekly monitoring — tracking whether changes are working

If you want to see your current AI Visibility Score and get a pre-built llms.txt specific to your brand, run a free audit below.

Get Your llms.txt Generated Automatically

Run a free Decyde audit and we'll generate a ready-to-deploy llms.txt and JSON-LD schema file specific to your brand. Takes 60 seconds. No credit card required.
