Complete Guide to llms.txt

What llms.txt is, why it exists, how to use it in practice, and the limits and best practices that help agents and LLMs answer questions about your site more accurately.

Origins, purpose and status

  • llms.txt is an editorial proposal: a Markdown file at the root that provides a curated map of important content and how to interpret it.
  • It coexists with robots.txt and sitemap.xml: it doesn’t replace them; it adds a curatorial layer to reduce noise, ambiguity and context selection costs.
  • Adoption is heterogeneous: very useful for controlled agents/tools (IDE, chatbot, helpdesk), less as a universal “SEO signal”; not all AI crawlers fetch it systematically.

What it is and how to use it

  • Human- and machine-readable, structured enough for deterministic parsing.
  • Typically lives at https://yourdomain.tld/llms.txt.
  • Encourages clean Markdown mirrors of important pages (via a .md suffix, including index.html.md for paths without a filename).
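The .md mirror convention above can be sketched as a small URL mapper. This is a minimal sketch, assuming the site actually serves a `.md` twin next to each page; `markdown_mirror_url` and the example domain are illustrative, not part of any standard library.

```python
from urllib.parse import urlparse


def markdown_mirror_url(page_url: str) -> str:
    """Map a page URL to its assumed Markdown mirror.

    Follows the llms.txt convention: append .md to file-style paths,
    and use index.html.md for directory-style paths without a filename.
    """
    parsed = urlparse(page_url)
    path = parsed.path
    if path == "" or path.endswith("/"):
        path += "index.html.md"
    else:
        path += ".md"
    return parsed._replace(path=path).geturl()
```

For example, `https://yourdomain.tld/docs/quickstart.html` maps to `.../docs/quickstart.html.md`, while `https://yourdomain.tld/docs/` maps to `.../docs/index.html.md`.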

Typical operational flow

  1. Fetch /llms.txt
  2. Parse the structure: H1 title, blockquote summary, free-form notes, and H2 sections containing link lists
  3. Select links relevant to the user question
  4. Fetch the pointed content (ideally Markdown)
  5. Assemble context in the prompt or via RAG
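The five steps above can be sketched in a few functions. This is a minimal, assumption-laden sketch: `parse_llms_txt` implements a naive line-by-line parse of the common llms.txt shape (H1 title, blockquote summary, H2 link sections), and `select_links` stands in for step 3 with simple keyword matching rather than real relevance ranking.

```python
import re
import urllib.request


def fetch(url: str) -> str:
    """Steps 1 and 4: download a URL as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def parse_llms_txt(text: str) -> dict:
    """Step 2: extract the H1 title, blockquote summary, and H2 link sections."""
    title = summary = None
    sections = {}
    current = None
    link_re = re.compile(r"-\s*\[([^\]]+)\]\(([^)]+)\)")
    for line in text.splitlines():
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            m = link_re.search(line)
            if m:
                sections[current].append((m.group(1), m.group(2)))
    return {"title": title, "summary": summary, "sections": sections}


def select_links(index: dict, keywords: list) -> list:
    """Step 3 (naive): keep links whose name matches any keyword."""
    hits = []
    for links in index["sections"].values():
        for name, url in links:
            if any(k.lower() in name.lower() for k in keywords):
                hits.append(url)
    return hits
```

Step 5 then amounts to concatenating the fetched Markdown into the prompt, or feeding it to your retrieval layer.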

llms-full.txt vs index

  • llms.txt as an index with links: lighter and navigable.
  • llms-full.txt as a full dump: immediate but potentially huge; often used with indexing and retrieval (RAG).
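Since a llms-full.txt dump is often too large for a single context window, a common first step toward RAG is splitting it into heading-delimited chunks for indexing. A minimal sketch, assuming the dump uses Markdown H2 headings as section boundaries (the function name and the `level` parameter are illustrative):

```python
def chunk_by_heading(full_text: str, level: int = 2) -> list:
    """Split a Markdown dump into (heading, body) chunks at the given level.

    Each chunk can then be embedded and indexed for retrieval.
    """
    marker = "#" * level + " "
    chunks, heading, body = [], "preamble", []
    for line in full_text.splitlines():
        if line.startswith(marker):
            # Flush the previous chunk; skip an empty preamble.
            if not (heading == "preamble" and not "".join(body).strip()):
                chunks.append((heading, "\n".join(body).strip()))
            heading, body = line[len(marker):].strip(), []
        else:
            body.append(line)
    chunks.append((heading, "\n".join(body).strip()))
    return chunks
```

Each resulting `(heading, body)` pair is a natural retrieval unit: the heading doubles as chunk metadata when building the index.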

How to write an effective llms.txt

Content principles

  • Concise, clear language.
  • Short informative descriptions next to links.
  • Reduce ambiguity and unexplained jargon.
  • Test empirically: expand the links and verify that model answers match the real content and policies.

What to include

  • Technical docs: Quickstart/Getting Started, API Reference, runnable examples, decision guides, compatibility/versioning.
  • Company/product site: About, Products/Services, Pricing, FAQ, Support/Contact, Security/Compliance, Privacy/Terms, returns/shipping policies.
  • Portfolio/personal: CV/bio, main projects, contacts, notable talks/publications.
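Putting the principles and inclusion lists together, a company/product llms.txt might look like the following skeleton. All names and URLs here are hypothetical placeholders, not a normative template:

```markdown
# Acme Widgets

> Acme Widgets makes embeddable payment widgets. This file lists the pages most useful for answering questions about the product.

## Docs

- [Quickstart](https://acme.example/docs/quickstart.md): install and render a first widget in five minutes
- [API Reference](https://acme.example/docs/api.md): endpoints, authentication, error codes

## Company

- [Pricing](https://acme.example/pricing.md): plans, limits, billing FAQ
- [Security](https://acme.example/security.md): compliance posture and data handling

## Optional

- [Changelog](https://acme.example/changelog.md): release history
```

Note the short description after each link: that is what lets an agent choose the right page without fetching everything.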

How long it should be

  • No hard size limit; prefer a curated “index” over indiscriminate dumps.
  • If you need full content, separate it explicitly (llms-full.txt or section files) and consider RAG for large corpora.

Limits, safety and considerations

  • Not a “hard” control mechanism: systems may use or ignore it.
  • Mixed evidence of AI bots fetching it: useful, but don’t expect automatic discovery/traffic gains.
  • Governance: avoid non-public information; ensure stability/versioning of linked pages for consistency over time.

Related resources

  • See “Specifications” for structure, syntax and semantics.
  • See “Tools” for CLIs, CMS integrations, and generators/crawlers.