What is llms.txt? The new standard AI search is already reading

AI crawlers are already visiting your website, reading whatever they find, and deciding what your brand means to them. llms.txt is a small plain-text file that sits at the root of your domain and tells those crawlers what you actually want them to understand.

If robots.txt is a no-entry sign, llms.txt is a guided tour: you are not blocking the AI, you are briefing it before it draws its own conclusions.

What llms.txt actually is

llms.txt is a proposed open standard, introduced by Answer.AI and Jeremy Howard in 2024, that gives large language models (LLMs, the AI systems behind tools like ChatGPT, Claude, and Perplexity) a structured, human-readable summary of what a website contains and how it should be understood.

Think of it as a briefing document you leave on the front desk before an important visitor arrives. The visitor, in this case an AI crawler, can technically wander through every room on their own. But if you hand them a one-page summary of what matters and what does not, the tour goes better for everyone.

The file lives at yourdomain.com/llms.txt and is written in plain Markdown. It typically includes a short description of the site, links to key pages, and optional context about what the site does and who it serves. The llms.txt specification is publicly available and still evolving, which is worth keeping in mind before you treat it as a fixed law.

It is not a ranking signal in the traditional SEO sense. It does not tell Google where to put you in the blue-link results. What it does is give AI systems a cleaner, more accurate picture of your content before they summarise, cite, or ignore you entirely.

How llms.txt differs from robots.txt

Robots.txt, the file that has existed since 1994, tells crawlers which parts of your site they are allowed to access. It is a permission layer. llms.txt is something different: it is a context layer.

Robots.txt says “you may not enter room 3.” llms.txt says “here is what this building is for, here are the most important rooms, and here is a short explanation of why they matter.” One controls access, the other shapes understanding.
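The permission side of that contrast can be shown in a few lines. The sketch below uses Python's standard `urllib.robotparser` to evaluate a made-up robots.txt; the rules, the `example.com` URLs, and the `is_allowed` helper are illustrative assumptions, not taken from any real site. It only answers "may this crawler fetch this URL?" and has nothing to say about what the content means, which is the gap llms.txt is meant to fill.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: GPTBot is kept out of /private/,
# every other crawler may fetch everything.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.com/private/report"),
        ("GPTBot", "https://example.com/llms.txt"),
        ("SomeOtherBot", "https://example.com/private/report"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Note that this reflects what a well-behaved crawler would conclude from the file. It does not enforce anything on a crawler that ignores robots.txt.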

This distinction matters because AI crawlers do not always behave like traditional search crawlers. Some respect robots.txt directives. Some do not, or interpret them loosely. And even the ones that do respect access rules still need to make sense of what they find once they are inside. llms.txt addresses the second problem, not the first.

There is also a practical difference in who writes them. Robots.txt is usually a developer task, written once and rarely revisited. llms.txt is closer to a content strategy task: it requires someone to think about what the site is actually trying to communicate, which pages carry the most weight, and how an AI should frame the brand in a summary.

Decision box

Best if: your site has substantial original content, a clear service or product focus, and you want AI systems to cite you accurately rather than summarise you loosely.
Not ideal if: your site is a single landing page with minimal content depth, where an llms.txt file would add almost nothing an AI could not already infer.
Likely overkill when: you are still building the core content and have not yet defined what your site is actually about. Fix the content first.

[Image: Two colleagues at a Zwolle wooden desk comparing robots.txt and llms.txt side by side on two screens]

Which AI systems read llms.txt

Adoption is real but uneven, and any straight answer should lead with that.

Perplexity has publicly stated support for the llms.txt standard. Several AI-powered search tools and retrieval systems have begun incorporating it into their crawling logic. Anthropic’s ClaudeBot and OpenAI’s GPTBot both crawl the open web independently, and while neither company has made a formal statement about llms.txt specifically, the file can be fetched and read by any crawler that requests it from the root of your domain.

The more useful framing is this: llms.txt does not need universal adoption to be worth having. If even one major AI system uses it to understand your site more accurately, that is a better outcome than leaving every AI to guess. The cost of adding the file is low. The cost of being misrepresented in an AI-generated summary is harder to quantify but very real.

As more AI systems adopt it, the file becomes a competitive signal. Sites that have invested in a well-structured llms.txt will have a cleaner presence in AI-generated answers. Sites that have not will be summarised based on whatever the crawler happened to find on its last visit, which may or may not reflect what the site actually does well.

What goes inside an llms.txt file

The structure is intentionally simple, which is either reassuring or suspicious depending on how much you have been burned by “simple” standards before.

A basic llms.txt file contains a title, a short description of the site or organisation, and a list of links to the most important pages with brief annotations explaining what each one covers. Optional sections can include information about the site’s purpose, its audience, and any content that should be treated with particular weight.

The file is written in Markdown, which means no special tooling is required. A text editor and a clear idea of what your site is for are the only prerequisites. The challenge is not technical. The challenge is the same one that makes writing a good meta description hard: you have to decide what actually matters, in plain language, without hedging.

One pattern that works well in practice is to treat the llms.txt file the way you would treat a briefing document for a new team member who needs to understand your site in ten minutes. What are the three things they must know? Which pages would you send them to first? What would you not want them to misunderstand? Answer those questions and you have a solid llms.txt draft.

Why llms.txt matters for AI search visibility

The shift toward AI-generated answers in search is not a future scenario. Google AI Overviews, Perplexity, ChatGPT search, and Bing Copilot are already serving summarised answers to millions of queries daily. In that environment, being cited accurately is more valuable than ranking in position four.

The problem is that AI systems summarise based on what they can parse, not necessarily what you intended. A site with strong content but poor structural clarity can be summarised in ways that miss the point entirely. A competitor with a cleaner content architecture and an llms.txt file may get cited in your place, not because their content is better, but because it was easier for the AI to understand.

This is where AI search visibility becomes a practical concern rather than a theoretical one. The sites that show up in AI-generated answers are not always the most authoritative. They are often the most legible to the model doing the summarising.

llms.txt is one part of a broader approach to making your site legible to AI systems. It works alongside clear page structure, well-written headings, and content that answers specific questions directly. None of these elements is a magic switch. Together, they reduce the gap between what your site says and what an AI reports that it says.

[Image: Person at a Zwolle desk comparing an AI search summary with the llms.txt file on screen]

How to add llms.txt to your site today

The technical barrier is genuinely low, which is one of the few times that sentence is actually true rather than a way of glossing over something annoying.

Create a plain text file named llms.txt. Write it in Markdown. Start with a level-one heading that names your site or organisation. Add a short paragraph describing what the site does and who it is for. Then list your most important pages as Markdown links with a one-sentence annotation for each. Upload the file to the root of your domain so it is accessible at yourdomain.com/llms.txt. That is the complete process.
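The steps above can be sketched as a small generator script. This is a minimal illustration, not a reference implementation: the `Page` class, the `render_llms_txt` function, the site name, and the `example.com` URLs are all hypothetical. The summary is written as a Markdown blockquote and the links are grouped under a level-two heading, following the format in the published proposal as commonly described; treat both details as assumptions and check the current specification before relying on them.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One entry in the link list: a title, an absolute URL, a one-sentence note."""
    title: str
    url: str
    note: str


def render_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Build llms.txt content: H1 title, short summary, then annotated links.

    sections maps a heading (str) to a list of Page objects.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page.title}]({page.url}): {page.note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    content = render_llms_txt(
        "Example Studio",  # placeholder site name
        "A design and development studio for small businesses.",
        {
            "Key pages": [
                Page("Services", "https://example.com/services",
                     "What the studio offers and for whom."),
                Page("Case studies", "https://example.com/work",
                     "Selected client projects with outcomes."),
            ]
        },
    )
    # Write locally, then upload the file to the root of the domain.
    with open("llms.txt", "w", encoding="utf-8") as handle:
        handle.write(content)
    print(content)
```

Keeping the page list in code or a small data file, rather than editing the output by hand, makes it easier to regenerate the file whenever pages are added or retired.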

For WordPress sites, you can place the file in the root directory via FTP or a file manager, or use a plugin that handles static file placement. For other CMS platforms, the approach is the same: the file needs to live at the root, not in a subdirectory.

The harder part is keeping it current. An llms.txt file that describes a version of your site from eighteen months ago is worse than no file at all, because it actively misleads the AI about what you currently offer. Treat it like a living document, not a one-time task.

Working with an AI SEO agency can help if the content strategy side of this is where things stall. Deciding which pages to surface, how to frame the site’s purpose, and how to keep the file aligned with content changes is the part that takes judgment, not just technical execution.

What to monitor monthly

Check that yourdomain.com/llms.txt is accessible and returns a 200 status, not a redirect or 404.
Review whether the pages listed in the file still exist and still represent your best content.
Search for your brand name in Perplexity and ChatGPT search to see how you are being described in AI-generated answers.
Note any gaps between what your llms.txt says about your site and how AI tools are actually summarising it.
Update the file whenever you publish significant new content or retire old pages that were listed.
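The first two checks in that list can be partly automated. The sketch below is one possible approach using only the Python standard library; the function names (`extract_links`, `fetch_status`, `audit`), the user-agent string, and the regular expression are assumptions made for illustration. It reports any URL that does not answer with a plain 200, treating redirects as problems rather than following them. It cannot judge whether a page still represents your best content; that part stays manual.

```python
import re
import urllib.error
import urllib.request
from typing import List, Optional, Tuple

# Matches Markdown links of the form [title](http(s)://url).
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(llms_txt: str) -> List[Tuple[str, str]]:
    """Return (title, url) pairs for every Markdown link in the file text."""
    return [(m.group(1), m.group(2)) for m in LINK_PATTERN.finditer(llms_txt)]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so a 301/302 is reported as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the request failed."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(
        url, headers={"User-Agent": "llms-txt-check/0.1"}  # hypothetical UA
    )
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # includes 3xx, 404, 5xx
    except urllib.error.URLError:
        return None  # DNS failure, timeout, refused connection


def audit(base_url: str) -> List[str]:
    """Check /llms.txt and every page it links to; return a list of problems."""
    llms_url = base_url.rstrip("/") + "/llms.txt"
    status = fetch_status(llms_url)
    if status != 200:
        return [f"{llms_url} returned {status}"]
    with urllib.request.urlopen(llms_url, timeout=10) as response:
        text = response.read().decode("utf-8", errors="replace")
    problems = []
    for title, url in extract_links(text):
        code = fetch_status(url)
        if code != 200:
            problems.append(f"'{title}' ({url}) returned {code}")
    return problems
```

Calling `audit("https://yourdomain.com")` once a month, for example from a scheduled job, returns an empty list when everything resolves cleanly and a list of human-readable problems otherwise.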

[Image: Developer at a Zwolle desk checking the llms.txt file while a browser tab shows an AI search result]

llms.txt is a plain-text file placed at the root of a website that gives AI crawlers, including those used by Perplexity, ChatGPT search, and Claude, a structured summary of what the site contains and how it should be understood. Proposed by Answer.AI and Jeremy Howard in 2024, it differs from robots.txt by shaping AI comprehension rather than controlling access. As AI-generated answers replace traditional search results for a growing share of queries, llms.txt is becoming a practical tool for accurate AI citation. Studio Ubique works with clients on AI search visibility as part of broader content and SEO strategy.

FAQs

Is llms.txt an official standard?

It is a proposed open standard introduced by Answer.AI in 2024, not an official W3C or IETF specification. Adoption is growing but voluntary, and the specification is still being refined, so treat it as an emerging best practice rather than a fixed protocol.

Does llms.txt affect my Google rankings?

No, llms.txt has no direct effect on traditional Google search rankings. It is aimed at AI systems that generate summarised answers, not at the crawlers that determine blue-link position. Your robots.txt and sitemap.xml remain the relevant files for classic SEO purposes.

Can I use llms.txt to block AI crawlers?

No. llms.txt is a guidance file, not an access control file. If you want to restrict AI crawlers, you need to use robots.txt directives or, for more aggressive bots, server-level blocking. llms.txt tells AI systems what to understand, not whether they are allowed in.

How long should an llms.txt file be?

There is no enforced length limit, but shorter is generally better. A file that covers your five to ten most important pages with clear annotations is more useful than an exhaustive list of every URL on the site. The goal is to help an AI understand your site quickly, not to replicate your sitemap.

Do small business websites need llms.txt?

If the site has meaningful content and you care about being cited accurately in AI-generated answers, yes. The effort is small and the file is easy to maintain. If the site is a single-page brochure with no real content depth, the practical benefit is minimal.

Lennart de Ridder

Let's talk

If AI systems are already summarising your competitors and you are not sure what they are saying about you, that is a reasonable thing to look at before it becomes a problem.

Schedule a free 30-minute discovery call: Book a call
