I was ready to call llms.txt snake oil — then Google put it in Lighthouse

TL;DR
  • llms.txt does nothing for AI Overviews or citations — Google Search said so, and server logs show agents almost never fetch it.
  • The real (future) value is as a map for browser agents, not a ranking lever — a bet the tooling layer is placing, not a current return.

Because of my role, I receive a lot of audits, technical and otherwise. In almost all of them, agencies pitch llms.txt with a particular kind of confidence. One file, they say. Add it to your root and the models will finally see you. Citations up, visibility fixed, inference problem solved. A search-visibility panacea, packed into one sentence.

Needless to say — in my experience — the magic recipe hasn't been found yet. You might already know that, so why bother with this write-up?

Because as these llms.txt conversations kept evolving, the story turned out to be more interesting than either the hype or my first dismissal of it. That's when I had my 'a-ha' moment and started doubting my own conclusions. But please, stay with me — this is a long one, and there's more to unpack.

What the file actually is

llms.txt is a proposal. Jeremy Howard of fast.ai published it on September 3, 2024 — a Markdown file at the root of your site that hands a language model a curated, concise summary of what the site is and where its important content lives. The format is defined: an H1 with the site name, a blockquote summary, optional sections, and H2-delimited lists of links. Howard was explicit that it's meant for inference — the moment a user asks for help — not for training.
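To make that layout concrete, here is a minimal sketch that renders the pieces listed above (H1, blockquote summary, H2-delimited link lists). The type names, the example site, and the optional ": note" suffix after each link are illustrative assumptions, not text from the proposal.

```typescript
// Hypothetical types for illustration; the proposal defines a file format, not these names.
interface LinkEntry {
  title: string;
  url: string;
  note?: string; // assumption: an optional short description after the link
}

interface LlmsTxtSection {
  heading: string; // rendered as an H2
  links: LinkEntry[];
}

interface LlmsTxtDoc {
  siteName: string; // rendered as the H1
  summary: string; // rendered as a blockquote
  sections: LlmsTxtSection[];
}

// Render the structure: H1, blockquote summary, then one H2 link list per section.
function renderLlmsTxt(doc: LlmsTxtDoc): string {
  const lines: string[] = [`# ${doc.siteName}`, "", `> ${doc.summary}`];
  for (const section of doc.sections) {
    lines.push("", `## ${section.heading}`, "");
    for (const link of section.links) {
      const suffix = link.note ? `: ${link.note}` : "";
      lines.push(`- [${link.title}](${link.url})${suffix}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Made-up site used only to show the output shape.
const exampleFile = renderLlmsTxt({
  siteName: "Example Shop",
  summary: "An online store for refurbished cameras.",
  sections: [
    {
      heading: "Docs",
      links: [
        { title: "Returns policy", url: "https://example.com/returns.md", note: "30-day window" },
      ],
    },
  ],
});
console.log(exampleFile);
```

The result is plain Markdown that a human can read as easily as a model can, which is most of the format's appeal.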

No standards body has adopted it. Not W3C, not IETF. It's a community convention with a GitHub repo and a Discord, and that's the whole of its authority.

That's the part most agencies skip when they propose the implementation. There is also no evidence that it works as a ranking signal, whether for classic search, for GEO (generative engine optimization), or for agents. It is a map. Full stop.

The debunk, because it deserves one

If you're selling llms.txt as the way to get mentioned in AI answers, the evidence is not on your side.

Google's own search representatives said no, twice. First informally, over the course of 2025 — John Mueller compared the file to the keywords meta tag, the discredited relic every SEO over thirty remembers stuffing in the early 2000s before search engines learned to ignore it entirely. That comparison is not a compliment. Then, on May 15, 2026, Google made it official: its Search Central guide on "optimizing for generative AI features" explicitly states that site owners don't need llms.txt — or any other AI-specific tactic — to appear in AI Overviews or AI Mode.

The data backs the dismissal, and here's the part that should give even the agent-optimists pause. Over the past months several people have done the unglamorous work of actually reading server logs to see who fetches the file. The answer is almost nobody, and crucially, not the agents it was built for. Dries Buytaert, the founder of Drupal, looked across an entire hosting fleet and found roughly 5,000 llms.txt requests out of 400 million (about 0.001%), nearly all of them from SEO tools rather than AI systems. A 90-day controlled experiment by OtterlyAI saw 84 requests to the file out of more than 62,000 AI-bot visits: roughly a tenth of a percent. A separate 191-day study across about 900 domains logged 1,227 requests in total, roughly six a day, and reported not a single genuine AI bot among them. My own numbers look much the same: across my personal sites and the sites I manage for work, requests for llms.txt are close to zero.
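For anyone who wants to check the arithmetic, a small sketch follows. The inputs are the rounded figures quoted above, so the outputs are approximations, and since the OtterlyAI total is "more than 62,000", that share is an upper bound.

```typescript
// Share of requests that hit llms.txt, expressed as a percentage.
function sharePercent(hits: number, total: number): number {
  return (hits / total) * 100;
}

// Rounded figures taken from the studies quoted in the text.
const fleetShare = sharePercent(5000, 400000000); // hosting-fleet logs
const otterlyShare = sharePercent(84, 62000); // 90-day experiment
const requestsPerDay = 1227 / 191; // 191-day study

console.log(fleetShare.toFixed(5) + "%"); // 0.00125%
console.log(otterlyShare.toFixed(2) + "%"); // 0.14%
console.log(requestsPerDay.toFixed(1) + " per day"); // 6.4 per day
```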

Adoption tells a different story. In March 2026, after a year and a half of industry conversation, a study found that 7.4% of Fortune 500 companies had published the file. Some agencies clearly managed to sell it regardless.

So when Semrush starts flagging a missing llms.txt as a site issue, and an agency turns that flag into a deliverable, what you're being sold is a fix built around a file the engines you care about aren't even reading. As an AI Overviews lever, it does nothing measurable. That part of the hype is dead on arrival.

I was ready to write exactly that post and move on.

The part that held me up

I'll be honest: I never believed in llms.txt. Not as a ranking trick, not as a citation lever, not as the thing an agency could bolt on to fix AI visibility.

What held me up was one narrow thread, and only one. Howard wrote from the start that the file is for inference — helping a model use a site at the moment someone asks. I'd been dismissing that through a search lens, because through a search lens it's useless. But "inference-time help" was never a search claim, and the question I'd skipped was which thing, exactly, is supposed to read this file at inference, and when.

In September 2024 that had no concrete answer. By March 2026 it did — and the answer wasn't a search engine. It was an agent. That's the only reason it kept my attention: not because I think llms.txt matters on its own, but because it might matter next to WebMCP.

This clicked when I looked at what Google has been shipping in Chrome. WebMCP — a proposed open standard Google is building with Microsoft through the W3C — lets a site expose structured JavaScript functions and annotated HTML forms as tools a browser-based agent can call directly. Instead of an agent squinting at screenshots and simulating clicks, the site says: here are my actions, call them by name. It reached an early preview in Chrome Canary earlier this year, and — during the recent Google I/O — the company confirmed a public origin trial in Chrome 149.

The Lighthouse move — where Google started contradicting itself

This is where it gets curious, because the next twist came from inside Google — one half of the company seeming to argue with the other.

In early May 2026, Google's Chrome team added an llms.txt check to Lighthouse, the auditing tool bundled into Chrome's developer tooling. Note the team: Chrome, not Search. The doc carries a last-updated date of May 5; the audit shipped in Lighthouse 13.3 a couple of days later and moved into the default config. Then, on May 15 — the same fortnight — Google's Search team published its guide telling site owners the file does nothing for AI Overviews and they don't need it. So within roughly ten days, one Google product built tooling that checks for the file, and another Google product told you to ignore it.

That's the contradiction worth writing about, and it tripped up half the people reacting on LinkedIn (myself included) because the two messages look irreconcilable until you notice they're aimed at different things. Chrome's Lighthouse doc files the check under a new category called "Agentic Browsing", next to its WebMCP audits, and opens by describing the file as one designed "for LLMs and AI agents" — adding that without it, agents "may spend more time crawling the site to understand its high-level structure and primary content." Search cares about ranking. Chrome cares about whether an agent can find its way around. Same file, two entirely different jobs, two Google teams talking past each other in public.

Now — here's the detail that matters. Despite the placement, Lighthouse does not treat a missing llms.txt as a failure. Read the audit's own description: it flags a page only if the server throws an error trying to fetch the file. If the file simply isn't there — a plain 404 — the audit is marked Not Applicable, because, in Google's own words, "providing the file is optional at the moment." So the audit doesn't catch the absence of the file. It catches a broken attempt at serving one. Even the team that built the check refused to make the file mandatory. Anyone telling you Google now flags llms.txt absence as critical hasn't read the one-page doc.
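The decision rule described above can be sketched in a few lines. This is not Lighthouse's code: the function name and outcome labels are hypothetical, and the handling of anything other than a 404 or a server error is an assumption.

```typescript
type AuditOutcome = "pass" | "notApplicable" | "fail";

// Hypothetical helper mirroring the documented behaviour, not the real audit.
function classifyLlmsTxtFetch(httpStatus: number): AuditOutcome {
  // A plain 404 means the file is absent; it is optional, so nothing is flagged.
  if (httpStatus === 404) return "notApplicable";
  // A server error while serving the file is the case the audit flags.
  if (httpStatus >= 500) return "fail";
  // Assumption: a successfully served file passes.
  if (httpStatus >= 200 && httpStatus < 300) return "pass";
  // Assumption: any other response is treated like absence.
  return "notApplicable";
}

for (const code of [200, 404, 503]) {
  console.log(code, classifyLlmsTxtFetch(code));
}
```

The asymmetry is the point: absence is neutral, while a broken attempt at serving the file is what gets reported.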

So the Lighthouse move isn't Google reversing itself on whether the file matters for search. It's Google's browser org quietly staking out a different position: that this file belongs to the agent layer — optional today, worth auditing for the day it isn't.

So what's next?

Don't add llms.txt to chase AI mentions or citations. But don't write it off as completely useless either, even though other emerging proposals are likely to do the job better (that's another story).

My honest interpretation of the facts: the file is likely to be repositioned soon.

And even if today's agents demonstrably don't fetch it, that doesn't mean they never will. If the file's scope changes so that it describes the context of exactly those pages that matter to WebMCP, its role shifts from discoverability to orientation, and that is the moment this resource becomes valuable.

So, to recap: Google added the Lighthouse audit. Anthropic recommends the file for agent-facing work, OpenAI references it in its Agents SDK and commerce protocol, and AI-assisted IDEs like Cursor consume it when present. That's the tooling layer placing a bet. It is a bet, not a current return — and anyone who tells you otherwise is, again, misreading the press releases instead of the server logs.

It's cheap to add now if you want, though given how agents behave today it's unlikely to matter anytime soon. If you do it, treat that half-day's work as optionality, not as a traffic lever you can expect to pay out this quarter.

WebMCP & llms.txt: is it a marriage?

I do see WebMCP becoming the norm rather than the exception. And assuming the scope change above, llms.txt could still earn a place in the web ecosystem.

Let's get concrete about how a site actually exposes its tools. A site registers its capabilities with the browser through a new API, navigator.modelContext. Two ways to do it:

  • The declarative way: annotate your HTML and the browser turns those annotations into something callable. A simple no-JavaScript path, mostly suitable for forms.
  • The imperative way: call registerTool() from JavaScript, passing a contract that specifies a name, a description, a schema describing the parameters, and a function to run.

That second path is the powerful one, because a tool is not necessarily a form to interact with. It can be any logic embedded in a page: read the cart, pull authenticated data, kick off a multi-step workflow. Forms are just the easy case. The real surface is "anything the site's own code can do", named and exposed so an agent calls it on purpose instead of guessing.
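A rough sketch of the imperative path, under stated assumptions: the field names inputSchema and execute are my reading of the draft and may change, and FakeModelContext is a hypothetical in-memory stand-in for navigator.modelContext so the example runs outside a browser. Real tools would probably be asynchronous; this one is kept synchronous for brevity.

```typescript
// Assumed shape of the contract: name, description, parameter schema, function to run.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: { type: "object"; properties: Record<string, { type: string }> };
  execute: (args: Record<string, unknown>) => unknown;
}

interface ModelContextLike {
  registerTool(tool: ToolDefinition): void;
}

// Hypothetical stand-in for navigator.modelContext, for illustration only.
class FakeModelContext implements ModelContextLike {
  private tools: { [toolName: string]: ToolDefinition } = {};

  registerTool(tool: ToolDefinition): void {
    if (this.tools[tool.name]) throw new Error(`duplicate tool: ${tool.name}`);
    this.tools[tool.name] = tool;
  }

  // Simulates an agent calling a registered tool by name.
  call(toolName: string, args: Record<string, unknown>): unknown {
    const tool = this.tools[toolName];
    if (!tool) throw new Error(`unknown tool: ${toolName}`);
    return tool.execute(args);
  }
}

// Made-up page state that the tool exposes.
const cartItems = [
  { sku: "cam-01", qty: 1 },
  { sku: "lens-50", qty: 2 },
];

// A tool that is plain page logic rather than a form: read the cart.
function registerCartTool(context: ModelContextLike): void {
  context.registerTool({
    name: "read_cart",
    description: "Return the items currently in the shopping cart.",
    inputSchema: { type: "object", properties: {} },
    execute: () => ({ items: cartItems }),
  });
}

const pageContext = new FakeModelContext();
registerCartTool(pageContext);
console.log(JSON.stringify(pageContext.call("read_cart", {})));
```

In a browser that ships the API, registerCartTool would presumably receive navigator.modelContext instead of the stand-in.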

Here's the part that matters. Those capabilities aren't advertised outward. Tools are registered under navigator.modelContext on a page that's already loaded, and the agent — through its extension — has to query that object to find them. It's pull, not push. And critically, the agent can only do that once it's already on the page. There's no manifest, no file, nothing that tells the outside world "this site has tools" before arrival. This is the hole in the spec llms.txt could help fill.

So WebMCP gives an agent hands. But it doesn't give it a map. The interaction is client-side, happening in the browser — and that's exactly where the gap opens. How does a local agent know the shape of your site? Where the docs are, what the product even is, which of those registered tools matter for the task at hand? WebMCP answers "what can I do on this page." It doesn't answer "where can I find this on the site."

That's the slot llms.txt could actually fit: orientation. The map you hand the agent before it reaches for the tools.
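What that orientation step might look like on the agent side, as a sketch only: nothing in the source says agents do this today. The parser below is a hypothetical helper that collects the links under each H2 heading of an llms.txt body, and the sample file and URLs are made up.

```typescript
interface SiteLink {
  title: string;
  url: string;
}

interface SiteMapBySection {
  [section: string]: SiteLink[];
}

// Collect the Markdown links listed under each H2 heading.
function parseLlmsTxt(body: string): SiteMapBySection {
  const sections: SiteMapBySection = {};
  let current: SiteLink[] | null = null;
  for (const rawLine of body.split("\n")) {
    const line = rawLine.trim();
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      current = [];
      sections[heading[1]] = current;
      continue;
    }
    // Matches "- [title](url)" and ignores any trailing note.
    const link = /^-\s+\[([^\]]+)\]\(([^)\s]+)\)/.exec(line);
    if (link && current) {
      current.push({ title: link[1], url: link[2] });
    }
  }
  return sections;
}

// Made-up file used only for illustration.
const sampleFile = [
  "# Example Shop",
  "",
  "> An online store for refurbished cameras.",
  "",
  "## Docs",
  "",
  "- [Returns policy](https://example.com/returns.md): 30-day window",
  "",
  "## Checkout",
  "",
  "- [Cart](https://example.com/cart)",
].join("\n");

const orientation = parseLlmsTxt(sampleFile);
console.log(Object.keys(orientation)); // [ 'Docs', 'Checkout' ]
console.log(orientation["Checkout"][0].url); // https://example.com/cart
```

With a map like this, an agent could pick the page relevant to its task first and only then ask that page which tools it exposes.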