What is this?
robots.txt is a plain-text file at the root of your website (/robots.txt) that tells search crawlers which paths they may crawl and which they may not.
The format is standardised in RFC 9309 (Robots Exclusion Protocol).
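To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might evaluate such a file, using Python's standard urllib.robotparser module. The rules, the /private/ path, the example.com host and the ExampleBot user agent are all assumptions made up for illustration. Note that Python's parser applies the first matching rule, which can differ from the longest-match rule in RFC 9309 when Allow and Disallow lines overlap.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/", "/blog/post-1", "/private/report"):
        url = f"https://example.com{path}"
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

In this sketch only the URL under /private/ is reported as not allowed; the other two paths match no rule and are therefore permitted.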
Complementing it, ai.txt and llms.txt are aimed specifically at AI crawlers
(ClaudeBot, GPTBot, Google-Extended, PerplexityBot). With them you signal whether your content
may be used as training material for language models. The signal is not legally binding yet,
but serious providers have respected it so far.
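The text above does not spell out the syntax of ai.txt or llms.txt, so the following sketch does not try to guess it. Instead it shows a related mechanism as an assumption: the four crawler names listed above are also published by their operators as robots.txt user-agent tokens, so a per-crawler group in robots.txt can express the same opt-out. The helper name build_ai_opt_out and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the text above.
AI_TOKENS = ("ClaudeBot", "GPTBot", "Google-Extended", "PerplexityBot")


def build_ai_opt_out(tokens=AI_TOKENS) -> str:
    """Render robots.txt rules that block the given tokens site-wide
    while leaving every other crawler unrestricted."""
    ai_group = [f"User-agent: {token}" for token in tokens]
    ai_group.append("Disallow: /")
    # An empty Disallow value means "nothing is disallowed".
    default_group = ["User-agent: *", "Disallow:"]
    return "\n".join(ai_group + [""] + default_group) + "\n"


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate the rules the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_ai_opt_out()
    print(rules)
    for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
        print(agent, may_fetch(rules, agent, "https://example.com/article"))
```

As with the other files, this only works for crawlers that choose to honor the rules; it is a signal, not an enforcement mechanism.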
When do I need it?
robots.txt is a must for any production website. Without it, search engines crawl everything they can reach, including internal paths, admin pages and staging environments. A few Disallow lines save crawl budget and reduce the risk of accidental indexing.
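The difference between having no rules and having a few Disallow lines can be sketched as follows. The prefixes /internal/, /admin/ and /staging/ are assumed stand-ins for the categories named above, and build_robots_txt, may_fetch and ExampleBot are hypothetical names. An empty rule set is used here to approximate a missing file.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_prefixes: list[str]) -> str:
    """Render one catch-all group with a Disallow line per path prefix."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}" for prefix in disallowed_prefixes]
    return "\n".join(lines) + "\n"


def may_fetch(robots_txt: str, path: str, user_agent: str = "ExampleBot") -> bool:
    """Check a path on a placeholder host against the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, f"https://example.com{path}")


# Hypothetical internal areas; adjust to the real site layout.
INTERNAL_PREFIXES = ["/internal/", "/admin/", "/staging/"]

if __name__ == "__main__":
    no_rules = ""  # stands in for a site without robots.txt
    with_rules = build_robots_txt(INTERNAL_PREFIXES)
    for path in ("/admin/login", "/staging/index.html", "/products/"):
        print(path, may_fetch(no_rules, path), may_fetch(with_rules, path))
```

Keep in mind that these rules only steer compliant crawlers; they are not access control, so admin and staging areas still need authentication.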
ai.txt / llms.txt are recommended as soon as you publish content with IP value (texts, code, data) that you do not want used for AI training. In practice they are effective with the major providers; against bad actors, only legal remedies help.