AI Content Permissions & Policies: Control AI Use, Training & Citation

Why this matters

AI systems don't just read your content: they can summarize it, train on it, or reuse it without attribution. Legible gives you granular control over these behaviors and enforces your choices at every layer of delivery, including HTTP headers, Markdown frontmatter, llms.txt, and llms-full.txt.

What you can control

Legible lets you set four AI permission types independently of one another. Each can be set at the site level, per content type, or per individual page:

AI Use: Can AI systems use this content as input when generating responses? Set to 'allowed' or 'not allowed.'
AI Training: Can this content be used to train or fine-tune AI models? Most publishers set this to 'not allowed.'
AI Summaries: Can AI systems summarize this content in their responses?
AI Citation: Should AI systems attribute and link back to the original when referencing this content? Options: required, optional, or not applicable.

How permissions are enforced

Setting a policy in the Legible dashboard isn't just a label — it's enforced at every layer of delivery:

Content-Signal HTTP header: Every Markdown response includes a Content-Signal header declaring your permissions (e.g., 'ai-train=no, search=yes, ai-input=yes').
YAML frontmatter: Each page's Markdown includes machine-readable policy fields in the frontmatter that AI crawlers can parse.
llms-full.txt: Pages with AI Use set to 'not allowed' are excluded from the full knowledge model; only their title and URL appear as a reference.
robots.txt: Legible can generate or merge crawler-specific rules that block individual AI crawlers from accessing your content entirely.
X-Robots-Tag headers: Per-bot indexing directives are emitted on HTML responses (e.g., 'X-Robots-Tag: GPTBot: noindex') to control page-level indexing for individual crawlers.
Meta robots injection: Optionally, Legible can inject `<meta name="robots">` tags directly into your HTML for compatibility with on-page SEO audit tools.
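
To make the signals listed above concrete, here is a minimal Python sketch of how such values could be assembled. It is not Legible's implementation: the function names and arguments are hypothetical, and the mapping of AI Training to `ai-train` and AI Use to `ai-input` is an assumption based on the header example above. The `search` flag is passed in separately because this article does not say which setting controls it.

```python
def content_signal_value(ai_training: bool, search: bool, ai_use: bool) -> str:
    """Build a Content-Signal header value, in the order shown in the example above.

    Assumption: AI Training maps to ai-train and AI Use maps to ai-input.
    """
    def yes_no(flag: bool) -> str:
        return "yes" if flag else "no"

    return ", ".join([
        f"ai-train={yes_no(ai_training)}",
        f"search={yes_no(search)}",
        f"ai-input={yes_no(ai_use)}",
    ])


def x_robots_tag(bot: str, directive: str) -> str:
    """Format a per-bot X-Robots-Tag header line, e.g. for GPTBot and noindex."""
    return f"X-Robots-Tag: {bot}: {directive}"


def robots_txt_rules(blocked_bots: list[str]) -> str:
    """Render one robots.txt group per crawler that should be blocked site-wide."""
    if not blocked_bots:
        return ""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in blocked_bots]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print("Content-Signal:", content_signal_value(ai_training=False, search=True, ai_use=True))
    print(x_robots_tag("GPTBot", "noindex"))
    print(robots_txt_rules(["GPTBot"]), end="")
```

Keeping each signal in its own small function mirrors the layered delivery described above: the same stored policy can feed several outputs without one format depending on another.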

Setting permissions in the dashboard

Go to your site's Settings to set site-wide defaults. These apply to all content unless overridden. To set different permissions for specific content types (for example, allowing AI summaries of blog posts but not legal pages), use the Content Library and filter by content type.

Individual pages can override both site and content-type settings. This is useful for high-value pages that need tighter controls.
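
The override order described above (page, then content type, then site default) can be sketched as a simple fallback lookup. This is an illustrative Python model, not Legible's code: the `Policy` class, its field names, and the `resolve` function are hypothetical, and the 'allowed' / 'not allowed' values for AI Training and AI Summaries are assumed to match those documented for AI Use.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Policy:
    # None means "not set at this level; inherit from the level above".
    ai_use: Optional[str] = None
    ai_training: Optional[str] = None
    ai_summaries: Optional[str] = None
    ai_citation: Optional[str] = None


def resolve(site: Policy, content_type: Policy, page: Policy) -> Policy:
    """Return the effective policy: page beats content type, which beats site."""
    resolved = {}
    for field in fields(Policy):
        # Check the most specific level first and take the first value that is set.
        candidates = (getattr(level, field.name) for level in (page, content_type, site))
        resolved[field.name] = next((v for v in candidates if v is not None), None)
    return Policy(**resolved)


# Example: site-wide defaults, a stricter rule for legal pages, and one locked-down page.
site_defaults = Policy("allowed", "not allowed", "allowed", "required")
legal_pages = Policy(ai_summaries="not allowed")
one_page = Policy(ai_use="not allowed")

effective = resolve(site_defaults, legal_pages, one_page)
print(effective)
```

Because each of the four permissions is resolved on its own, a page that overrides only AI Use still inherits the remaining three from its content type or the site defaults.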

Content license

In addition to AI-specific permissions, you can declare a content license (e.g., All Rights Reserved, CC-BY-NC-4.0, MIT). This is included in the Markdown frontmatter and llms.txt so AI systems — and the humans reviewing AI outputs — know the legal terms.
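
As a rough illustration of how a license and permissions might sit together in a page's frontmatter, the Python sketch below renders a small YAML block. The key names (`title`, `license`, `ai_training`, and so on) are hypothetical; this article does not specify the actual field names Legible emits.

```python
import json


def render_frontmatter(title: str, license_id: str, permissions: dict[str, str]) -> str:
    """Render a YAML frontmatter block with a license and AI permission fields.

    Values are quoted with json.dumps, since a JSON string is also a valid
    YAML double-quoted scalar. Key names here are illustrative assumptions.
    """
    lines = ["---", f"title: {json.dumps(title)}", f"license: {json.dumps(license_id)}"]
    for key, value in permissions.items():
        lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_frontmatter(
        "Pricing",
        "CC-BY-NC-4.0",
        {"ai_training": "not allowed", "ai_citation": "required"},
    ), end="")
```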