AI Discovery

Optimize AI Crawling: Last-Modified & ETag Headers for Efficiency

Boost AI crawler efficiency with Legible's automatic Last-Modified and ETag headers. Ensure faster fetches and 304 responses, improving your site's AI visibility and SEO.

5 min read · Updated 2026-03-23

Why this matters

AI crawlers re-fetch content frequently, but many sites don't tell them whether anything has changed. Without freshness headers, every crawl is a full re-download, which wastes bandwidth and gives crawlers no signal for how recent your content is.

Legible now generates `Last-Modified` and `ETag` headers automatically on every Markdown response, using real content dates from your CMS where they are available. Crawlers can make conditional requests and get lightweight 304 responses when nothing has changed.

Why freshness matters for AI visibility

AI systems that generate answers from web content need to know whether their cached version is still current. Without freshness signals, a crawler has two options: re-download everything on every visit, or guess when content might have changed.

Neither is good. Re-downloading wastes bandwidth and slows crawling. Guessing leads to stale answers and missed updates. Freshness headers solve both problems by giving crawlers a reliable way to check for changes.

In Legible's GEO Readiness audit, missing freshness headers cost up to 3 points. Sites that serve both `Last-Modified` and `ETag` earn the full score, while sites with one or the other earn partial credit.

What Legible generates

Every Markdown response served by Legible now includes two freshness headers:

  • `Last-Modified`: Set to the content's actual publication or modification date. For CMS-sourced content, this comes from the article date or product modification timestamp. For cached content, it reflects when the cache was populated. For HTML-extracted content, it uses the extraction time.
  • `ETag`: A weak entity tag (`W/"..."`) computed from a fast hash of the Markdown content itself. The same content always produces the same ETag, and different content produces a different ETag except in the rare case of a hash collision. Because the tag is derived from the content alone, conditional requests work reliably without any persistent state.
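
As a rough illustration of the two headers described above, here is a minimal Python sketch. It is not Legible's implementation: the article does not name the hash, so `blake2b` with a short digest is an assumption standing in for "a fast hash", and the function names `weak_etag` and `last_modified` are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def weak_etag(markdown: str) -> str:
    """Return a weak ETag derived only from the Markdown body."""
    # Assumption: blake2b with an 8-byte digest stands in for the
    # unspecified "fast hash" mentioned in the article.
    digest = hashlib.blake2b(markdown.encode("utf-8"), digest_size=8).hexdigest()
    return f'W/"{digest}"'


def last_modified(content_date: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP-date, which is always GMT."""
    return format_datetime(content_date.astimezone(timezone.utc), usegmt=True)


body = "# My post\n\nHello, crawlers.\n"
headers = {
    "Content-Type": "text/markdown",
    "ETag": weak_etag(body),
    "Last-Modified": last_modified(
        datetime(2026, 3, 18, 10, 30, tzinfo=timezone.utc)
    ),
}
print(headers["Last-Modified"])  # Wed, 18 Mar 2026 10:30:00 GMT
```

Since the tag depends only on the body, any server instance can recompute it on demand, which is what makes the stateless comparison possible.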

Conditional requests and 304 responses

Once a crawler has fetched your content and stored the `ETag` or `Last-Modified` value, subsequent requests can include `If-None-Match` or `If-Modified-Since` headers. If the content hasn't changed, Legible responds with a `304 Not Modified` — no body, minimal bandwidth.

This is especially valuable for AI crawlers like GPTBot and ClaudeBot that may re-crawl popular pages frequently. A 304 response is a fraction of the size of a full Markdown document, reducing load on both your infrastructure and the crawler.

```http
# First request:
GET /blog/my-post HTTP/1.1
Accept: text/markdown

# Response:
HTTP/1.1 200 OK
Last-Modified: Wed, 18 Mar 2026 10:30:00 GMT
ETag: W/"a1b2c3d4"
Content-Type: text/markdown

# Subsequent request:
GET /blog/my-post HTTP/1.1
Accept: text/markdown
If-None-Match: W/"a1b2c3d4"

# Response (content unchanged):
HTTP/1.1 304 Not Modified
```
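
The server-side decision behind that exchange can be sketched in Python roughly as follows. This is an illustration of standard HTTP conditional-request handling, not Legible's code: the function name `is_not_modified` is hypothetical, header names are assumed to arrive in the exact casing shown, and the comma split assumes entity tags that contain no commas. Per the HTTP specification, `If-None-Match` takes precedence over `If-Modified-Since` when both are sent, and entity tags are compared weakly (the `W/` prefix is ignored).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _opaque(tag: str) -> str:
    """Strip the weak prefix so two tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers: dict, etag: str, modified_at: datetime) -> bool:
    """Decide whether a GET can be answered with 304 Not Modified.

    `modified_at` is assumed to be a timezone-aware datetime.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match wins; If-Modified-Since is ignored when it is present.
        if if_none_match.strip() == "*":
            return True
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        return _opaque(etag) in candidates

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # an unparseable date is ignored
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        return modified_at.replace(microsecond=0) <= since

    return False
```

When the function returns `True`, the response is a bodiless `304 Not Modified`; otherwise the full Markdown document is sent with a `200 OK`.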

Where content dates come from

Legible uses the most accurate timestamp available, depending on how the content was sourced:

  • CMS API (articles): The article's published or last-modified date from WordPress, Webflow, or Drupal.
  • CMS API (products): The product's `dateModified` field from WooCommerce.
  • KV cache: The `cachedAt` timestamp from when the content was last written to edge cache.
  • HTML extraction: The current time, since no authoritative date is available from the origin.
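
The fallback order in the list above could be expressed along these lines. Treat it as a sketch under stated assumptions: only `dateModified` and `cachedAt` are field names taken from this article, while the source labels, the `modified` and `published` keys, the preference for the modified date over the published one, and the function name `pick_last_modified` are all hypothetical. Timestamps are assumed to be ISO 8601 strings, with naive values treated as UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def pick_last_modified(
    source: str, record: dict, now: Optional[datetime] = None
) -> datetime:
    """Choose the most authoritative timestamp available for a response."""
    now = now or datetime.now(timezone.utc)

    if source == "cms_article":
        # Hypothetical keys; prefer the last-modified date when present.
        raw = record.get("modified") or record.get("published")
    elif source == "cms_product":
        raw = record.get("dateModified")
    elif source == "kv_cache":
        raw = record.get("cachedAt")
    else:
        # HTML extraction: no authoritative date is available.
        raw = None

    if raw is None:
        return now

    parsed = datetime.fromisoformat(raw)
    # Assumption: naive timestamps are interpreted as UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```

For example, `pick_last_modified("cms_product", {"dateModified": "2026-03-18T10:30:00+00:00"})` yields the product's own modification time, while an HTML-extracted page falls through to the current time.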

How this improves your GEO Readiness score

The GEO Readiness audit checks for freshness headers on your page's origin response. If Legible is serving your AI-readable content, both headers are present automatically — earning the full 3 points in the freshness category.

For origin HTML pages (not served through Legible), the audit still checks whether your server provides these headers. If your origin doesn't emit them, the audit will flag it as a finding with guidance on how to add them.
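
To see for yourself whether an origin emits these headers, a small Python check like the one below may help. It is not the audit's actual logic and it assigns no score; it only reports which of the two headers are absent. The function names are hypothetical, and some servers answer `HEAD` requests differently from `GET`, so treat the result as indicative.

```python
import urllib.request


def missing_freshness_headers(headers) -> list:
    """Return the freshness headers absent from a mapping of response headers."""
    present = {name.lower() for name in headers.keys()}
    return [h for h in ("Last-Modified", "ETag") if h.lower() not in present]


def check_url(url: str) -> list:
    """Send a HEAD request and report which freshness headers are missing."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return missing_freshness_headers(response.headers)
```

Calling `check_url` with one of your own page URLs returns an empty list when both headers are present, and otherwise names the ones to add at the origin.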