Markdown for Agents on SvelteKit + Cloudflare Workers
The AI crawl-and-summarize wave has landed. Google's Gemini, OpenAI's GPT, Anthropic's Claude, Perplexity — they're all hitting your site. If you serve them HTML, they waste tokens parsing tag soup. If you serve them markdown, they get clean context immediately.

Cloudflare published a guide to serving markdown for AI agents using Transform Rules and their AI gateway. Michael Wolson then wrote an excellent adaptation for the free plan, using Transform Rules to set headers that content-negotiate at the edge. I run SvelteKit on Cloudflare Workers. My situation is different — and simpler. Here's how I implemented it and why no Transform Rules were necessary.

Wolson's approach is clever: use a Cloudflare Transform Rule to inject a custom header based on the Accept header, then check that header in your application to decide the response format. This solves a real problem for static sites: CDN caching. Without differentiated cache keys, a cached HTML response gets served to a markdown-requesting bot, or vice versa.

Workers don't have this problem. Every request executes the Worker — there's no CDN cache layer sitting in front of dynamic routes on Pages Functions. The Worker sees the raw Accept header directly and can branch on it before any rendering happens. The implementation hooks into SvelteKit's server hooks — the middleware layer that runs before any route handler.
The key insight: all my content already lives in the API with raw markdown fields. Posts have `content` (markdown). Articles have `content` (markdown). Pages have `content` (markdown). No HTML-to-markdown conversion needed — I just skip the rendering step entirely.

Two signals trigger markdown responses:

- The `Accept: text/markdown` header (per the emerging convention)
- A `?format=md` query parameter (for easy browser testing)
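A minimal sketch of that detection logic — the helper name `wantsMarkdown` is mine, not from the post:

```typescript
// Hypothetical helper: checks the two signals described above.
export function wantsMarkdown(accept: string | null, url: URL): boolean {
  // Signal 1: the Accept: text/markdown header (emerging convention)
  if (accept !== null && accept.toLowerCase().includes("text/markdown")) {
    return true;
  }
  // Signal 2: ?format=md query parameter, for easy browser testing
  return url.searchParams.get("format") === "md";
}
```

Keeping this as a pure function of the header and URL makes it trivial to unit-test without standing up a Worker.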
The Hook
In `hooks.server.ts`, the markdown check runs after legacy redirects but before `resolve(event)` — meaning SvelteKit never renders any Svelte components for markdown requests.
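The shape of that flow can be sketched like this; the route registry and the `markdownFor` lookup are my own placeholders, not the post's actual code:

```typescript
// Hypothetical registry: regex patterns paired with markdown handlers.
type RouteHandler = (pathname: string) => Promise<string>;

const markdownRoutes: Array<[RegExp, RouteHandler]> = [
  [/^\/now$/, async () => "# Now\n\n(fetched from the API)"],
];

// Returns null when no route matches, so the caller falls through to SSR.
async function markdownFor(pathname: string): Promise<string | null> {
  for (const [pattern, handler] of markdownRoutes) {
    if (pattern.test(pathname)) return handler(pathname);
  }
  return null;
}

// Stand-in for the body of SvelteKit's handle hook:
// export const handle = async ({ event, resolve }) => { ... }
async function handleSketch(
  request: Request,
  resolve: () => Promise<Response>
): Promise<Response> {
  const url = new URL(request.url);
  const accept = request.headers.get("accept") ?? "";
  if (accept.includes("text/markdown") || url.searchParams.get("format") === "md") {
    const md = await markdownFor(url.pathname);
    if (md !== null) {
      return new Response(md, {
        headers: { "content-type": "text/markdown; charset=utf-8", vary: "accept" },
      });
    }
    // No markdown handler for this route: fall through to normal SSR.
  }
  return resolve();
}
```

The important property is that the `null` branch is silent — unmatched routes never see an error, they just render as usual.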
Routes without a markdown handler (like `/security` or `/tweet-archive`) fall through to normal SSR. No 406 errors, no broken pages. Each route is a regex pattern matched against the pathname, paired with a handler that fetches from the API and formats the response:
| Route | Data Source |
| --- | --- |
| `/` | Recent posts + presence status |
| `/now` | Full /now page data |
| `/posts` | Published micro posts (last 50) |
| `/posts/:slug` | Single post — raw `content` field |
| `/articles` | Published article list |
| `/articles/:slug` | Single article — raw `content` field |
| `/pages/:slug` | Single page — raw `content` field |
Individual content pages return the markdown as-is with minimal front matter:
```markdown
Published: 2026-02-18 · Stream: tech
URL: https://cogley.jp/articles/some-slug

[raw markdown content from API]
```
List pages provide a structured index with truncated previews and URLs.
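The two formatters might look something like this — the `Article` field names are my guesses based on the description above, not the site's actual API schema:

```typescript
// Hypothetical article shape; field names are assumptions.
interface Article {
  slug: string;
  content: string;      // raw markdown from the API, served as-is
  stream: string;
  published_at: string; // e.g. "2026-02-18"
}

// Single-item pages: minimal front matter, then the raw markdown.
function formatArticle(a: Article): string {
  return [
    `Published: ${a.published_at} · Stream: ${a.stream}`,
    `URL: https://cogley.jp/articles/${a.slug}`,
    "",
    a.content,
  ].join("\n");
}

// List pages: a structured index with truncated previews and URLs.
function formatArticleList(articles: Article[]): string {
  return articles
    .map((a) => `- https://cogley.jp/articles/${a.slug}: ${a.content.slice(0, 120)}`)
    .join("\n");
}
```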
For agents that want everything at once, /llms-full.txt returns all articles, all pages, the now page, and the 50 most recent posts stitched into a single markdown document. This endpoint always returns markdown regardless of the Accept header — it's explicitly for machine consumption.
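A sketch of how such an endpoint could stitch the sections together, in SvelteKit's `+server.ts` shape — the fetch helpers are placeholders standing in for the real API calls:

```typescript
// Placeholder fetchers; the real versions would call the site's API.
type Doc = { title: string; content: string };
async function fetchNow(): Promise<Doc> {
  return { title: "Now", content: "(placeholder markdown body)" };
}
async function fetchArticles(): Promise<Doc[]> { return []; }
async function fetchPages(): Promise<Doc[]> { return []; }
async function fetchRecentPosts(limit: number): Promise<Doc[]> { return []; }

// GET /llms-full.txt: always markdown, regardless of the Accept header.
export async function GET(): Promise<Response> {
  const [now, articles, pages, posts] = await Promise.all([
    fetchNow(), fetchArticles(), fetchPages(), fetchRecentPosts(50),
  ]);
  const body = [now, ...articles, ...pages, ...posts]
    .map((d) => `# ${d.title}\n\n${d.content}`)
    .join("\n\n---\n\n");
  return new Response(body, {
    headers: { "content-type": "text/markdown; charset=utf-8" },
  });
}
```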
Combined with the existing `/llms.txt` (which describes the site structure and available sections), agents have a complete discovery path.
Token Estimation
Every markdown response includes an `x-markdown-tokens` header with a rough token count estimate (`content.length / 4`). It's not precise — real tokenization varies by model — but it gives agents a quick way to gauge response size before processing.

The Cloudflare blog and Wolson's approach both solve a caching problem that Workers don't have:
| Concern | Static/CDN Sites | Workers/Pages Functions |
| --- | --- | --- |
| CDN cache collision | Real problem — same URL, different Accept | Non-issue — Worker always executes |
| Transform Rules needed | Yes, to differentiate cache keys | No |
| `Vary: Accept` | Insufficient alone (CDN ignores it) | Works correctly (no CDN layer) |
| Implementation | Edge rules + origin logic | Just origin logic |
Setting `Vary: Accept` is still good practice for HTTP correctness, but it's not load-bearing the way it would be on a CDN-cached static site.
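Putting the response-header details together: the `/4` estimate and the `x-markdown-tokens` header name come from the post, but this builder function is my own sketch:

```typescript
// Build a markdown Response carrying the token estimate and Vary header.
function markdownResponse(markdown: string): Response {
  // Rough heuristic: ~4 characters per token; real tokenizers vary by model.
  const estimatedTokens = Math.ceil(markdown.length / 4);
  return new Response(markdown, {
    headers: {
      "content-type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(estimatedTokens),
      // Not load-bearing on Workers, but correct HTTP hygiene.
      vary: "accept",
    },
  });
}
```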
My profile site at rick.cogley.jp uses the same pattern. The profile sections only store `content_html` (not raw markdown), so the handler uses `stripHtml()` to extract clean text. Not as rich as raw markdown, but vastly better than HTML tags for an AI agent trying to understand who I am.
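The post doesn't show the real `stripHtml()` implementation; a naive regex-based version like the one below is a common sketch for trusted CMS output, though it's not a substitute for a real HTML parser:

```typescript
// Naive HTML-to-text: drop tags, decode a few common entities,
// collapse whitespace. Good enough for trusted CMS output only.
function stripHtml(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ")
    .replace(/&amp;/g, "&")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&nbsp;/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}
```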
```bash
# Request markdown via the Accept header
curl -s -H "Accept: text/markdown" https://cogley.jp/now
# Or via the query parameter
curl -s "https://cogley.jp/now?format=md"
# Headers only — note x-markdown-tokens
curl -sI -H "Accept: text/markdown" https://cogley.jp/posts
# Everything at once
curl -s https://cogley.jp/llms-full.txt
# A plain request still gets HTML
curl -s https://cogley.jp/now
```
If I were building this from scratch, I'd store all content as markdown and render HTML on demand — which is essentially what this site already does. The /api already has markdown fields everywhere because that's what the editor produces. The HTML rendering happens in SvelteKit route handlers using marked.
For sites where content is HTML-first (CMS output, rich text editors), you'd need an HTML-to-markdown conversion step. Libraries like turndown handle this, but the output won't be as clean as source markdown. If you're designing a new system, store the markdown.
- Cloudflare: Markdown for bots
- mwolson.org: Markdown for agents on Cloudflare free plan
- llms.txt specification
Originally published at cogley.jp. Rick Cogley is CEO of eSolia Inc., providing bilingual IT outsourcing and infrastructure services in Tokyo, Japan.