Beyond Static: How We Built Dynamic llms.txt for Real-Time News
TL;DR
The llms.txt standard was designed for static documentation sites. We needed it to work for a news feed publishing 10+ press releases daily. So we built a two-tier architecture with scope-specific variants, real-time YAML metadata, and database-driven generation. This is the story of how we extended the spec for dynamic content.
The Problem: Static Files for Dynamic Content
If you've worked with llms.txt, you know the standard pattern: create a static file at /llms.txt, describe your site's structure, and forget about it. This works beautifully for documentation sites, API references, and other relatively stable content.
But what about news sites?
Pressonify.ai publishes press releases in real-time. A company announces a product launch at 9am, and by 9:02am, that press release is live, indexed, and ready for AI systems to discover. Static files can't keep pace with this reality.
"The llms.txt spec was designed for docs sites, not newsrooms. We needed to rethink the implementation from the ground up."
The State of llms.txt Adoption
As of early 2026, roughly 600 sites have implemented llms.txt. The vast majority are static implementations—"set and forget" files that describe unchanging content. This works well for:
- API documentation
- Product manuals
- Corporate websites
- Educational resources
But it breaks down for:
- News publications
- Press release platforms
- E-commerce with dynamic inventory
- Any site with real-time content updates
Our Solution: Two-Tier Architecture
After experimenting with several approaches, we settled on a two-tier architecture that separates platform documentation from dynamic news content:
Tier 1: /llms.txt (Hybrid)
The root llms.txt file contains:
- Platform overview and capabilities
- Static documentation links
- Feature descriptions
- Plus: A dynamically-generated section with recent press releases
This hybrid approach gives AI systems both stable context about what Pressonify is and fresh content about what's being published.
Tier 2: /news/llms.txt (Pure Dynamic)
This is where the innovation happens. The /news/llms.txt endpoint is 100% database-driven:
- Generated at request time
- Contains only news content (no platform docs)
- Scoped specifically for AI systems crawling news
- Real-time metadata computed per request
This scope-specific variant is something we haven't seen implemented elsewhere in the llms.txt ecosystem.
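To make the request-time flow concrete, here is a minimal sketch of the rendering step, assuming a Python stack like the FastAPI/Supabase one this post describes. Only the pure rendering logic is shown; in production this would sit behind a route handler and `fetch` the rows from the database. All function names and fields here are illustrative, not Pressonify's actual schema.

```python
# Sketch: generate the /news/llms.txt body at request time from
# database rows. Names and fields are illustrative assumptions.
from datetime import datetime, timezone

def render_news_llms_txt(releases: list[dict]) -> str:
    """Build the llms.txt body from the current database rows."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        "---",
        f"lastModified: {now}",             # computed at request time
        f"totalArticles: {len(releases)}",  # live database count
        "scope: news-content-only",
        "updateFrequency: realtime",
        "---",
        "",
        "# Pressonify News",
        "",
    ]
    for r in releases:
        lines.append(f"- [{r['title']}]({r['url']}): {r['published_at']}")
    return "\n".join(lines)

# Example usage with one stand-in row:
releases = [
    {"title": "Acme Launches Widget 2.0",
     "url": "https://example.com/news/acme-widget-2",
     "published_at": "2026-01-03T09:02:00Z"},
]
body = render_news_llms_txt(releases)
```

Because the body is rebuilt on every call, the frontmatter counts and timestamps can never drift from the database state.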
Real-Time YAML Metadata
One of our key innovations is extending the YAML frontmatter to include computed fields:
```yaml
---
version: 2.9.5
lastModified: 2026-01-03T22:11:58Z  # ← Computed at request time
totalArticles: 247                  # ← Database count
scope: news-content-only            # ← Semantic scoping
updateFrequency: realtime           # ← Crawler hint
---
```
Why This Matters
Traditional llms.txt files have static metadata. The lastModified field (if present at all) is typically the file's modification date on disk. For dynamic content, this is meaningless.
Our computed metadata tells AI systems:
- Exactly when the content was generated (not when a file was last edited)
- How much content exists (for pagination and crawl planning)
- What scope this document covers (news only vs. full site)
- How often to return (realtime, hourly, daily)
This creates a self-describing file that AI systems can use to make intelligent caching and crawling decisions.
Full Stack Integration
Dynamic llms.txt isn't just about generating content at request time. It's about integrating the endpoint into a complete AI Discovery Protocol (ADP) infrastructure.
ADP Headers
Every response from our llms.txt endpoints includes:
| Header | Purpose | Example |
|---|---|---|
| `ETag` | Cache validation | `W/"a3f8b2c1d4e5"` |
| `Content-Digest` | Integrity verification | `sha-256=:base64hash:` |
| `X-Update-Frequency` | Crawl scheduling hint | `realtime` |
| `Cache-Control` | Browser/CDN caching | `public, max-age=300` |
| `Access-Control-Allow-Origin` | CORS for AI tools | `*` |
These headers enable AI systems to:
- Skip re-fetching unchanged content (ETag)
- Verify content integrity (Content-Digest)
- Schedule optimal crawl intervals (X-Update-Frequency)
- Access content from browser-based AI tools (CORS)
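As a sketch of how these headers can be derived once the body is rendered: the weak ETag and the `Content-Digest` both come from a SHA-256 hash of the response body, with `Content-Digest` using the `sha-256=:<base64>:` syntax from RFC 9530. The helper name and the short-hex ETag scheme are our assumptions, not Pressonify's actual code.

```python
# Sketch: compute ADP-style response headers from the rendered body.
import base64
import hashlib

def adp_headers(body: str, update_frequency: str = "realtime") -> dict[str, str]:
    digest = hashlib.sha256(body.encode("utf-8")).digest()
    b64 = base64.b64encode(digest).decode("ascii")
    return {
        "ETag": f'W/"{digest.hex()[:12]}"',       # weak validator from the hash
        "Content-Digest": f"sha-256=:{b64}:",     # RFC 9530 integrity field
        "X-Update-Frequency": update_frequency,   # crawl scheduling hint
        "Cache-Control": "public, max-age=300",
        "Access-Control-Allow-Origin": "*",       # CORS for browser-based AI tools
    }

headers = adp_headers("example llms.txt body")
```

Deriving both validators from the same hash keeps them consistent: whenever the body changes, the ETag and digest change together.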
Database Integration
The endpoint queries Supabase in real-time:
Request → FastAPI → Supabase Query → Format Content → Generate Headers → Response
No caching layer. No stale data. Every request returns the current state of the database.
The Innovation Spectrum
Let's be honest about where this sits on the innovation spectrum. This isn't a revolution—it's a thoughtful evolution of an emerging standard.
| Aspect | What Exists | What We Built | Innovation |
|---|---|---|---|
| Static llms.txt | ~600 sites | Baseline | — |
| Dynamic content appending | Rare (ReadMe does similar) | Hybrid approach | +1 |
| Scope-specific variants | Novel | `/news/llms.txt` | +2 |
| Real-time YAML metadata | Novel | Computed fields | +1.5 |
| ADP header integration | Very rare | Full implementation | +1 |
| Database-driven generation | Uncommon | 100% computed | +1 |
Overall: 7.5/10 — "Thoughtful evolution, not revolution"
We're not claiming to have invented something radically new. We've taken an emerging standard and adapted it for a use case the original spec didn't anticipate: real-time news content.
What This Means for AI Discovery
The practical benefits of dynamic llms.txt:
1. Fresh Context, Always
AI crawlers hitting /news/llms.txt always get the current state of our news feed. No stale data from a forgotten static file.
2. Scoped Context for Better Relevance
When ChatGPT or Perplexity wants to understand our news content specifically, they can hit /news/llms.txt instead of wading through platform documentation in the root file.
3. Intelligent Caching
The X-Update-Frequency: realtime header tells crawlers to check back frequently. Combined with ETag support, crawlers can efficiently poll without re-downloading unchanged content.
4. Foundation for Future Features
We're planning to extend this further:
- Category filtering: `/news/llms.txt?category=technology`
- Company-specific feeds: `/company/{name}/llms.txt`
- Time-windowed exports: `/news/llms.txt?since=2026-01-01`
The dynamic architecture makes these extensions straightforward.
The Strongest Claim We Can Make
"Pressonify is the first press release platform with a fully dynamic, database-driven llms.txt implementation featuring scope-specific variants and real-time metadata."
This claim is:
- True — Verifiable by checking competitors (PR Newswire, BusinessWire, PRWeb all return 404 for `/llms.txt`)
- Specific — Not vague marketing speak
- Differentiated — Competitors have static implementations or none at all
- Technical — Appeals to the audience that cares about this
Newsworthiness: Who Should Care?
| Audience | Relevant? | Best Angle |
|---|---|---|
| llms.txt community | Yes | "Extending the spec for dynamic content" |
| Technical SEO | Yes | "How to make llms.txt work for news sites" |
| AI/ML developers | Yes | "Building AI-native content infrastructure" |
| PR industry | Maybe | "First PR platform with dynamic AI discovery" |
| Mainstream tech press | Not alone | Only as part of larger "AI search" story |
| Hacker News | Yes | Technical implementation details |
What's Next
In Part 2 of this series, we dive into the technical implementation:
- Complete FastAPI code examples
- ADP header generation functions
- Database query optimization
- A proposed spec extension for scope-specific llms.txt files
Whether you're building something similar or just curious about the architecture, Part 2 has the code you need.
Read Part 2: Technical Implementation & Spec Proposal →
Resources
- llms.txt Specification — The original spec
- Pressonify llms.txt — Our hybrid implementation
- Pressonify News llms.txt — Our dynamic implementation
- AI Discovery Protocol v2.1 — Our broader ADP infrastructure
This is Part 1 of a 2-part series on Dynamic llms.txt. Part 2 covers technical implementation and our spec extension proposal.