Introducing the AI Discovery Protocol: Making Websites Discoverable to AI Systems


TL;DR

We're open-sourcing the AI Discovery Protocol (ADP) — a new standard that makes websites discoverable to AI systems like ChatGPT, Claude, Perplexity, and Gemini. Unlike traditional SEO (built for keyword-based search), ADP provides structured, machine-readable metadata specifically designed for AI reasoning engines.

The core innovation: A single entry point (/ai-discovery.json) that maps your entire AI-optimized content ecosystem, combined with versioned entity catalogs and incremental update support.

Released under MIT License — free for anyone to use, implement, and build upon.


The Problem: Traditional SEO Doesn't Work for AI

For 25 years, SEO has been optimized for Google's web crawlers. AI systems consume the web differently:

  • Keyword matching → AI systems query structured entity catalogs
  • HTML pages → AI needs JSON-LD entities
  • Backlink analysis → AI reasons over entity relationships
  • PageRank heuristics → AI requires semantic context

The fundamental mismatch: Traditional SEO assumes keyword-based indexing. AI systems work with entity graphs and structured knowledge.

Real-World Example: Press Release Distribution

At Pressonify.ai, we publish professional press releases in 60 seconds using Claude Sonnet 4.5. But we noticed something: our customers' press releases were invisible to AI search engines.

Why? Because AI systems don't know:
- Where to find structured entity data
- How to detect content updates
- Which entities exist on the site
- How entities relate to each other

robots.txt tells crawlers what they may and may not fetch. sitemap.xml lists URLs. But neither tells AI systems where to find machine-readable entity catalogs.


The Solution: AI Discovery Protocol

Core Architecture

website.com/
├── ai-discovery.json          # Meta-index (ENTRY POINT)
├── knowledge-graph.json       # Entity catalog
├── llms.txt                   # AI-readable context
└── robots.txt                 # Crawler directives

The Discovery Flow

AI System → GET /ai-discovery.json
          ↓
       Parse meta-index
          ↓
       ┌────────────┬──────────────┬─────────────┐
       ↓            ↓              ↓             ↓
  knowledge-   llms.txt      robots.txt    Other
  graph.json                               endpoints

Key insight: AI systems hit one canonical file first, then discover all other resources from the meta-index.
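
To make the flow concrete, here is a minimal sketch of a consumer walking the meta-index, written in Python with only the standard library. The endpoint names (knowledgeGraph, contextDocument) follow the example ai-discovery.json shown later in this post; treat it as an illustration, not a reference client.

import json
from urllib.request import urlopen

def discover(site: str) -> dict:
    """Fetch the ADP meta-index, then resolve every resource it advertises."""
    # Step 1: a single canonical entry point
    with urlopen(f"{site}/ai-discovery.json") as resp:
        meta = json.load(resp)

    resources = {"meta": meta}
    # Step 2: follow each advertised endpoint (e.g. knowledgeGraph, contextDocument)
    for name, endpoint in meta.get("endpoints", {}).items():
        with urlopen(endpoint["url"]) as resp:
            body = resp.read().decode("utf-8")
        # JSON-LD endpoints are parsed; llms.txt stays as plain Markdown text
        resources[name] = json.loads(body) if endpoint.get("format", "").endswith("json") else body
    return resources

# Example: catalog = discover("https://example.com")
# catalog["knowledgeGraph"]["@graph"] -> list of entities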


Why This Works: Design Principles

1. Single Entry Point

Unlike scattered Schema.org markup across 100 HTML pages, ADP provides one canonical file at /ai-discovery.json.

Before (Traditional SEO):

AI System → Crawls 100 HTML pages
          → Parses embedded JSON-LD
          → Reconstructs entity graph
          → Misses 40% of entities

After (ADP):

AI System → GET /ai-discovery.json
          → Discovers knowledge-graph.json
          → Loads complete entity catalog
          → 100% entity coverage

2. Versioning & Change Detection

Traditional web crawlers re-index entire sites daily. Wasteful for large sites with infrequent updates.

ADP adds semantic versioning to knowledge graphs:

{
  "@type": "KnowledgeGraph",
  "version": "2.7.1",
  "generatedAt": "2025-11-02T11:30:00Z",
  "changeLog": {
    "changes": [
      {
        "timestamp": "2025-11-02T11:30:00Z",
        "entityType": "Product",
        "entityId": "https://site.com/products/widget-pro",
        "changeType": "updated",
        "modifiedFields": ["price", "availability"]
      }
    ]
  }
}

Result: AI systems can do incremental updates instead of full re-crawls.

3. Progressive Enhancement

Sites can implement ADP at three levels:

  • Level 1 (Minimal): ai-discovery.json only. Implementation time: 15 minutes. Benefit: signals AI-friendly intent.
  • Level 2 (Standard): adds knowledge-graph.json + llms.txt. Implementation time: 2-4 hours. Benefit: full AI discoverability.
  • Level 3 (Advanced): adds versioning + change logs. Implementation time: 1-2 days. Benefit: incremental crawling support.

No all-or-nothing requirement. Start small, scale incrementally.


How We Built This at Pressonify

The Journey

  1. January 2025: Implemented static llms.txt (Jeremy Howard's proposal)
  2. June 2025: Added Schema.org JSON-LD to press release pages
  3. October 2025: Built dynamic /knowledge-graph.json endpoint
  4. November 2025: Synthesized learnings into unified protocol

Production Stats (Pressonify.ai)

  • Entity count: 132 entities (NewsArticles, Organization, Products, FAQs)
  • Update frequency: Daily (automated from database)
  • Knowledge graph size: ~45KB (gzipped: 12KB)
  • Implementation time: 6 hours (with FastAPI backend)

The Breakthrough Moment

We tested whether Claude (via web search) could discover our press releases. Results:

Before ADP:
- ❌ Claude found 2 out of 27 press releases
- ❌ Couldn't identify company relationships
- ❌ No awareness of recent updates

After ADP:
- ✅ Claude found 27 out of 27 press releases
- ✅ Correctly identified entity relationships
- ✅ Surfaced most recent press releases first

The difference? A structured entity catalog instead of scattered HTML pages.


Technical Deep Dive

1. ai-discovery.json (Meta-Index)

Purpose: Single source of truth for all AI discovery resources.

Example:

{
  "$schema": "https://pressonify.ai/schemas/ai-discovery/v1.0.json",
  "version": "1.0.0",
  "generatedAt": "2025-11-02T12:00:00Z",
  "website": {
    "url": "https://example.com",
    "name": "Example Corporation",
    "description": "Leading provider of example products"
  },
  "endpoints": {
    "knowledgeGraph": {
      "url": "https://example.com/knowledge-graph.json",
      "format": "application/ld+json",
      "lastModified": "2025-11-02T11:30:00Z",
      "entityCount": 132,
      "version": "2.7.1"
    },
    "contextDocument": {
      "url": "https://example.com/llms.txt",
      "format": "text/markdown",
      "sections": ["Overview", "Products", "Press Releases"]
    }
  },
  "capabilities": {
    "supportsVersioning": true,
    "supportsIncrementalUpdates": true,
    "updateFrequency": "daily"
  }
}

Why this matters:
- AI systems request one file to discover everything
- Metadata shows freshness without parsing full knowledge graph
- Extensible: Add new endpoints as they're created
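
Because each endpoint entry carries version, lastModified, and entityCount, a consumer can tell whether anything changed before downloading the full knowledge graph. A hedged sketch, assuming the field names from the example above:

import json
from urllib.request import urlopen

def graph_is_stale(site: str, cached_version: str | None) -> bool:
    """Return True when the advertised knowledge graph is newer than our cached copy."""
    with urlopen(f"{site}/ai-discovery.json") as resp:
        meta = json.load(resp)
    advertised = meta["endpoints"]["knowledgeGraph"]["version"]
    # Only download the full catalog when the advertised version has moved on
    return advertised != cached_version

# Example: graph_is_stale("https://example.com", "2.7.0") -> True (the graph is now at 2.7.1)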


2. knowledge-graph.json (Entity Catalog)

Purpose: Complete catalog of all entities on the site using Schema.org vocabularies.

Example (truncated):

{
  "@context": "https://schema.org",
  "@type": "KnowledgeGraph",
  "version": "2.7.1",
  "generatedAt": "2025-11-02T11:30:00Z",
  "statistics": {
    "totalEntities": 132,
    "byType": {
      "Organization": 1,
      "Product": 45,
      "NewsArticle": 78,
      "Person": 5
    }
  },
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Corporation",
      "url": "https://example.com",
      "sameAs": [
        "https://twitter.com/example",
        "https://linkedin.com/company/example"
      ]
    },
    {
      "@type": "NewsArticle",
      "@id": "https://example.com/news/product-launch",
      "headline": "Example Corp Launches Widget Pro",
      "datePublished": "2025-11-01T09:00:00Z",
      "author": {
        "@id": "https://example.com/#organization"
      },
      "url": "https://example.com/news/product-launch"
    }
  ]
}

Innovation: Change Detection

"changeLog": {
  "lastModified": "2025-11-02T11:30:00Z",
  "changes": [
    {
      "timestamp": "2025-11-02T11:30:00Z",
      "entityType": "Product",
      "entityId": "https://example.com/products/widget-pro",
      "changeType": "updated",
      "modifiedFields": ["price", "availability"]
    }
  ]
}

AI systems can check changeLog and only re-crawl updated entities.
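
How a crawler uses this is up to the implementer; as one hedged sketch, the changeLog can be turned into a re-crawl worklist. Field names follow the example above, and last_crawl is assumed to be the timestamp saved from the previous visit:

from datetime import datetime, timezone

def entities_to_refresh(knowledge_graph: dict, last_crawl: datetime) -> set[str]:
    """Collect entity IDs whose changeLog entries are newer than the previous crawl."""
    stale = set()
    for change in knowledge_graph.get("changeLog", {}).get("changes", []):
        changed_at = datetime.fromisoformat(change["timestamp"].replace("Z", "+00:00"))
        if changed_at > last_crawl:
            stale.add(change["entityId"])
    return stale

# Example:
#   entities_to_refresh(graph, datetime(2025, 11, 1, tzinfo=timezone.utc))
#   -> {"https://example.com/products/widget-pro"}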


3. llms.txt (Context Document)

Purpose: Human-readable Markdown providing context for AI systems.

Example:

---
version: 1.0.0
lastModified: 2025-11-01T10:00:00Z
---

# Example Corporation

> AI-Optimized Content for Large Language Models

## Overview

Example Corporation is a leading provider of professional widgets.
Founded in 2020, we serve over 10,000 customers worldwide.

## Products

### Widget Pro
- **Price:** $299 USD
- **Features:** Advanced automation, real-time analytics
- **Use Cases:** Enterprise widget management

## Recent Press Releases

### Product Launch: Widget Pro v2.0 (November 1, 2025)
Example Corp today announced Widget Pro v2.0 with AI-powered automation...

Read more: https://example.com/news/product-launch

## Contact

- **Website:** https://example.com
- **Email:** [email protected]

Why Markdown?
- AI systems process Markdown natively
- Human-readable (developers can edit without tools)
- Supports rich formatting (headings, lists, links)
- No HTML/CSS overhead
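
Because the format is just front matter plus Markdown, llms.txt can be generated from the same data that feeds the knowledge graph. A minimal sketch; the section names and helper below are illustrative, not part of the spec:

from datetime import datetime, timezone

def render_llms_txt(name: str, overview: str, sections: dict[str, str]) -> str:
    """Render llms.txt: a small front-matter block followed by Markdown sections."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        "---", "version: 1.0.0", f"lastModified: {now}", "---", "",
        f"# {name}", "",
        "> AI-Optimized Content for Large Language Models", "",
        "## Overview", "", overview, "",
    ]
    for heading, body in sections.items():
        lines += [f"## {heading}", "", body, ""]
    return "\n".join(lines)

# Example:
#   text = render_llms_txt("Example Corporation",
#                          "Leading provider of professional widgets.",
#                          {"Products": "### Widget Pro\n- Price: $299 USD"})
#   open("llms.txt", "w", encoding="utf-8").write(text)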


Real-World Use Cases

1. E-commerce (Shopify Stores)

Problem: Product catalogs are invisible to AI shopping assistants.

Solution: Generate knowledge-graph.json from Shopify GraphQL API:

{
  "@graph": [
    {
      "@type": "Product",
      "@id": "https://mystore.com/products/handmade-mug",
      "name": "Handmade Ceramic Mug",
      "description": "Artisan-crafted ceramic mug",
      "offers": {
        "@type": "Offer",
        "price": "24.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock"
      }
    }
  ]
}

Result: When users ask ChatGPT "find handmade ceramic mugs," your store appears in results.
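
As an illustration of that mapping, here is a hedged sketch that converts already-fetched product records into Schema.org Product entities. The input dicts are placeholders, not the actual Shopify GraphQL response shape:

def products_to_graph(store_url: str, products: list[dict]) -> dict:
    """Map already-fetched product records to Schema.org Product entities."""
    graph = []
    for p in products:
        graph.append({
            "@type": "Product",
            "@id": f"{store_url}/products/{p['handle']}",
            "name": p["title"],
            "description": p.get("description", ""),
            "offers": {
                "@type": "Offer",
                "price": str(p["price"]),
                "priceCurrency": p.get("currency", "USD"),
                "availability": "https://schema.org/InStock"
                                if p.get("in_stock", True)
                                else "https://schema.org/OutOfStock",
            },
        })
    return {"@context": "https://schema.org", "@type": "KnowledgeGraph", "@graph": graph}

# Example:
#   products_to_graph("https://mystore.com",
#                     [{"handle": "handmade-mug", "title": "Handmade Ceramic Mug", "price": 24.99}])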


2. SaaS Companies

Problem: AI assistants can't recommend your product because they don't know it exists.

Solution: Structured entity catalog with product features:

{
  "@type": "SoftwareApplication",
  "@id": "https://myapp.com/#software",
  "name": "My SaaS App",
  "applicationCategory": "BusinessApplication",
  "offers": {
    "@type": "Offer",
    "price": "49",
    "priceCurrency": "USD"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "1250"
  }
}

Result: AI assistants can recommend your product when users search for solutions.


3. Publishers & Media Companies

Problem: 500 blog posts buried across multiple pages.

Solution: Single knowledge graph with all NewsArticle entities:

{
  "statistics": {
    "totalEntities": 500,
    "byType": {
      "NewsArticle": 450,
      "Person": 30,
      "Organization": 20
    }
  },
  "@graph": [
    {
      "@type": "NewsArticle",
      "@id": "https://techblog.com/ai-trends-2025",
      "headline": "Top AI Trends for 2025",
      "datePublished": "2025-11-01T09:00:00Z"
    }
  ]
}

Result: AI systems index entire content catalog from one file.
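
The statistics block shown above does not have to be maintained by hand; it can be derived from the @graph array itself. A small sketch:

from collections import Counter

def graph_statistics(graph: list[dict]) -> dict:
    """Build the statistics block (total + per-type counts) from the @graph array."""
    by_type = Counter(entity["@type"] for entity in graph)
    return {"totalEntities": len(graph), "byType": dict(by_type)}

# Example:
#   graph_statistics([{"@type": "NewsArticle"}, {"@type": "NewsArticle"}, {"@type": "Person"}])
#   -> {"totalEntities": 3, "byType": {"NewsArticle": 2, "Person": 1}}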


Implementation Guide

Minimal Implementation (15 Minutes)

Step 1: Create ai-discovery.json in your site's root directory:

{
  "version": "1.0.0",
  "generatedAt": "2025-11-02T12:00:00Z",
  "website": {
    "url": "https://yoursite.com",
    "name": "Your Company Name",
    "description": "Brief description of your company"
  }
}

Step 2: Upload to your web server:

# Static hosting
cp ai-discovery.json public/

# Or create as a dynamic endpoint (FastAPI example)
from datetime import datetime, timezone

from fastapi import FastAPI

app = FastAPI()

@app.get("/ai-discovery.json")
async def ai_discovery():
    return {
        "version": "1.0.0",
        "generatedAt": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "website": {
            "url": "https://yoursite.com",
            "name": "Your Company",
            "description": "Description here"
        }
    }

Step 3: Verify:

curl https://yoursite.com/ai-discovery.json | jq

Done! You've implemented Level 1 (Minimal) ADP compliance.


Standard Implementation (2-4 Hours)

Step 1: Create knowledge-graph.json with your primary entities:

{
  "@context": "https://schema.org",
  "@type": "KnowledgeGraph",
  "version": "1.0.0",
  "generatedAt": "2025-11-02T12:00:00Z",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://yoursite.com/#organization",
      "name": "Your Company",
      "url": "https://yoursite.com",
      "logo": "https://yoursite.com/logo.png"
    }
  ]
}

Step 2: Create llms.txt:

---
version: 1.0.0
lastModified: 2025-11-02T12:00:00Z
---

# Your Company Name

> AI-Optimized Content for Large Language Models

## Overview

[Brief company description]

## Products/Services

[List your main offerings]

## Recent Updates

[Latest news, product launches, etc.]

## Contact

- Website: https://yoursite.com
- Email: [email protected]

Step 3: Update ai-discovery.json to reference new files:

{
  "version": "1.0.0",
  "generatedAt": "2025-11-02T12:00:00Z",
  "website": {
    "url": "https://yoursite.com",
    "name": "Your Company"
  },
  "endpoints": {
    "knowledgeGraph": {
      "url": "https://yoursite.com/knowledge-graph.json",
      "format": "application/ld+json",
      "entityCount": 5,
      "version": "1.0.0"
    },
    "contextDocument": {
      "url": "https://yoursite.com/llms.txt",
      "format": "text/markdown"
    }
  }
}

Done! You've implemented Level 2 (Standard) ADP compliance.


WordPress Plugin (Coming Soon)

We're building a WordPress plugin to automate ADP implementation:

Features:
✅ Auto-generates ai-discovery.json
✅ Creates knowledge-graph.json from posts/pages
✅ Builds llms.txt from site content
✅ Automatic versioning and updates
✅ WooCommerce product integration

Launch date: December 2025
Price: Free (MIT License)


Shopify App Integration

Our PresSEO Shopify app now includes ADP support:

Features:
✅ Auto-generates knowledge graphs from products
✅ Syncs with Shopify GraphQL API
✅ Updates automatically on product changes
✅ Includes all collections, blogs, and pages

Available now: Shopify App Store


MCP Server (Model Context Protocol)

We're building an MCP server to provide ADP as a service:

npm install @pressonify/adp-mcp

Features:
- Host ADP files for any domain
- Real-time knowledge graph updates
- API for programmatic access
- CDN-backed for fast global delivery

Pricing:
- Free: Self-hosted open-source version
- Pro ($29/mo): Hosted MCP server + API
- Enterprise ($299/mo): Custom integrations + SLA

Launch: January 2026


Why We're Open-Sourcing This

The Network Effects Argument

Standards succeed when they're widely adopted, not when they're proprietary.

Historical precedents:
- HTTP (open) → Universal adoption
- RSS (open) → Billions of feeds
- robots.txt (open) → 25-year success
- Proprietary protocols → Dead

Our strategy:
1. Open-source the standard (MIT License)
2. Build commercial tooling (MCP server, plugins, APIs)
3. Become thought leaders in AI discovery
4. Profit from implementation services, not licensing

The Marketing Play

Releasing ADP as an open standard positions Pressonify as:
- Technical innovators in AI infrastructure
- Thought leaders in AI discovery architecture
- First movers in Answer Engine Optimization (AEO)

Expected outcomes:
- HackerNews front page (traffic spike)
- Conference speaking opportunities
- Partnership discussions with AI platforms
- Developer community contributions


Competitive Landscape

How ADP Compares

Feature              ADP    llms.txt    Schema.org Only    Google KG
Single entry point   Yes    Yes         No                 N/A
Entity catalog       Yes    No          Partial            Proprietary
Versioning           Yes    No          No                 No
Change detection     Yes    No          No                 No
Open standard        Yes    Yes         Yes                No
AI-optimized         Yes    Yes         No                 No

Why Not Just Use llms.txt?

llms.txt limitations:
- No structured entity catalog
- No versioning or change detection
- Human-readable only (not machine-queryable)
- Major AI platforms don't support it yet

ADP is complementary: We include llms.txt as the context document layer.


Roadmap

Phase 1: Open Source Release (November 2025) ✅

  • [x] Specification document
  • [x] JSON Schema for validation
  • [x] GitHub repository
  • [x] Initial blog post

Phase 2: Tooling & Validation (December 2025)

  • [ ] WordPress plugin (free, MIT)
  • [ ] Online validation tool
  • [ ] JSON Schema validator
  • [ ] Documentation site

Phase 3: MCP Server (January 2026)

  • [ ] @pressonify/adp-mcp package
  • [ ] Hosted MCP service
  • [ ] API for programmatic access
  • [ ] CDN integration

Phase 4: Ecosystem Growth (Q1 2026)

  • [ ] Shopify app marketplace
  • [ ] Drupal/Joomla plugins
  • [ ] Static site generators (11ty, Hugo, Jekyll)
  • [ ] Framework integrations (Next.js, Nuxt, SvelteKit)

Phase 5: Standards Body Submission (Q2 2026)

  • [ ] W3C Community Group proposal
  • [ ] IETF RFC draft (similar to robots.txt RFC 9309)
  • [ ] Schema.org vocabulary extension proposal

Getting Involved

For Developers

Implement ADP on your site:
1. Read the specification
2. Create ai-discovery.json
3. Submit your site to our directory

Contribute to the standard:
1. GitHub repository
2. Discussion forum
3. Issue tracker

For AI Platform Teams

We're actively seeking partnerships with:
- OpenAI (ChatGPT, GPT-4)
- Anthropic (Claude)
- Google (Gemini, Search)
- Perplexity AI

Contact: [email protected]


Technical Resources

Documentation

  • Specification: GitHub
  • JSON Schema: https://pressonify.ai/schemas/ai-discovery/v1.0.json
  • Validator: https://pressonify.ai/tools/adp-validator (coming soon)

Example Implementations

  • Pressonify.ai: https://pressonify.ai/ai-discovery.json
  • E-commerce Demo: https://demo-store.pressonify.ai/ai-discovery.json
  • Blog Demo: https://demo-blog.pressonify.ai/ai-discovery.json


Frequently Asked Questions

Q: Is this different from Schema.org?

A: ADP uses Schema.org vocabularies but adds:
- Single entry point (ai-discovery.json)
- Versioning and change detection
- Coordinated discovery across multiple files
- AI-specific optimizations

Think of it as "Schema.org + discovery protocol."


Q: Will AI platforms actually use this?

A: We're in discussions with multiple AI platform teams. The value proposition is clear:
- Faster crawling: Single entry point vs 100 HTML pages
- Incremental updates: Change logs vs full re-crawls
- Better results: Structured entities vs keyword matching

Early feedback has been positive.


Q: Why not just improve llms.txt?

A: llms.txt is excellent for context documents. But:
- No structured entity catalog
- No versioning
- Human-readable only

ADP includes llms.txt as the context layer, then adds structured entity catalogs.


Q: How is this different from Google's Knowledge Graph?

A: Google's Knowledge Graph is proprietary and extraction-based (they build it from your site).

ADP is declarative — you tell AI systems exactly what entities exist and how they relate.


Q: What about privacy/security?

A: ADP files are public by design (like robots.txt). Best practices:
- Don't include sensitive data
- Don't expose internal URLs
- Implement rate limiting
- Use HTTPS

See Security Considerations in the spec.


Q: How do I validate my implementation?

A: We're building an online validator (December 2025). For now:

# Validate against JSON Schema
curl https://yoursite.com/ai-discovery.json | \
  jsonschema -i /dev/stdin \
  https://pressonify.ai/schemas/ai-discovery/v1.0.json
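
If you prefer to validate in Python, the jsonschema library does the same check; a hedged sketch, assuming the schema URL above resolves to a standard JSON Schema document:

import json
from urllib.request import urlopen

from jsonschema import ValidationError, validate  # pip install jsonschema

SCHEMA_URL = "https://pressonify.ai/schemas/ai-discovery/v1.0.json"

def validate_site(site: str) -> bool:
    """Validate a site's ai-discovery.json against the published JSON Schema."""
    with urlopen(f"{site}/ai-discovery.json") as resp:
        instance = json.load(resp)
    with urlopen(SCHEMA_URL) as resp:
        schema = json.load(resp)
    try:
        validate(instance=instance, schema=schema)
        return True
    except ValidationError as err:
        print(f"Invalid ai-discovery.json: {err.message}")
        return False

# Example: validate_site("https://yoursite.com")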

Conclusion

The AI Discovery Protocol represents a fundamental shift in how websites communicate with AI systems. By providing:

  1. Single entry point (ai-discovery.json)
  2. Structured entity catalogs (knowledge-graph.json)
  3. Human-readable context (llms.txt)
  4. Versioning and change detection

...we're building the infrastructure for Answer Engine Optimization (AEO) — the next evolution beyond traditional SEO.

This is an open standard (MIT License). We're not trying to own it; we're trying to start a movement.

If you believe websites should be discoverable to AI systems, join us.


Resources

  • Specification: https://github.com/BuddySpuds/AI-Discovery-Protocol
  • JSON Schema: https://pressonify.ai/schemas/ai-discovery/v1.0.json
  • Discussion Forum: https://github.com/BuddySpuds/AI-Discovery-Protocol/discussions
  • Contact: [email protected]
  • Twitter/X: @pressonify

Let's make the web discoverable to AI systems. Together.

— The Pressonify Team


Released under MIT License | November 2, 2025 | Version 1.0.0


About Pressonify Team

The Pressonify Team builds AI-first press release infrastructure for the era of AI search.