How do I allow Perplexity to crawl my site?

Allow two user agents: PerplexityBot (builds the search index and respects robots.txt) and Perplexity-User (fetches pages when a user asks; Perplexity's own docs say it generally ignores robots.txt). Also allowlist Perplexity's published IP ranges in your WAF/CDN, since some providers block AI crawlers by default.

Does llms.txt or schema markup help on Perplexity?

There is no official Perplexity support for llms.txt or schema.org as ranking signals. Independent testing suggests structured data is read as ordinary page text. Publishing llms.txt is cheap but has no documented citation benefit on Perplexity as of mid-2026.

How big is Perplexity in 2026?

The last company-confirmed figure is 780 million queries in May 2025 (~30M/day, growing ~20% month-over-month). Statcounter puts Perplexity at 7.91% of AI-chatbot referrer share in June 2026, versus ChatGPT's 76.87%. Larger 2026 volume figures circulating online are unverified extrapolations.

Can publishers earn money from Perplexity citations?

Yes — two programs: the Publishers' Program (ad-revenue share when partner content is cited, since 2024) and Comet Plus (launched October 2025; 80% of subscription revenue shared with publishers from an initial $42.5M pool, paid across visits, citations and agent actions). Publishers apply directly to Perplexity.

How to Get Cited by Perplexity in 2026: Sources, Signals & Steps

TL;DR

Perplexity is the most "SEO-like" of the four big AI answer engines — and the most transparent about citations, numbering every source inline.

Fact	Value (2026)	Source class
Citations ranking in Google top 10	28.6% — highest of any AI engine (avg 12%)	Ahrefs, 3.1M queries
YouTube + Reddit + Wikipedia share of citations	>50% combined (32.4% / 16.6% / 8.2%)	Ahrefs, June 2026
Confirmed query volume	780M queries in May 2025 (~30M/day) — last official figure	CEO, via Search Engine Land
AI-chatbot referrer share	7.91% (vs ChatGPT 76.87%)	Statcounter, June 2026
Consumer pricing	Free / Pro $20/mo / Max $200/mo	Official

Verdict: if your content already performs in Google, Perplexity is the easiest AI engine to get cited on, because classic SEO carries over more here than anywhere else. The two extra jobs: make sure PerplexityBot isn't blocked at the CDN level, and build presence on the third-party platforms that dominate its citations. YouTube, Reddit and Wikipedia supply over half of them, and LinkedIn and G2 are over-represented on B2B queries.

How Perplexity Sources Answers

Perplexity runs its own retrieval index and shows numbered inline citations ([1][2][3]) linked to a source panel. Two official crawlers matter (per Perplexity's own crawler docs):

PerplexityBot — builds the search index. Respects robots.txt. Explicitly "does not crawl for AI model training." Published IP ranges are available for allowlisting.
Perplexity-User — fetches a page live when a user's question requires it. Perplexity's docs state it "generally ignores robots.txt" because it acts on behalf of a user.

Practical consequence: being in the index is governed by robots.txt and your WAF; being fetched at answer time mostly isn't. Note that Cloudflare has blocked AI crawlers by default since mid-2025 and has publicly accused Perplexity of stealth crawling (Perplexity disputes this). If your site sits behind Cloudflare or a similar provider, check that PerplexityBot isn't being silently blocked. That single misconfiguration keeps your pages out of Perplexity's index.
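The robots.txt half of that check can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. This is a minimal illustration, not an official Perplexity tool: the robots.txt content, the `example.com` URLs and the helper name `is_allowed` are all placeholders invented for the example. It tells you only what robots.txt permits; it says nothing about blocks applied at the CDN or WAF.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the text of your own file.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
for path in ("/", "/private/report.html"):
    url = "https://example.com" + path
    print(path, is_allowed(ROBOTS_TXT, "PerplexityBot", url))
```

To test a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file over HTTP before `can_fetch()` is called.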

What Actually Gets Cited (The Data)

The best independent dataset is Ahrefs' June 2026 study of 3.1 million US queries:

Classic SEO transfers. 28.6% of Perplexity's citations rank in Google's top 10 — the highest overlap of any AI engine tested (the cross-platform average is 12%). If you rank, you're already in the candidate pool.
Off-domain platforms take half the pie. YouTube (32.4%), Reddit (16.6%) and Wikipedia (8.2%) together exceed 50% of all citations. Your own domain competes for what's left.
B2B skews further to third-party surfaces. Peec AI's analysis (via Search Engine Land, March 2026) found Perplexity over-cites Reddit, LinkedIn and G2 on B2B queries.
The citation mix is stable. Semrush's 230k-prompt study found Perplexity's most-cited domains (Reddit, LinkedIn, NIH, Microsoft, Google) notably more consistent over time than ChatGPT's — gains here tend to stick.
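For a sense of scale, the arithmetic behind these shares can be worked through directly. The short Python sketch below uses only the percentages quoted above; the variable names are illustrative.

```python
# Citation shares quoted above from the Ahrefs study (percent of all citations).
shares = {"YouTube": 32.4, "Reddit": 16.6, "Wikipedia": 8.2}

# Combined share of the three platforms, and what is left for every other domain.
combined = round(sum(shares.values()), 1)
remaining = round(100 - combined, 1)

# Google top-10 overlap (28.6%) relative to the cross-platform average (12%).
overlap_ratio = round(28.6 / 12, 1)

print(f"Top three platforms: {combined}%")
print(f"Everything else, including your own domain: {remaining}%")
print(f"Google top-10 overlap vs. average: {overlap_ratio}x")
```

On these figures the three platforms account for 57.2% of citations, leaving 42.8% for all other sources, and Perplexity's overlap with Google's top 10 is roughly 2.4 times the cross-platform average.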

The Documented Playbook

Unblock the crawlers. Allow PerplexityBot in robots.txt and allowlist the published IPs in your CDN/WAF. This is the only lever Perplexity itself documents.
Keep doing real SEO. Rankings correlate with citations here more than on any other AI platform. Factual, well-structured pages that rank are the core asset.
Build the third-party layer. A YouTube presence, genuinely useful Reddit participation, an accurate Wikipedia/Wikidata footprint, and (for B2B) LinkedIn and G2 profiles cover the surfaces Perplexity cites most; YouTube, Reddit and Wikipedia alone supply >50% of citations.
Make claims extractable. The GEO research line (Aggarwal et al., KDD 2024) found that quotations, statistics and cited sources were the top-performing content tactics in lab benchmarks, with visibility gains of up to ~40%. Structure announcements as clear, quotable facts. Our Citability Checker scores exactly this.
Skip the folk remedies. No official support exists for llms.txt or schema.org as Perplexity ranking signals; testing suggests structured data is read as plain text. Don't buy "Perplexity optimization" built on either claim.
Publishers: take the money. The Publishers' Program shares ad revenue on cited content; Comet Plus (live October 2025) shares 80% of subscription revenue from a $42.5M initial pool across visits, citations and agent actions.
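As a rough illustration of what "extractable" can mean in practice, the sketch below counts crude proxies for the three tactics named in the GEO research: statistics, quotations and cited sources. It is a toy heuristic built on assumptions. The regular expressions, the function name `extractability_signals` and the sample sentence are all invented for this example, and it is not how the GEO paper, the Citability Checker or Perplexity measure anything.

```python
import re


def extractability_signals(text: str) -> dict:
    """Count rough proxies for statistics, quotations and cited sources."""
    # Bracketed numeric references such as [1] or [12].
    citations = re.findall(r"\[\d+\]", text)
    # Remove citation markers so their digits are not counted as statistics.
    without_citations = re.sub(r"\[\d+\]", " ", text)
    # Numbers with optional decimals and percent sign, e.g. "28.6%" or "2025".
    statistics = re.findall(r"\d+(?:\.\d+)?%?", without_citations)
    # Spans wrapped in straight double quotes.
    quotations = re.findall(r'"[^"]+"', text)
    return {
        "statistics": len(statistics),
        "quotations": len(quotations),
        "cited_sources": len(citations),
    }


# Invented sample text, used only to exercise the function.
SAMPLE = 'Revenue grew 28.6% in 2025 [1]. "We doubled our customer base," said the CEO [2].'
print(extractability_signals(SAMPLE))
```

A count of zero across all three for a key paragraph is a reasonable prompt to add a concrete figure, a named quote or a source, though the counts themselves carry no known weight with any engine.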

Honesty Box: What's Unverified

"Perplexity handles 1.2–1.5B queries/month in 2026" — extrapolation; the last confirmed figure is 780M/month (May 2025).
"Perplexity supports llms.txt" — no official statement exists.
Average citations per answer (~5–10 claimed) — not reliably documented.
Context worth knowing: Perplexity faces active copyright suits (Dow Jones/NY Post, Nikkei, Britannica, NYT) — the publisher-relations picture is evolving.

Frequently Asked Questions

How does Perplexity choose its citations?

It retrieves from its own PerplexityBot-built index and selects sources that disproportionately overlap with strong Google rankings (28.6% from Google's top 10 — the highest of any AI engine) plus heavy third-party platform weighting: YouTube, Reddit and Wikipedia together supply over half of all citations.

How do I check if Perplexity can crawl my site?

Confirm PerplexityBot isn't disallowed in robots.txt, then check your CDN/WAF logs: Cloudflare has blocked AI crawlers by default since mid-2025, and similar providers may do the same. Perplexity publishes its bot IP ranges for allowlisting. Remember that Perplexity-User (live fetches) generally ignores robots.txt, so answer-time fetching may still work even where robots.txt blocks indexing.
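One way to do the log check is to tally HTTP status codes for requests whose user agent names a Perplexity crawler. The Python sketch below assumes logs exported in the common "combined" access-log format; your CDN or WAF may use a different layout, in which case the pattern needs adjusting. The sample lines, IP addresses (taken from the reserved documentation range 192.0.2.0/24) and user-agent strings are made up for illustration.

```python
import re
from collections import Counter

# Combined Log Format: ip - - [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"$'
)


def perplexity_status_counts(lines):
    """Tally status codes for requests whose user agent mentions Perplexity."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Matches both PerplexityBot and Perplexity-User by substring.
        if match and "perplexity" in match["ua"].lower():
            counts[int(match["status"])] += 1
    return counts


# Invented sample lines; real user-agent strings and IPs will differ.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jun/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "PerplexityBot/1.0 (sample)"',
    '192.0.2.10 - - [01/Jun/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 403 312 "-" "PerplexityBot/1.0 (sample)"',
    '192.0.2.77 - - [01/Jun/2026:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (sample browser)"',
]

counts = perplexity_status_counts(SAMPLE_LOG)
print(dict(counts))
```

A high proportion of 403 or 429 responses for these user agents suggests a block somewhere in the stack. User-agent strings can be spoofed, so for anything beyond a quick look, compare the requesting IPs against the ranges Perplexity publishes.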

Is Perplexity worth optimizing for at roughly 8% referrer share?

Yes, for two reasons: effort transfers (its citation signals overlap most with classic SEO, so the work pays everywhere), and its users are disproportionately researchers and buyers running comparison queries — including the AI agents researching PR pricing that we track on our own site.

What content formats does Perplexity cite most?

Pages with clear factual claims, statistics, and quotable statements; comparison and pricing content; and third-party validation surfaces (reviews, forums, video). Press releases structured for extraction — clear facts, figures, structured data as clean text — perform well, which is how Pressonify structures every release.

The Bottom Line

Perplexity rewards exactly the assets you should already be building: content that ranks, facts that quote cleanly, and a presence on the platforms buyers actually check. The only Perplexity-specific work is infrastructure (unblock the bots) and measurement — knowing whether you're actually cited. Pressonify publishes releases structured for AI extraction and tracks Perplexity citations closed-loop, so you get proof instead of hope.

Next: see how the other engines differ — ChatGPT Search, Google Gemini, Google AI Overviews — or the full AI search platform comparison.
