How Perplexity picks its sources
The Perplexity source selection algorithm is not public, but empirical observation of more than 5,000 analyzed responses reveals a consistent four-step logic. Understanding this logic turns a GEO strategy from intuitive to directed: you know which levers decide whether your source is retained.
Step 1 — Query expansion
The user prompt is reformulated into 3-5 web sub-queries by the Sonar LLM. Example: "best US ESG asset manager" becomes "top US asset managers ESG ratings 2026", "US ESG asset management leaders", "sustainable asset managers US AUM". Brand implication: your content must rank on semantic variations, not just the exact keyword.
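As a rough way to test coverage of semantic variations, the sketch below measures how many terms of each sub-query appear in a page's text. The sub-queries are the ones from the example above; the page text, the `coverage` function, and the term-overlap metric itself are illustrative assumptions, a crude proxy rather than how Sonar actually matches content.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric terms of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def coverage(page_text: str, sub_queries: list[str]) -> dict[str, float]:
    """Share of each sub-query's terms that also appear in the page text."""
    page_terms = tokens(page_text)
    result = {}
    for query in sub_queries:
        query_terms = tokens(query)
        result[query] = len(query_terms & page_terms) / len(query_terms)
    return result

# Sub-queries taken from the example in the text.
SUB_QUERIES = [
    "top US asset managers ESG ratings 2026",
    "US ESG asset management leaders",
    "sustainable asset managers US AUM",
]

# Hypothetical page text, for illustration only.
PAGE = ("Our firm is among the top US asset managers by ESG ratings, "
        "with sustainable funds and AUM growth.")

scores = coverage(PAGE, SUB_QUERIES)
for query, share in scores.items():
    print(f"{share:.0%}  {query}")
```

A low share on one variation is a hint that the page does not address that phrasing at all. It says nothing about actual ranking.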
Step 2 — Multi-source crawl
Each sub-query is executed against the Perplexity web index, which combines its own crawl with partnerships with engines such as Bing. Between 30 and 50 results are retrieved. Implication: your site must be crawlable by PerplexityBot AND Bingbot (the latter is often forgotten). Check robots.txt and submit your site to Bing Webmaster Tools.
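The robots.txt check can be scripted with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt content and the page URL are made-up examples, and in practice you would load your site's real file instead of the inline sample.

```python
from urllib.robotparser import RobotFileParser

# Crawler names from the text above.
CRAWLERS = ("PerplexityBot", "Bingbot")

def crawl_access(robots_txt: str, page_url: str) -> dict[str, bool]:
    """Return, per crawler, whether robots_txt allows fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, page_url) for bot in CRAWLERS}

# Hypothetical robots.txt that blocks PerplexityBot but allows everyone else.
SAMPLE_ROBOTS = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

access = crawl_access(SAMPLE_ROBOTS, "https://www.example.com/funds/esg")
for bot, allowed in access.items():
    print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

With this sample file, Bingbot is allowed and PerplexityBot is blocked, which is the kind of silent misconfiguration the check is meant to catch.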
Step 3 — Ranking by authority + relevance
The 30-50 results are reranked on four criteria: domain authority (PageRank-like, with a bias toward Wikipedia, .edu domains, and established press), recency for time-sensitive queries, semantic relevance (embedding similarity between the question and the page), and content structure (structured data, lists, and clear headers are preferred). The 5-10 finalists feed the LLM context.
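To make the reranking idea concrete, here is a minimal sketch of a weighted score over the four criteria. Everything numeric in it is an assumption: the real weights and signals are not public, the 0-to-1 scores are invented, and `Candidate`, `WEIGHTS`, and `rerank` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    authority: float  # domain authority, scaled 0..1 (assumed scale)
    relevance: float  # semantic similarity to the question, 0..1
    recency: float    # freshness, 0..1
    structure: float  # structured data, lists, headers, 0..1

# Hypothetical weights; the actual weighting is not public.
WEIGHTS = {"authority": 0.4, "relevance": 0.3, "recency": 0.15, "structure": 0.15}

def score(candidate: Candidate) -> float:
    """Weighted sum of the four criteria."""
    return sum(w * getattr(candidate, name) for name, w in WEIGHTS.items())

def rerank(candidates: list[Candidate], keep: int = 10) -> list[Candidate]:
    """Sort by descending score and keep the top `keep` finalists."""
    return sorted(candidates, key=score, reverse=True)[:keep]

# Invented candidates: an encyclopedia page, a recent corporate site, a press article.
pool = [
    Candidate("https://en.wikipedia.org/wiki/Example", 0.95, 0.70, 0.60, 0.90),
    Candidate("https://www.example-corp.com/about", 0.20, 0.80, 0.90, 0.30),
    Candidate("https://www.example-press.com/article", 0.80, 0.75, 0.90, 0.60),
]

finalists = rerank(pool, keep=2)
for c in finalists:
    print(f"{score(c):.3f}  {c.url}")
```

Under these assumed weights, the corporate page loses despite having the highest relevance, because its authority and structure scores are low.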
Step 4 — Extraction and synthesis
The 5-10 best sources are passed to the LLM (Sonar or Pro), which writes the answer and attaches each sentence to 1-3 sources. A brand mentioned in the synthesis was extracted from at least one of these 5-10 sources. Implication: a brand has two routes to a mention. Either be one of the sources, or be mentioned within one of them.
Source profile that ranks well
The typical profile: an established domain (10+ years), more than 50k monthly organic visits, factual structured content, and regular updates. Wikipedia ticks every box, hence its systematic over-representation (32 % of cross-LLM citations). Recent corporate sites with narrative content and weak organic traffic tend to be eliminated at the ranking step.
Perplexity US B2B source distribution (Q1 2026)
Wikipedia 35 % · Bloomberg 22 % · Reuters 16 % · Pensions&Investments 13 % · Barron's 9 % · rest 5 %. For US asset management, these 5 sources together account for 95 % of Perplexity citations (35 + 22 + 16 + 13 + 9).
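Summing the listed shares shows how concentrated the citations are. The sketch below uses only the percentages given above and computes the running total across the named sources.

```python
# Citation shares (in %) as listed in the text.
shares = {
    "Wikipedia": 35,
    "Bloomberg": 22,
    "Reuters": 16,
    "Pensions&Investments": 13,
    "Barron's": 9,
    "rest": 5,
}

# Running total over the named sources, largest first.
named = {name: pct for name, pct in shares.items() if name != "rest"}
cumulative = []
running = 0
for name, pct in sorted(named.items(), key=lambda item: item[1], reverse=True):
    running += pct
    cumulative.append((name, running))

for name, total in cumulative:
    print(f"{name:<22} cumulative {total} %")
```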
Implication for your strategy
Identifying the 5-10 most-cited sources in YOUR industry is the first strategic action. Do it via your GEO tool or by manually analyzing 50 Perplexity responses on industry prompts. Once identified, your third-party authority strategy becomes targeted: prioritize presence on these 5-10 specific sources rather than spreading effort.
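The manual analysis can be reduced to a simple count once the cited URLs of each response have been collected. A minimal sketch, assuming you have already gathered the citations by hand or exported them from your tool; the sample URLs and the `top_cited_domains` helper are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def top_cited_domains(responses: list[list[str]], n: int = 10) -> list[tuple[str, int]]:
    """Count in how many responses each domain is cited; return the top n."""
    counts = Counter()
    for cited_urls in responses:
        # Use a set so a domain cited twice in one response counts once.
        domains = {
            urlparse(url).netloc.lower().removeprefix("www.")
            for url in cited_urls
        }
        counts.update(domains)
    return counts.most_common(n)

# Invented sample: each inner list holds the URLs cited by one response.
sample_responses = [
    ["https://en.wikipedia.org/wiki/ESG", "https://www.bloomberg.com/a"],
    ["https://en.wikipedia.org/wiki/X", "https://www.reuters.com/b",
     "https://www.bloomberg.com/c"],
    ["https://en.wikipedia.org/wiki/Y"],
]

top = top_cited_domains(sample_responses, n=2)
for domain, hits in top:
    print(f"{domain}: cited in {hits} responses")
```

Counting per response rather than per URL keeps one heavily footnoted answer from distorting the ranking.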
Difference from Google
Google ranks 10 results. Perplexity ranks 30-50, retains 5-10, and cites 3-5 in the final answer. This double reduction explains why the Perplexity citation rate is more binary: you are either in the retained top 5-10 or invisible. There is no equivalent of Google's positions 8-15, which still drive some traffic.