Strategy 8 min read Published March 21, 2026 Updated May 21, 2026

How to get your business recommended by ChatGPT & Perplexity

AI-referred website sessions grew 527% year-over-year in 2025. Not a typo. People are no longer only typing queries into Google - they are asking ChatGPT, Perplexity, and Claude to find and recommend businesses for them. The referral pipeline from AI systems is growing faster than any channel since mobile.

Here is the question that matters: when someone asks ChatGPT "recommend a web agency in Romania," will your site appear in the answer? Traditional SEO optimizes for Google's crawlers. GEO - Generative Engine Optimization - optimizes for AI systems. These are not the same discipline. And in 2026, you need both.

Key figures:
527% - year-over-year growth in AI referrals
844K+ - sites with llms.txt
2.5x - schema markup AI boost
+28% - freshness citation lift

From search engines to answer engines

Google's model has dominated discovery for two decades: crawl pages, rank them, present ten blue links, let users click through. That model is fragmenting. ChatGPT has over 300 million weekly active users. Perplexity serves millions of daily queries. Google's own AI Overviews now appear on more than half of all searches - often answering the question without a single click.

The shift is behavioral. When someone wants a recommendation - "best Romanian web agency," "top e-commerce developer in Timisoara," "who builds Astro sites in Romania" - they increasingly ask an AI directly instead of browsing ten links and evaluating manually. The AI synthesizes its answer from what it knows about the web and surfaces one or two options. If you are not one of those options, you are invisible.

This "zero-click" dynamic does not replace traditional search - it is a new layer on top of it. Sites that AI systems understand, trust, and can accurately summarize get recommended. Sites that AI systems cannot easily parse, or that present confusing or thin content, become invisible to this new discovery channel. The question is not whether to engage with it. The question is how.

What is llms.txt?

In September 2024, Jeremy Howard of Answer.AI proposed a new convention: a file called llms.txt placed at the root of your website, specifically designed to help large language models understand your site. Think of it as a counterpart to robots.txt: instead of telling crawlers which paths they may or may not fetch, it tells AI systems what your site is, what it does, and where its most important content lives.

The format is simple Markdown: an H1 title, a blockquote summary describing the site, and a series of H2 sections with Markdown link lists pointing to your key pages. The companion file, llms-full.txt, goes further - it contains your complete site content in a single Markdown file, so an AI system can ingest everything in one request instead of crawling page by page.

Why Markdown? Because LLMs consume it natively. There is no HTML parsing overhead, no token waste on navigation menus and cookie banners, no ambiguity from nested div structures. A well-constructed llms.txt hands the AI exactly what it needs in exactly the format it prefers.
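To make the format concrete, here is a minimal Python sketch that renders an llms.txt body with the structure described above (H1 title, blockquote summary, H2 sections of links). The Link class, the build_llms_txt function, the site name, and the example.com URLs are all hypothetical names chosen for illustration; they are not part of the llms.txt proposal.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short description shown after the link


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site and URLs, for illustration only.
    print(build_llms_txt(
        "Example Agency",
        "Web design and development studio based in Romania.",
        {
            "Services": [Link("Web development", "https://example.com/services/web", "Static sites built with Astro")],
            "Blog": [Link("GEO overview", "https://example.com/blog/geo")],
        },
    ))
```

Generating the file from the same data that drives your sitemap or navigation keeps it from drifting out of date as pages are added.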

Adoption has been fast. As of early 2026, over 844,000 websites have implemented llms.txt - including Anthropic, Cloudflare, Stripe, GitBook, and Mintlify. It is not yet an official standard, but it is rapidly becoming a de facto convention. The cost of implementation is negligible. The cost of being late is not.

The full GEO stack

llms.txt is one piece of a larger picture. Genuine AI discoverability requires getting several layers right simultaneously.

robots.txt for AI crawlers

AI crawlers operate in three distinct tiers, and this distinction matters enormously for your strategy. Training bots (GPTBot, anthropic-ai, Google-Extended) consume your content to improve AI models. Search and citation bots (OAI-SearchBot, Claude-SearchBot, PerplexityBot) fetch your pages to generate real-time answers and citations. User-triggered bots (ChatGPT-User, Claude-User) retrieve your pages on behalf of users who share links in AI conversations.

Here is what most people get wrong: blocking training bots does not block search citations. The tiers are independent. You can block GPTBot from training data while OAI-SearchBot still cites your pages in ChatGPT answers - or vice versa. For most businesses whose goal is visibility and recommendation, allowing all tiers is the correct strategy. Your content appearing in AI training data contributes to brand recognition. Your content appearing in citations drives direct traffic.

There are 14 known AI crawler user-agent tokens to configure: GPTBot, OAI-SearchBot, ChatGPT-User, anthropic-ai, Claude-SearchBot, Claude-User, PerplexityBot, Perplexity-User, Google-Extended, Gemini-User, YouBot, Meta-ExternalAgent, Meta-ExternalFetcher, and Applebot-Extended. If visibility is the goal, your robots.txt should explicitly allow all of them.
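As a rough sketch of what that configuration looks like, the Python below generates one robots.txt group per user-agent token from the list above and checks the result with the standard library's urllib.robotparser. The function names and the example.com URLs are assumptions made for illustration. Token names change over time, and real crawlers may match rules differently from the standard-library parser, so verify each token against the vendor's current documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# The 14 user-agent tokens listed in the text above.
AI_USER_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "anthropic-ai", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
    "Google-Extended", "Gemini-User", "YouBot",
    "Meta-ExternalAgent", "Meta-ExternalFetcher", "Applebot-Extended",
]


def build_ai_rules(blocked: frozenset[str] = frozenset()) -> str:
    """Return robots.txt groups that allow every listed agent except those in `blocked`."""
    groups = []
    for agent in AI_USER_AGENTS:
        rule = "Disallow: /" if agent in blocked else "Allow: /"
        groups.append(f"User-agent: {agent}\n{rule}\n")
    # A blank line separates one group from the next.
    return "\n".join(groups)


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Check a rule set with the standard-library robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Example: opt one training bot out while its search counterpart stays allowed.
    rules = build_ai_rules(blocked=frozenset({"GPTBot"}))
    print(rules)
    print(can_fetch(rules, "GPTBot", "https://example.com/blog/"))
    print(can_fetch(rules, "OAI-SearchBot", "https://example.com/blog/"))
```

Because each token gets its own group, blocking a training bot leaves the search and user-triggered bots untouched, which is the independence between tiers described above.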

Structured data (JSON-LD)

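The impact of structured data on AI accuracy can be dramatic. A benchmark published by data.world found that GPT-4's accuracy in answering questions over enterprise data rose from 16% to 54% when the data was represented as a knowledge graph instead of being queried as raw SQL tables. That study measured question answering over databases, not web pages, but it is often cited as evidence that explicit structure helps LLMs. Content with correct structured data is also reported to be 2.5x more likely to appear in AI-generated answers.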

The schemas that matter most for AI discoverability: Organization and LocalBusiness (who you are and where you operate), Service (what you offer, including pricing signals), FAQ (direct question-answer pairs that AI systems consume efficiently), and Article (for blog and editorial content). On a well-built website, these schemas are not optional extras - they are infrastructure.
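For illustration, the Python sketch below builds two of these schemas as JSON-LD using the schema.org types Organization and FAQPage (the schema.org name for FAQ markup), then wraps them in the script element that goes in the page head. The function names, the sample business, and the URLs are hypothetical; a real Organization or LocalBusiness block would normally carry more properties, such as address and contact details.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def organization_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Build a minimal schema.org Organization object."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,  # profiles on other platforms that describe the same business
    }


def to_script_tag(data: dict) -> str:
    """Serialize to the script element that embeds JSON-LD in a page."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so answer text cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical business and question, for illustration only.
    print(to_script_tag(faq_jsonld([
        ("How long does a website project take?", "It depends on scope and content readiness."),
    ])))
```

Keeping the FAQ text in one data structure and rendering both the visible FAQ and the JSON-LD from it avoids mismatches between what users read and what the markup claims.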

Content freshness signals

AI systems weight recency as a ranking signal. Pages updated within the last two months earn 28% more AI citations than stale content. The machine-readable way to expose this is the article:modified_time meta tag - an Open Graph property that tells crawlers when your content was last meaningfully updated.

This is not about gaming timestamps. It is about committing to content maintenance. A service page that was last updated in 2023 signals outdated information regardless of its quality. A page updated this quarter signals active, current expertise. Both humans and AI systems respond to this signal.
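A small Python sketch of both halves of this idea: rendering the article:modified_time tag as an ISO 8601 timestamp, and flagging pages that have gone unreviewed for too long. The function names are hypothetical, and the 60-day default is an assumption that approximates the two-month window mentioned above.

```python
from datetime import datetime, timedelta, timezone
from html import escape


def modified_time_meta(last_modified: datetime) -> str:
    """Render the Open Graph article:modified_time tag as an ISO 8601 timestamp."""
    if last_modified.tzinfo is None:
        raise ValueError("use a timezone-aware datetime so the offset is explicit")
    stamp = last_modified.isoformat(timespec="seconds")
    return f'<meta property="article:modified_time" content="{escape(stamp)}">'


def is_stale(last_modified: datetime, now: datetime, max_age_days: int = 60) -> bool:
    """Flag pages older than the review window (60 days is an assumed default)."""
    return now - last_modified > timedelta(days=max_age_days)


if __name__ == "__main__":
    updated = datetime(2026, 5, 21, 9, 30, tzinfo=timezone.utc)
    print(modified_time_meta(updated))
    print(is_stale(updated, datetime.now(timezone.utc)))
```

The timestamp should come from the last meaningful content change, for example a field in the page's front matter, and not from the build time, or every deploy would mark every page as freshly updated.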

Answer-first content structure

AI systems extract answers from your content. If the answer to a user's question is buried in paragraph seven of a long-form article, it may not surface. Place a direct, complete answer in the first 200 words of any content targeting informational queries. FAQ sections aligned with actual user prompts are high-leverage - write them as if you are directly answering the question someone would type into ChatGPT, not as marketing copy.
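One way to audit the 200-word guideline is a simple check that the phrases carrying the answer actually appear in the opening of the page. The Python sketch below is a rough heuristic under that assumption; the function names are hypothetical, and plain substring matching will not catch paraphrases.

```python
def opening_words(text: str, limit: int = 200) -> str:
    """Return the first `limit` whitespace-separated words, lowercased."""
    return " ".join(text.split()[:limit]).lower()


def missing_from_opening(text: str, key_phrases: list[str], limit: int = 200) -> list[str]:
    """List the key phrases that do not appear within the opening window."""
    opening = opening_words(text, limit)
    # Normalize whitespace in each phrase the same way as the page text.
    return [p for p in key_phrases if " ".join(p.split()).lower() not in opening]


if __name__ == "__main__":
    # Hypothetical page text: the first answer is up front, the second is buried.
    page = (
        "A five-page brochure site typically costs a fixed fee. "
        + "filler " * 300
        + "Astro is a static site generator."
    )
    print(missing_from_opening(page, ["fixed fee", "static site generator"]))
```

Any phrase the function returns is a candidate for moving higher up the page.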

Content with statistics and citations achieves 30-40% higher visibility in AI responses. Every claim you make is more likely to be cited when it is supported by a named source, a specific number, or a study reference. This is not just good journalism - it is GEO strategy.

What we implemented - our case study

We implemented the full GEO stack on apexdigital.ro in preparation for this post. Here is exactly what we did and what we learned.

We created llms.txt at the site root, indexing all 84 pages across our English and Romanian versions. The file includes structured sections for our service categories, blog posts, and case studies - with direct links and short descriptions that give AI systems immediate context about each page's value.

We created llms-full.txt (116KB) - a single Markdown file containing the complete textual content of the site. When an AI system wants a comprehensive understanding of what APEX DIGITAL offers, it can ingest the full file in one request rather than crawling 84 separate pages. For AI-driven research queries, this dramatically improves how accurately we are represented.
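A file like this can be assembled by concatenating each page's Markdown. The Python sketch below shows one possible layout; there is no formal specification for llms-full.txt, so the per-page heading, the "Source:" line, and the "---" separator are assumptions made for illustration, not a description of how our own file is generated.

```python
def build_llms_full(pages: list[tuple[str, str, str]]) -> str:
    """Join (title, url, markdown_body) tuples into one Markdown document."""
    parts = []
    for title, url, body in pages:
        # Each page gets a heading and its source URL so citations can point back to it.
        parts.append(f"# {title}\n\nSource: {url}\n\n{body.strip()}\n")
    return "\n---\n\n".join(parts)


if __name__ == "__main__":
    # Hypothetical pages, for illustration only.
    print(build_llms_full([
        ("Home", "https://example.com/", "Welcome to Example Agency."),
        ("Services", "https://example.com/services", "We build static sites."),
    ]))
```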

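We updated robots.txt with explicit Allow rules for all 14 known AI crawler user agents. Previously our robots.txt was silent on AI crawlers - neither blocking nor explicitly allowing. Under the Robots Exclusion Protocol a crawler with no matching rule is allowed by default, so this did not change access on paper. We added the explicit rules to document our intent and to remove any doubt for citation bots that may be conservative in ambiguous cases.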

We added article:modified_time meta tags to all blog and case study pages. Our blog content was already fresh, but the timestamps needed to be machine-readable for AI systems to act on them.

What we already had in place: Organization and LocalBusiness JSON-LD, Service and FAQ schema on all 25 service pages, and hreflang for our English and Romanian versions. Building a static Astro site from the start meant our pages are fast, clean, and AI-parseable with minimal noise.

One critical lesson from implementation: use ASCII-only characters in llms.txt. Romanian diacritics, curly quotes, and em-dashes cause encoding issues across different AI systems. Stick to plain ASCII for maximum compatibility across all AI crawlers, regardless of their character set handling.
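A minimal Python sketch of that clean-up step, assuming transliteration is acceptable for your content: typographic punctuation is mapped to ASCII equivalents, then diacritics are stripped through Unicode NFKD decomposition. The to_ascii name and the punctuation table are illustrative choices; characters with no decomposition are silently dropped, so review the output before publishing.

```python
import unicodedata

# Punctuation mapped by hand; written as escapes so this source file itself
# stays ASCII-only.
PUNCTUATION = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u201e": '"',                  # low double quote used in Romanian
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
    "\u00a0": " ",                  # non-breaking space
})


def to_ascii(text: str) -> str:
    """Replace typographic punctuation, then strip diacritics via NFKD."""
    text = text.translate(PUNCTUATION)
    # NFKD splits a letter such as s-with-comma-below into "s" plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII code points removes the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("Timi\u0219oara"))
```

Running the generated llms.txt through a check such as str.isascii() in the build step catches any character that slips through.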

Content strategies for AI recommendations

Technical implementation gets you into the game. Content strategy wins it. Here is what actually moves the needle for AI-generated recommendations.

Named, credentialed authors. AI systems appear to reward the same E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) signals that Google's quality guidelines describe. Content attributed to a specific person with a professional profile - not just "The Team" - carries more authority. This is doubly true for technical and advisory content.

FAQ sections that mirror AI prompts. Think about the exact phrasing someone would use when asking ChatGPT a question about your industry. "How much does a website cost in Romania?" "What is the difference between WordPress and Astro?" Write FAQ entries that answer those exact questions. The closer your FAQ matches the user's prompt, the more likely an AI system is to surface your answer.

Consensus signals. AI systems look for consistent brand presence across multiple sources. Your site, your Google Business Profile, Clutch, DesignRush, LinkedIn, and any industry directories should all describe your business consistently. Inconsistency creates ambiguity - and AI systems resolve ambiguity by ignoring the conflicting source.

Third-party validation. Reviews on Clutch, G2, or industry-specific platforms are citation gold for AI systems. A claim you make about yourself is one signal. The same claim made by a third-party review platform is a much stronger signal.

Original data and benchmarks. The single most AI-citeable type of content is original research with specific numbers. Case studies with concrete metrics, benchmark reports, or survey data your business produces are highly likely to be cited. Generic marketing content is not. Our marketing services include helping clients produce and distribute this type of content.

Publishing consistency. Aim for at least two pieces of substantive content per week. AI systems weight recency and consistency. A site that published 40 posts in 2022 and nothing since registers as potentially dormant. A site publishing regularly in 2026 registers as active and current.

GEO is a layer, not a replacement

Nothing in this article replaces SEO. Google's blue links still drive the majority of web traffic. Structured data, quality content, page speed, and authority signals serve both traditional search and AI discoverability simultaneously. GEO is an additive layer - it extends your existing SEO investment rather than competing with it.

But the cost of ignoring it is growing every quarter. AI referral traffic grew 527% last year. Even if that rate slows, growth from a small base compounds quickly. The businesses that implement GEO now build AI authority while their competitors are still debating whether it matters. By the time this becomes an obvious priority, the early movers will have a lead that takes months to close.

The implementation cost is low. A well-structured llms.txt takes a few hours. Updated robots.txt rules take minutes. Adding article:modified_time to your templates is a one-line change. The technical barrier is negligible. What separates businesses that capture AI referral traffic from those that do not is almost entirely a question of whether they acted.

If your competitors implement this and you do not, they get recommended and you do not. The AI answer box has limited real estate. It is not a ten-blue-links situation. There is one answer, sometimes two. You want to be that answer.

Ready to run the full audit on your own site? The 2026 GEO Audit is the 47-check framework APEX runs on every client engagement, organized into seven sections from crawl access to monitoring. It is the implementation companion to this overview.

Frequently asked questions

What is llms.txt and do I need one?

llms.txt is a Markdown file placed at the root of your website that helps AI systems understand your site structure and content. Proposed by Jeremy Howard of Answer.AI in September 2024, it has been adopted by over 844,000 websites including Anthropic, Cloudflare, and Stripe. While not yet an official standard, it is rapidly becoming a de facto convention. If your business benefits from being discovered and recommended by AI assistants like ChatGPT, Perplexity, or Google AI Overviews, then yes - you should have one.

Will AI crawlers ignore my site if I don't have llms.txt?

No. AI systems crawl and index websites regardless of whether llms.txt exists. However, llms.txt makes your content significantly easier for AI to parse and understand. Think of it like the difference between giving someone a well-organized briefing document versus asking them to read your entire website. Both work, but the briefing document leads to more accurate and favorable representation.

Should I block AI training crawlers in robots.txt?

For most businesses seeking visibility, no. AI crawlers operate in three tiers: training bots (GPTBot, anthropic-ai), search bots (OAI-SearchBot, Claude-SearchBot), and user-triggered bots (ChatGPT-User). Blocking training bots does not block search citations - the tiers are independent. If your content being used to train AI models benefits your brand visibility, allowing all tiers is generally the right strategy.

How do I know if AI systems are recommending my website?

Test directly: ask ChatGPT, Perplexity, and Google AI Overviews questions that should surface your business. For example, "recommend a web agency in [your city]" or "best [your service] provider in [your country]." Tools like Otterly and Peec AI can automate this monitoring. Google Search Console is also beginning to surface AI Overviews impression data.

What is the difference between SEO and GEO?

SEO (Search Engine Optimization) focuses on ranking in traditional search results - Google's blue links. GEO (Generative Engine Optimization) focuses on being cited and recommended by AI systems like ChatGPT, Perplexity, and Google AI Overviews. They share foundations (structured data, quality content, authority signals) but GEO adds new requirements: llms.txt files, explicit AI crawler permissions, answer-first content structure, and freshness signals. You need both.

Want your website optimized for AI discovery?

We build sites that rank on Google AND get recommended by ChatGPT. From structured data to llms.txt to content strategy, we handle the full stack. Read how we proved it on our own site.