TL;DR
- OAI-SearchBot fetches live data for ChatGPT answers and should be allowed.
- GPTBot scrapes web data for training future AI models and can be blocked.
- ChatGPT-User is a third bot that acts on behalf of a user browsing the web.
- Use a specific User-agent rule in robots.txt to block GPTBot while allowing the others.
Blocking OpenAI's web crawlers is a popular topic, but most guides get it wrong. They advise a blanket ban that hurts your visibility in ChatGPT answers. The correct approach is surgical. You need to block the training bot while giving full access to the bot that generates live search results.
This guide provides the exact robots.txt configuration to allow OAI-SearchBot while blocking GPTBot. This lets you prevent your content from being used for model training without sacrificing your ability to be cited in real-time ChatGPT responses.
The Exact robots.txt Configuration to Allow OAI-SearchBot (While Blocking GPTBot)
Here is the precise code. Place it in the robots.txt file at the root of your domain (yourdomain.com/robots.txt). Each User-agent line starts its own rule group, and a crawler obeys the group that matches its name most specifically, so the two groups below do not conflict.
```
# Block OpenAI's training bot
User-agent: GPTBot
Disallow: /

# Allow OpenAI's live search bot
User-agent: OAI-SearchBot
Allow: /
```
How This Configuration Works
The robots.txt file is read as a set of rule groups. Each crawler obeys only the group whose User-agent value matches it and ignores the rest.
- User-agent: GPTBot: When the GPTBot crawler visits your site, it matches this group. The Disallow: / directive tells it not to crawl any page on your site, so it leaves without fetching content.
- User-agent: OAI-SearchBot: When the OAI-SearchBot crawler visits, it skips the GPTBot group because the user-agent does not match. It obeys its own group, where the Allow: / directive gives it permission to crawl everything.
This setup achieves the goal perfectly. You opt out of training data collection but remain eligible for inclusion in live ChatGPT answers that use web search.
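You can sanity-check these rules before deploying them. The sketch below uses Python's standard urllib.robotparser to evaluate the exact configuration above against a sample URL (example.com stands in for your own domain):

```python
from urllib.robotparser import RobotFileParser

# The exact rules from the configuration above.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "https://example.com/any-page"
print(parser.can_fetch("GPTBot", url))         # False: the training bot is blocked
print(parser.can_fetch("OAI-SearchBot", url))  # True: the search bot may crawl
print(parser.can_fetch("ChatGPT-User", url))   # True: unlisted bots are implicitly allowed
```

The third check also demonstrates the implicit-allow behavior: a user-agent with no matching group is permitted by default.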
Why You Must Differentiate Between OpenAI Bots
Failing to distinguish between OpenAI's crawlers is a critical AEO mistake in 2026. They serve completely different functions. Treating them the same means you either give away your content for free model training or you disappear from ChatGPT's search results entirely.
Understanding OpenAI's User-Agents
OpenAI uses at least three distinct crawlers, each with a specific job. A blanket Disallow for all of them is a mistake.
| User-Agent | Primary Function | Should You Allow It? |
|---|---|---|
| GPTBot | Data collection for training future AI models. | Optional (block it to opt out of training) |
| OAI-SearchBot | Real-time web retrieval to answer user prompts in ChatGPT. | Yes (Critical) |
| ChatGPT-User | Acts on behalf of a user browsing a specific page via a GPT. | Yes (Critical) |
- GPTBot: This is the bot that scrapes the web to feed OpenAI's training datasets. Blocking this bot prevents your content from being used to build future versions of their models. There is no direct, immediate traffic or citation benefit from allowing it. For a deeper look, you can read a complete guide to OpenAI web crawlers.
- OAI-SearchBot: This is the bot that matters for visibility. When a ChatGPT user asks a question that requires current information, this bot performs a live web search. If your site is blocked to this bot, you cannot be cited as a source in the answer.
- ChatGPT-User: This bot is triggered when a user in ChatGPT clicks a link or asks a GPT to visit a specific URL. Blocking this bot breaks the user experience and prevents them from accessing your content through the AI interface.
The distinction is clear. GPTBot is for OpenAI's benefit. OAI-SearchBot and ChatGPT-User are for the user's benefit, which in turn benefits you through citations and traffic.
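Putting the table together, a robots.txt that addresses all three crawlers explicitly might look like this. The explicit Allow lines for OAI-SearchBot and ChatGPT-User are technically redundant, since unlisted bots are allowed by default, but they document your intent:

```
# Opt out of model training
User-agent: GPTBot
Disallow: /

# Keep live search and user browsing open
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
```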
How to Implement and Verify in WordPress
You can edit your robots.txt file directly, but using a plugin is safer and avoids syntax errors.
Manual Method
If you use an FTP client, you can find the robots.txt file in the root directory of your WordPress installation. If it doesn't exist, you can create a new plain text file named robots.txt and upload it. Add the rules exactly as shown above.
Some traditional SEO plugins like Yoast or Rank Math also provide a built-in editor for the robots.txt file. You can find this in their settings or tools section.
Automated Method with AEO God Mode
A dedicated AEO plugin handles this automatically. The AEO God Mode plugin includes an AI Crawler Allowlist module. It identifies 18 different AI crawlers, including all three from OpenAI.
You can simply toggle GPTBot to "off" while leaving OAI-SearchBot and ChatGPT-User "on". The plugin will generate the correct, optimized robots.txt file for you without any manual editing. This removes the risk of a typo accidentally blocking every crawler from your site.
Verifying Your Configuration
After you've updated your robots.txt file, you need to confirm it's working.
- Google Search Console: The robots.txt report in Search Console shows whether your file can be fetched and flags syntax errors, but it only covers Google's own crawlers. To test rules against GPTBot or OAI-SearchBot, use a third-party robots.txt testing tool or check the rules yourself with a parser.
- Server Logs: The most reliable method is to check your server logs for AI bot traffic. After a few days, you should see entries for OAI-SearchBot but none for GPTBot. Some plugins, like the AEO God Mode crawler log, provide a clean dashboard view of this data inside WordPress.
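As a lightweight alternative to a dashboard, a few lines of Python can tally crawler hits from an access log. This is a sketch: the bot names are the documented OpenAI user-agent tokens, but the sample log lines are simplified stand-ins for real entries (in practice you would read lines from your server's log file):

```python
from collections import Counter

OPENAI_BOTS = ("OAI-SearchBot", "ChatGPT-User", "GPTBot")

def count_bot_hits(log_lines):
    """Tally requests per OpenAI crawler by matching the user-agent token."""
    counts = Counter({bot: 0 for bot in OPENAI_BOTS})
    for line in log_lines:
        for bot in OPENAI_BOTS:
            if bot in line:
                counts[bot] += 1
    return counts

# Simplified sample lines; real entries also carry timestamps, paths, etc.
sample_log = [
    '203.0.113.7 - - "GET /post HTTP/1.1" 200 "OAI-SearchBot/1.0"',
    '203.0.113.9 - - "GET /post HTTP/1.1" 200 "ChatGPT-User/1.0"',
]

for bot, hits in count_bot_hits(sample_log).items():
    print(f"{bot}: {hits}")  # GPTBot should stay at zero once the block is live
```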
The Broader Context: robots.txt vs. llms.txt
Your robots.txt file is just one tool for managing AI crawlers. It's a simple allow or disallow instruction. An emerging, more detailed method is the llms.txt file.
While robots.txt says "enter" or "do not enter," llms.txt provides a detailed roadmap for crawlers that are allowed in. It can suggest which content is most important, define usage policies, and provide contact information. The two files work together. You can learn more about the differences between llms.txt vs. robots.txt for managing AI.
For now, robots.txt remains the universally respected standard for controlling crawler access. Getting these rules right is a foundational step for any serious Answer Engine Optimization strategy.
Frequently Asked Questions
Do I need an explicit Allow rule for every bot I want to permit?
No. robots.txt files have an implicit "allow" for any user-agent not specifically mentioned. You only need an explicit Allow rule when a broader Disallow might otherwise block that bot.