TL;DR
- The “Not Crawled by AI” error means Google-Extended, the crawler for Gemini, is blocked or skipping your pages.
- The most common cause is a robots.txt file that does not grant the Google-Extended user-agent access to your pages.
- This error can also signal content quality issues, like a lack of E-E-A-T signals or poor semantic structure.
- Fixing it requires both a technical check of robots.txt and a content audit to ensure your pages are valuable for AI training.
One day your site is a trusted source for AI; the next, Google Search Console flags half your content with a new, worrying status: "Not Crawled by AI". This isn't a typical indexing problem. It’s a sign that your content is invisible to the AI models powering Google's future, like Gemini and AI Overviews. Successfully fixing "Not Crawled by AI" errors in Google Search Console is about more than just tweaking a file. It’s about proving your content is worth the crawl.
This error tells you one of two things. Either you have a direct technical block, or Google's AI systems have reviewed your page and decided it lacks the value needed for training their models. We will cover how to diagnose and fix both scenarios.
What "Not Crawled by AI" Actually Means in 2026
This status is new and specific. It relates only to the Google-Extended user-agent. This crawler is used to collect training data for Google's AI models, including Gemini.
Crucially, this error does not affect your traditional Google Search rankings. A page can rank perfectly fine in the classic blue links while still being ignored by Google's AI crawlers. The block only impacts your visibility within AI-generated answers and your content's inclusion in future model training.
The error falls into two categories:
- Hard Block: Your `robots.txt` file explicitly or implicitly disallows the `Google-Extended` crawler. This is a direct instruction that Google will obey.
- Soft Block (De-prioritization): Your `robots.txt` allows access, but Google chooses not to crawl the page. This is a quality signal. The crawler has determined the content is not valuable enough to spend resources on for AI training purposes.
The First Step: Fixing "Not Crawled by AI" Errors in Google Search Console with robots.txt
Before you touch your content, you must rule out the simple technical block. The Google-Extended user-agent is separate from the standard Googlebot and needs its own rule group in your robots.txt file that explicitly grants it access.
Many website owners assume allowing Googlebot is enough. It is not. Add the following lines to your robots.txt file, located at `yourdomain.com/robots.txt`:
```
User-agent: Google-Extended
Disallow:
```
Leaving the `Disallow:` field blank for the `Google-Extended` user-agent grants it full access to your site. This single change resolves the majority of "Not Crawled by AI" errors. If you want to learn more about this specific bot, you can read a full guide on what Google-Extended is and why it matters.
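You can sanity-check your rules before deploying them with Python's standard `urllib.robotparser` module. This is a minimal sketch; the robots.txt contents and sample paths below are placeholders, not your live file.

```python
# Verify that a robots.txt ruleset grants Google-Extended access.
# Uses only the Python standard library; the rules and paths
# below are illustrative placeholders.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: Google-Extended
Disallow:

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# An empty Disallow for Google-Extended grants it full access,
# even to paths blocked for other crawlers.
print(parser.can_fetch("Google-Extended", "/blog/my-post/"))  # True
print(parser.can_fetch("Google-Extended", "/private/page"))   # True
print(parser.can_fetch("SomeOtherBot", "/private/page"))      # False
```

Running this against your own robots.txt contents before you publish a change is a fast way to catch an accidental hard block.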
The AEO God Mode plugin's AI Crawler Allowlist module handles this automatically. It detects 18 different AI crawlers and correctly formats your robots.txt file to grant access without manual editing.
Beyond robots.txt: When the Block Isn't Technical
If your robots.txt is correct but you still see errors, the problem lies with your content's perceived value. AI crawlers operate on a "value budget," not just a crawl budget. They are sent to find content with strong signals of Experience, Expertise, Authoritativeness, and Trust (E-E-A-T).
Data shows that 96% of citations in Google AI Overviews come from sources with high E-E-A-T. If your pages lack these signals, Google-Extended will pass them over.
Key areas to audit include:
- Author Attribution: Is the content written by a named author with a clear bio?
- Factual Density: Are your claims backed by data and links to authoritative sources?
- Originality: Does the page contain original research, data, or insights not found elsewhere?
Improving your E-E-A-T often starts with clear authorship. A complete guide on setting up author schema in WordPress for 2026 provides the technical steps to signal expertise directly to crawlers.
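The author markup that guide covers boils down to a `Person` object embedded as JSON-LD. Here is a minimal sketch generated with Python's `json` module; every name, URL, and credential below is a placeholder to swap for your real author details.

```python
# Minimal Person (author) schema sketch. All values are
# placeholders -- replace them with your real author details.
import json

author_schema = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",                      # a named author, not "admin"
    "url": "https://example.com/author/jane-doe/",
    "jobTitle": "Technical SEO Consultant",
    "sameAs": [                              # profiles that corroborate expertise
        "https://www.linkedin.com/in/janedoe",
    ],
}

# Embed the result in the page head as JSON-LD.
json_ld = f'<script type="application/ld+json">{json.dumps(author_schema)}</script>'
print(json_ld)
```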
The Multi-Modal Content Requirement
AI models in 2026 are multi-modal. They process text, images, and video together. The single strongest predictor of AI citation is Multi-Modal Content Integration, with a correlation score of r=0.92.
Pages that combine text with original images and short explainer videos are up to 317% more likely to be crawled and cited. Google-Extended actively seeks this content to train its models. A page with only text is seen as incomplete and is a prime candidate for a "Not Crawled by AI" soft block.
To fix this, review your most important pages.
- Add original images with descriptive alt text. Avoid generic stock photos.
- Embed a short (60-90 second) video that summarizes the key concept.
- Ensure all visual assets are marked up with `ImageObject` and `VideoObject` schema.
This strategy is fundamental to how you can appear in Google AI Overviews.
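The `ImageObject` and `VideoObject` markup mentioned above can be sketched like this; all URLs, dates, and durations are placeholder values for illustration.

```python
# Sketch of VideoObject and ImageObject markup for a multi-modal
# page. URLs, dates, and durations are placeholders.
import json

video = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "Fixing 'Not Crawled by AI' in 90 Seconds",
    "description": "A short explainer summarizing the key concept.",
    "thumbnailUrl": "https://example.com/thumb.jpg",
    "contentUrl": "https://example.com/explainer.mp4",
    "uploadDate": "2026-01-15",
    "duration": "PT1M30S",  # ISO 8601 duration: 90 seconds
}

image = {
    "@context": "https://schema.org",
    "@type": "ImageObject",
    "contentUrl": "https://example.com/original-diagram.png",
    "caption": "Original diagram, not a stock photo",
}

# Emit each object as its own JSON-LD script tag.
for block in (video, image):
    print(f'<script type="application/ld+json">{json.dumps(block)}</script>')
```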
Semantic Structure and Schema: Your AI Roadmap
An AI crawler needs a clear map to understand your content's structure and key takeaways. Vague, unstructured content is difficult to parse and is often skipped. You provide this map through clean HTML structure and detailed schema markup.
- Structural Clarity: Use H2 and H3 headings for all main ideas. Use lists and HTML tables to structure comparative data. Listicles and tables account for 50% of all top AI citations.
- Semantic Completeness: Ensure your content fully answers a query without ambiguity. AI prefers "Answer Islands," or self-contained passages of 130-167 words.
- Schema Coverage: JSON-LD schema is the most direct way to explain your content to a machine. An automated schema engine can ensure every page has the correct markup, from `Article` to `FAQPage`.
A clear structure makes your content predictable and valuable.
| Signal | Attracts AI Crawlers | Repels AI Crawlers |
|---|---|---|
| Content Format | Text, original images, and video combined | Text-only articles |
| Schema | FAQPage, VideoObject, Person schema | Missing or invalid structured data |
| E-E-A-T | Named author with bio, outbound citations | Anonymous content, unsupported claims |
| Structure | Clear H2/H3s, HTML tables, lists | Large, unbroken blocks of text |
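One rough way to audit your passages against the 130-167 word "Answer Island" range mentioned above is a simple per-paragraph word count. This sketch splits on whitespace only; real tokenization and the exact range an AI model prefers will vary.

```python
# Rough audit of paragraph length against the 130-167 word
# "Answer Island" range cited above. Plain whitespace splitting,
# not a real tokenizer.

def answer_island_check(paragraph: str, low: int = 130, high: int = 167) -> str:
    words = len(paragraph.split())
    if words < low:
        return f"{words} words: too short to stand alone"
    if words > high:
        return f"{words} words: consider splitting"
    return f"{words} words: within the target range"

# Example: a 150-word passage falls inside the range.
print(answer_island_check("word " * 150))
```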
The Challenge of JavaScript-Heavy Websites
Many modern websites rely heavily on JavaScript to render content. This creates a rich user experience but can make a site completely invisible to AI crawlers.
- ✓ Modern user experience
- ✓ Dynamic content loading
- ✓ Rich interactive elements
- ✗ 69% of AI crawlers cannot render JavaScript
- ✗ Key content may be invisible to Google-Extended
- ✗ Causes “Not Crawled by AI” due to empty pre-rendered HTML
- ✗ Requires server-side rendering (SSR) to fix
If your site uses a JavaScript framework like React or Vue, you must implement server-side rendering (SSR). SSR sends a fully-rendered HTML page to the crawler, ensuring all content is visible on the first pass. Without it, Google-Extended may see a blank page and report it as "Not Crawled by AI" because there was nothing of value to crawl.
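You can approximate what a non-JavaScript crawler sees by stripping tags from the raw server response and measuring the visible text that remains. This is a crude standard-library sketch, and the 200-character threshold is an arbitrary illustration, not a documented crawler limit.

```python
# Approximate what a crawler that cannot run JavaScript sees:
# strip tags from raw HTML and measure the remaining visible text.
# The 200-character threshold is an arbitrary illustration.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data.strip())

def visible_text(raw_html: str) -> str:
    parser = TextExtractor()
    parser.feed(raw_html)
    return " ".join(c for c in parser.chunks if c)

# A client-rendered shell: almost no visible text before JS runs.
spa_shell = '<html><body><div id="root"></div><script>/* app */</script></body></html>'
print(len(visible_text(spa_shell)) < 200)  # True -- likely invisible to AI crawlers
```

Run this against the raw HTML your server actually returns (for example, the output of `curl`) rather than the DOM you see in a browser, which has already executed JavaScript.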
Auditing Your Fixes: How to Verify AI Crawlers Are Visiting
After you've updated your robots.txt file and improved your content, you need to verify that Google-Extended is visiting your site. While Google Search Console data will eventually update, it can lag by days or weeks.
The fastest method is to check your server's raw access logs. You or your hosting provider can access these logs and search for visits from the Google-Extended user-agent string. Seeing new entries after you've made your changes is a positive confirmation that the fix is working. You can also use a plugin to check which AI bots are crawling your site traffic directly from your WordPress dashboard.
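Filtering your access logs for that user-agent string can be done with a few lines of Python. The log lines below are illustrative samples in combined log format; the exact user-agent string Google sends may differ, so match loosely on "Google-Extended".

```python
# Count Google-Extended hits in a combined-format access log.
# The sample lines are illustrative, not real log data.

sample_log = """\
203.0.113.5 - - [10/Jan/2026:10:12:01 +0000] "GET /blog/post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Google-Extended)"
198.51.100.7 - - [10/Jan/2026:10:12:09 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (regular browser)"
203.0.113.5 - - [10/Jan/2026:10:15:44 +0000] "GET /guides/ HTTP/1.1" 200 7300 "-" "Mozilla/5.0 (compatible; Google-Extended)"
"""

hits = [line for line in sample_log.splitlines() if "Google-Extended" in line]
print(f"Google-Extended requests: {len(hits)}")
for line in hits:
    # The request path is the second field inside the first quoted section.
    path = line.split('"')[1].split()[1]
    print("crawled:", path)
```

In practice you would read the real log file (for example, `/var/log/nginx/access.log`) instead of the sample string, and watch for new entries appearing after your fixes go live.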