Beyond Deletion: What ChatGPT’s Use of Hidden Wikipedia Pages Reveals About AI Reputation


For years, Google has been the ultimate arbiter of online visibility. If a page didn’t appear in Google’s index, it effectively didn’t exist in the public eye. Brands, communicators, and reputation managers learned to play by Google’s rules — optimizing what could be found, fixing what was misleading, and deleting what was outdated.

But artificial intelligence is rewriting those rules in real time.

Recently, we made a surprising discovery that raises new questions about how AI systems like ChatGPT access and represent information — and what that means for brands. ChatGPT cited Wikipedia pages that not only weren’t indexed by Google but, in some cases, had been deleted from Wikipedia entirely months earlier.

In other words: ChatGPT appears to be referencing information that no longer exists on the open web.

This finding, while seemingly small, points to a much larger shift. It suggests that ChatGPT operates from a different kind of index — one not governed by Google or Bing, but by the model’s own memory and training data. And that has profound implications for online reputation.

1. The Discovery

At Five Blocks, we regularly analyze how information about companies and individuals appears across platforms — from search results to knowledge panels to AI-generated summaries.

During one of these analyses, we noticed something odd: ChatGPT was citing a Wikipedia page about a company that had been deleted from Wikipedia months earlier. Even more surprising, the page had never been indexed by Google — likely due to Wikipedia’s internal restrictions on certain pages and drafts.

In short, ChatGPT seemed to know something that should have been impossible for it to know.

When we tested further, we found similar examples. In some cases, ChatGPT referenced archived or draft Wikipedia pages that were not accessible through normal search. 

This means ChatGPT’s knowledge base includes content that is invisible to both users and search engines — a sort of ghost archive of the internet.

The image below from our AIQ platform shows an example of ChatGPT referencing a deleted Wikipedia page:

Here is an actual ChatGPT screenshot:

When you click on the link from ChatGPT, you get:

And here you can see that the Wikipedia article was deleted on May 19th:
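For readers who want to verify this kind of deletion themselves, Wikipedia's deletion log can be queried directly through the public MediaWiki API. Here is a minimal sketch in Python (the page title below is a placeholder, not the article from our analysis):

```python
import requests

# Minimal sketch: look up deletion-log entries for a Wikipedia page
# via the public MediaWiki API. "Example page" is a placeholder title.
API = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",
    "list": "logevents",
    "letype": "delete",         # restrict results to the deletion log
    "letitle": "Example page",  # placeholder: the page title to check
    "lelimit": 10,
    "format": "json",
}
resp = requests.get(API, params=params, headers={"User-Agent": "deletion-check/0.1"})
resp.raise_for_status()

for event in resp.json()["query"]["logevents"]:
    # Each entry records when the page was deleted (or restored) and by whom.
    print(event["timestamp"], event["action"], event.get("user"), event.get("comment", ""))
```

If the log returns a delete action with a timestamp, the page really is gone from the live site, even while a trained-in copy may still circulate inside AI models.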

2. The Indexing Debate: Google, Bing… or Something Else?

There has been a lot of discussion online about whether ChatGPT and similar tools rely on Google’s or Bing’s index to answer questions about current topics.

Microsoft has described Bing as ChatGPT’s “search partner,” suggesting that when the model browses the web in real time, it’s doing so through Bing’s infrastructure. Others assume that since Google dominates the indexing landscape, much of the information must come from its dataset.

Our finding suggests a third possibility.

AI systems like ChatGPT don’t just rely on live search indices — they also draw on their own internal knowledge, built from snapshots of the web taken during training. That means they may reference pages that have since disappeared, been updated, or were never fully visible to search engines at all.

For example, here ChatGPT cites the HMS Totnes Wikipedia page:


However, a Google search for that Wikipedia page shows that it hasn’t been indexed:
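One common reason a live Wikipedia page never makes it into Google’s index is a robots “noindex” signal, which Wikipedia applies to drafts and, in some cases, to new or unpatrolled articles. A rough way to check for it is sketched below; it assumes the page title is simply “HMS Totnes” and only inspects the served HTML, not robots.txt or HTTP headers:

```python
import requests

# Rough check: does a Wikipedia page carry a robots "noindex" hint?
# The URL assumes the page title is "HMS Totnes"; swap in any page you like.
url = "https://en.wikipedia.org/wiki/HMS_Totnes"
resp = requests.get(url, headers={"User-Agent": "index-check/0.1"})
resp.raise_for_status()
html = resp.text.lower()

if 'name="robots"' in html and "noindex" in html:
    print("The page asks search engines not to index it.")
else:
    print("No noindex meta tag found in the served HTML.")
```

A quick manual alternative is a site: query in Google, for example site:en.wikipedia.org "HMS Totnes".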

Unlike Google or Bing, which are constantly re-indexing the live web, AI models often retain information indefinitely. They may “remember” content that was once public but has long since vanished — effectively creating a parallel version of the internet that exists only inside the model.

This raises a fascinating question:
If AI can access and resurface information that’s been deleted or de-indexed, what does “control” over your online reputation really mean?

3. The Reputation Implications

For years, Wikipedia has been one of the most powerful determinants of how brands and individuals appear online. It influences Google Knowledge Panels, affects trust signals, and often shapes media narratives.

Because of that, many organizations have worked diligently to ensure that their Wikipedia entries are accurate, balanced, and aligned with verifiable facts.  But the assumption has always been: once an error is corrected or a page is deleted, the outdated version fades from public view. In the age of AI, that’s no longer guaranteed.

If ChatGPT or another AI system has already learned from a previous version of a Wikipedia page, that information may continue to influence its responses — even if the page no longer exists. This creates a temporal lag between what’s true now and what AI believes to be true, based on past data.

That lag can have real consequences:

  • A company that successfully removes an inaccurate Wikipedia claim may still see it resurface in AI summaries.
  • An individual whose biography was corrected might find outdated information repeated in generative search results.
  • A brand that relies on Wikipedia for credibility could see outdated or partial content influencing how AI describes it to users.

This represents a profound shift in the mechanics of reputation. Reputation is no longer defined solely by what’s visible online — it’s also shaped by what AI remembers.

This raises an important question: if outdated Wikipedia content can continue to surface in AI responses, is there still value in correcting or deleting a page? The answer is yes — but with new strategic considerations. When an updated version of a page is published, AI systems that re-crawl or refresh their training data are more likely to replace older information with the corrected version. And even if outdated details aren’t fully overwritten everywhere, ensuring that accurate, high-quality content exists increases the probability that AI models will surface the correct version in most contexts. In other words, maintaining an accurate Wikipedia presence still matters — it just operates within a more complex, probabilistic AI ecosystem.

4. Managing Reputation in the Age of AI Memory

So, what should brands and communicators do in this new landscape?

First, recognize that deletion isn’t disappearance.
Once information has entered the public digital ecosystem — especially on high-visibility platforms like Wikipedia — it may continue to circulate in AI models long after being removed. That makes proactive accuracy and clarity even more important. Fixing misinformation quickly reduces the risk that it becomes “baked in” to an AI’s memory.

Second, expand your visibility monitoring beyond search.
Traditional SEO and reputation tools focus on what appears in Google results. But as AI platforms like ChatGPT, Perplexity, and Google’s own AI Overviews become more popular, brands need to track how they’re represented there, too. Five Blocks’ AIQ Snapshot, for example, measures how companies appear across leading AI platforms — providing early warning when narratives diverge from reality.
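As a simple illustration of what this kind of monitoring can look like under the hood (a generic sketch, not how AIQ Snapshot works; it assumes an OpenAI API key, and the brand name and claims are hypothetical), you can periodically ask a model about your brand and scan the answer for statements you know to be outdated:

```python
from openai import OpenAI

# Generic sketch: ask a model what it "knows" about a brand and flag
# known-outdated claims in the answer. Brand and claims are placeholders.
client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

BRAND = "Example Corp"
OUTDATED_CLAIMS = ["former CEO Jane Doe", "2019 product recall"]  # hypothetical

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"What do you know about {BRAND}?"}],
)
answer = response.choices[0].message.content

flagged = [c for c in OUTDATED_CLAIMS if c.lower() in answer.lower()]
print(answer)
print("Tracked outdated claims surfaced:", flagged if flagged else "none")
```

Run on a schedule and across several models and prompts, even a simple check like this provides an early-warning signal when the AI narrative drifts from the current facts.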

Third, view Wikipedia through an AI lens.
Wikipedia remains one of the most influential data sources in the world — not just for humans, but for machines. Maintaining accuracy, neutrality, and completeness there matters more than ever, because what’s written (or was once written) can echo through AI systems long after the page itself has changed.

Finally, stay vigilant about every change that happens on your Wikipedia pages.
Even small edits — a phrasing shift, a new citation, an added controversy — can ripple into AI systems that use Wikipedia as a core reference. That makes real-time monitoring essential. Tools like Five Blocks’ WikiAlerts™ notify brands the moment a page is edited, enabling rapid review and response before inaccurate or biased information spreads or becomes part of an AI model’s reference set. In the age of AI memory, staying updated isn’t just good Wikipedia hygiene — it’s a critical layer of reputation protection.
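For a sense of the underlying mechanics (again, a minimal sketch rather than the WikiAlerts™ product itself), the public MediaWiki API exposes each page’s latest revision, which can be polled for new edits:

```python
import time
import requests

# Minimal sketch: poll a Wikipedia page's latest revision and report new edits.
# The page title is a placeholder; substitute the article you care about.
API = "https://en.wikipedia.org/w/api.php"
TITLE = "Example page"
last_seen = None

while True:
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": TITLE,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": 1,
        "format": "json",
    }
    data = requests.get(API, params=params,
                        headers={"User-Agent": "wiki-watch/0.1"}).json()
    page = next(iter(data["query"]["pages"].values()))
    rev = page.get("revisions", [{}])[0]

    if rev.get("revid") and rev["revid"] != last_seen:
        if last_seen is not None:  # skip the first, baseline read
            print(f"New edit by {rev.get('user')} at {rev['timestamp']}: {rev.get('comment', '')}")
        last_seen = rev["revid"]

    time.sleep(300)  # check again every five minutes
```

Wikimedia also publishes a real-time recent-changes feed (EventStreams) for push-style monitoring instead of polling.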

5. The Takeaway: Reputation in a Post-Search World

The discovery that ChatGPT can cite deleted or unindexed Wikipedia pages is more than a technical curiosity. It’s a signal that we’re entering a post-search era — one where visibility and influence extend beyond the reach of traditional SEO and content control.

Search engines like Google and Bing show us what’s out there today. AI models, in contrast, reveal what’s still in there — the accumulated memory of the internet, with all its imperfections, edits, and ghosts of pages past.

For communicators and reputation professionals, this is both a challenge and an opportunity. It’s a reminder that reputation isn’t static, and it isn’t limited to what’s live. It’s a living narrative, shaped by both current content and the digital traces left behind.

At Five Blocks, we believe the next chapter of reputation management lies in understanding and influencing that AI layer — ensuring that when machines summarize who you are, they get the story right.  

Curious whether ChatGPT is pulling from non-indexed or deleted pages that could be influencing your brand narrative? Get an AIQ Snapshot now!
