What Grokipedia Really Says About the Fortune 100
As tools like Grokipedia begin to sit alongside Wikipedia, search results, and AI assistants, they are no longer just reflecting public information; they are actively shaping it. To understand what this shift means in practice, we analyzed how Grokipedia represents the Fortune 100. We dug into nearly 19,000 sources to see where the information comes from, what gets emphasized, and how this new, AI-curated model differs from the human-edited standard brands have relied on for years.
Why Grokipedia matters
Grokipedia is making a bold wager: replacing Wikipedia’s human editors with algorithmic curation. Instead of volunteer consensus and strict sourcing rules, Grokipedia relies on AI systems to decide what information matters and which sources are credible.
That shift raises an important question for brands and communicators:
What actually happens to corporate narratives when AI, not humans, becomes the editor?
The big picture: more sources, different priorities
Grokipedia articles are not necessarily longer than their Wikipedia counterparts. But they are far more heavily sourced.
- Average sources per Fortune 100 company: 187
- Highest count: Apple, with 359 sources
This volume alone signals something important: Grokipedia’s model values aggregation and breadth over editorial restraint. But the real story lies in where those sources come from.
Where Grokipedia gets its information
Across all Fortune 100 articles, sources break down as follows:
- Corporate sources: 34.5% (6,477 citations)
- Company websites: 27.2% (5,112 citations)
- News media: 16.3% (3,050 citations)
- Financial / Academic / Trade: 7.5% (1,408 citations)
- Government: 7.0% (1,319 citations)
- Social media: 1.8% (335 citations)
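The shares above can be sanity-checked against the implied total. A quick sketch, assuming the overall citation count is roughly 187 sources × 100 companies ≈ 18,700 (the exact total isn't published, so computed shares differ from the article's by a few tenths of a point):

```python
# Citation counts per category, as reported in the article.
counts = {
    "Corporate sources": 6477,
    "Company websites": 5112,
    "News media": 3050,
    "Government": 1319,
    "Financial / Academic / Trade": 1408,
    "Social media": 335,
}

# Assumed total: 187 average sources x 100 companies ("nearly 19,000").
total_citations = 187 * 100

for category, n in counts.items():
    print(f"{category}: {n / total_citations:.1%}")

# The named categories cover roughly 95% of citations; the remainder
# falls outside these buckets (e.g. uncategorized domains).
categorized = sum(counts.values())
print(f"Categorized share: {categorized / total_citations:.1%}")
```

Note that the listed categories do not sum to 100%; a small residual share of citations is uncategorized.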
This breakdown marks a major departure from Wikipedia, which treats corporate and self-published sources as inherently suspect. Grokipedia does the opposite.
The most-cited domains across the Fortune 100
Here are the top 10 domains cited across all Fortune 100 Grokipedia pages:
- Reuters (474)
- Yahoo (382)
- SEC.gov (280)
- New York Times (248)
- CNBC (200)
- Macrotrends (185)
- Forbes (150)
- Lockheed Martin (company website) (125)
- Justice.gov (122)
- Published reports (115)
What surprised us most
A company website cracked the top 10 overall.
Lockheed Martin’s corporate site appears 125 times across Grokipedia, but not because it’s cited broadly: Grokipedia’s Lockheed Martin page itself relies heavily on the company’s own materials.
Even more striking:
- TD Synnex sources 60.8% of its Grokipedia citations from its own website
This represents a fundamental break from Wikipedia’s editorial philosophy. Where Wikipedia views corporate sites as biased and secondary, Grokipedia treats them as primary sources of truth.
Social media: present, but marginal
Early critics worried Grokipedia would overweight X (formerly Twitter). The data tells a different story.
- Social media accounts for under 2% of all citations
- LinkedIn leads with 75 citations
- Facebook follows with 40
- Reddit appears 38 times
- X.com appears just 9 times
Despite Grokipedia’s shared ownership with X, social platforms are not driving corporate narratives at scale.
Press releases are fair game
Another sharp contrast with Wikipedia:
- PR Newswire ranks #7 among news sources, with 95 citations
Wikipedia’s sourcing guidelines treat press releases as self-published material that is rarely acceptable as a reference. Grokipedia treats them as legitimate reference material, giving companies a direct channel into AI-generated “encyclopedic” content.
What this means for communications teams
The shift from human-curated to AI-curated knowledge changes the playbook.
1. Your corporate website now matters far beyond branded search
It is no longer just a destination—it can become the primary source AI systems use to describe your company across countless queries.
2. Government filings carry outsized influence
SEC.gov (#3) and Justice.gov (#9) rank among the most-cited domains. AI systems read and index everything, including filings, enforcement actions, and regulatory language.
3. Earned media still matters—but the mix is changing
Reuters and legacy outlets dominate, but trade publications and financial data aggregators (440 citations combined) remain highly influential.
4. Wikipedia is no longer the single source of record
Fixing Wikipedia is no longer sufficient. Grokipedia and similar AI-driven platforms are building parallel narratives, governed by entirely different editorial logic.
Explore the data yourself
We’ve made the full interactive dashboard available here so you can explore Grokipedia sourcing patterns company by company.
Beyond Deletion: What ChatGPT’s Use of Hidden Wikipedia Pages Reveals About AI Reputation
For years, Google has been the ultimate arbiter of online visibility. If a page didn’t appear in Google’s index, it effectively didn’t exist in the public eye. Brands, communicators, and reputation managers learned to play by Google’s rules — optimizing what could be found, fixing what was misleading, and deleting what was outdated.
But artificial intelligence is rewriting those rules in real time.
Recently, we made a surprising discovery that raises new questions about how AI systems like ChatGPT access and represent information — and what that means for brands. ChatGPT cited Wikipedia pages that not only weren’t indexed by Google but, in some cases, had been deleted from Wikipedia entirely months earlier.
In other words: ChatGPT appears to be referencing information that no longer exists on the open web.
This finding, while seemingly small, points to a much larger shift. It suggests that ChatGPT operates from a different kind of index — one not governed by Google or Bing, but by the model’s own memory and training data. And that has profound implications for online reputation.
1. The Discovery
At Five Blocks, we regularly analyze how information about companies and individuals appears across platforms — from search results to knowledge panels to AI-generated summaries.
During one of these analyses, we noticed something odd: ChatGPT was citing a Wikipedia page about a company that had been deleted from Wikipedia months earlier. Even more surprising, the page had never been indexed by Google — likely due to Wikipedia’s internal restrictions on certain pages and drafts.
In short, ChatGPT seemed to know something that should have been impossible for it to know.
When we tested further, we found similar examples. In some cases, ChatGPT referenced archived or draft Wikipedia pages that were not accessible through normal search.
This means ChatGPT’s knowledge base includes content that is invisible to both users and search engines — a sort of ghost archive of the internet.
The image below from our AIQ platform shows an example of ChatGPT referencing a deleted Wikipedia page:
Here is an actual ChatGPT screenshot:
When you click on the link from ChatGPT, you get:
And, here you can see that the Wikipedia article was deleted on May 19th:
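A deletion like the one above can also be confirmed programmatically. The sketch below uses the public MediaWiki Action API (the endpoint and parameters are standard; the page title is only an illustrative placeholder, not part of our methodology):

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def build_exists_query(title: str) -> str:
    """URL asking whether `title` is a live article on English Wikipedia."""
    params = {"action": "query", "titles": title, "format": "json"}
    return API + "?" + urllib.parse.urlencode(params)

def build_deletion_log_query(title: str) -> str:
    """URL for the page's deletion-log entries (when and why it was deleted)."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",
        "letitle": title,
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)

def page_exists(title: str) -> bool:
    """Live lookup: deleted or never-created pages come back flagged 'missing'."""
    with urllib.request.urlopen(build_exists_query(title)) as resp:
        pages = json.load(resp)["query"]["pages"]
    return "missing" not in next(iter(pages.values()))

# Usage (performs a live network call):
#   page_exists("HMS Totnes")
```

Pairing the existence check with the deletion log is enough to distinguish a page that was removed from one that never existed.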
2. The Indexing Debate: Google, Bing… or Something Else?
There’s been a lot of discussion online about whether ChatGPT (and similar tools) rely on Google’s or Bing’s index to answer questions about current topics.
Microsoft has described Bing as ChatGPT’s “search partner,” suggesting that when the model browses the web in real time, it’s doing so through Bing’s infrastructure. Others assume that since Google dominates the indexing landscape, much of the information must come from its dataset.
Our finding suggests a third possibility.
AI systems like ChatGPT don’t just rely on live search indices — they also draw on their own internal knowledge, built from snapshots of the web taken during training. That means they may reference pages that have since disappeared, been updated, or were never fully visible to search engines at all.
For example, here ChatGPT cites the HMS Totnes Wikipedia page:
However, a Google search for that Wikipedia page shows that it hasn’t been indexed:
Unlike Google or Bing, which are constantly re-indexing the live web, AI models often retain information indefinitely. They may “remember” content that was once public but has long since vanished — effectively creating a parallel version of the internet that exists only inside the model.
This raises a fascinating question:
If AI can access and resurface information that’s been deleted or de-indexed, what does “control” over your online reputation really mean?
3. The Reputation Implications
For years, Wikipedia has been one of the most powerful determinants of how brands and individuals appear online. It influences Google Knowledge Panels, affects trust signals, and often shapes media narratives.
Because of that, many organizations have worked diligently to ensure that their Wikipedia entries are accurate, balanced, and aligned with verifiable facts. But the assumption has always been: once an error is corrected or a page is deleted, the outdated version fades from public view. In the age of AI, that’s no longer guaranteed.
If ChatGPT or another AI system has already learned from a previous version of a Wikipedia page, that information may continue to influence its responses — even if the page no longer exists. This creates a temporal lag between what’s true now and what AI believes to be true, based on past data.
That lag can have real consequences:
- A company that successfully removes an inaccurate Wikipedia claim may still see it resurface in AI summaries.
- An individual whose biography was corrected might find outdated information repeated in generative search results.
- A brand that relies on Wikipedia for credibility could see outdated or partial content influencing how AI describes it to users.
This represents a profound shift in the mechanics of reputation. Reputation is no longer defined solely by what’s visible online — it’s also shaped by what AI remembers.
This raises an important question: if outdated Wikipedia content can continue to surface in AI responses, is there still value in correcting or deleting a page? The answer is yes — but with new strategic considerations.

When an updated version of a page is published, AI systems that re-crawl or refresh their training data are more likely to replace older information with the corrected version. And even if outdated details aren’t fully overwritten everywhere, ensuring that accurate, high-quality content exists increases the probability that AI models will surface the correct version in most contexts. In other words, maintaining an accurate Wikipedia presence still matters — it just operates within a more complex, probabilistic AI ecosystem.
4. Managing Reputation in the Age of AI Memory
So, what should brands and communicators do in this new landscape?
First, recognize that deletion isn’t disappearance.
Once information has entered the public digital ecosystem — especially on high-visibility platforms like Wikipedia — it may continue to circulate in AI models long after being removed. That makes proactive accuracy and clarity even more important. Fixing misinformation quickly reduces the risk that it becomes “baked in” to an AI’s memory.
Second, expand your visibility monitoring beyond search.
Traditional SEO and reputation tools focus on what appears in Google results. But as AI platforms like ChatGPT, Perplexity, and Google’s own AI Overviews become more popular, brands need to track how they’re represented there, too. Five Blocks’ AIQ Snapshot, for example, measures how companies appear across leading AI platforms — providing early warning when narratives diverge from reality.
Third, view Wikipedia through an AI lens.
Wikipedia remains one of the most influential data sources in the world, not just for humans but for machines. Maintaining accuracy, neutrality, and completeness there matters more than ever, because what’s written (or was once written) can echo through AI systems long after the page itself has changed.
Finally, stay vigilant about every change that happens on your Wikipedia pages.
Even small edits — a phrasing shift, a new citation, an added controversy — can ripple into AI systems that use Wikipedia as a core reference. That makes real-time monitoring essential. Tools like Five Blocks’ WikiAlerts™ notify brands the moment a page is edited, enabling rapid review and response before inaccurate or biased information spreads or becomes part of an AI model’s reference set. In the age of AI memory, staying updated isn’t just good Wikipedia hygiene — it’s a critical layer of reputation protection.
5. The Takeaway: Reputation in a Post-Search World
The discovery that ChatGPT can cite deleted or unindexed Wikipedia pages is more than a technical curiosity. It’s a signal that we’re entering a post-search era — one where visibility and influence extend beyond the reach of traditional SEO and content control.
Search engines like Google and Bing show us what’s out there today. AI models, in contrast, reveal what’s still in there — the accumulated memory of the internet, with all its imperfections, edits, and ghosts of pages past.
For communicators and reputation professionals, this is both a challenge and an opportunity. It’s a reminder that reputation isn’t static, and it isn’t limited to what’s live. It’s a living narrative, shaped by both current content and the digital traces left behind.
At Five Blocks, we believe the next chapter of reputation management lies in understanding and influencing that AI layer — ensuring that when machines summarize who you are, they get the story right.
Curious whether ChatGPT is pulling from non-indexed or deleted pages that could be influencing your brand narrative? Get an AIQ Snapshot now!