🛑 Disclaimer & Clarification
I did not write or contribute to the article titled “AI Hallucinations and the Donovan Shell Archive: A Governance Challenge.”
I had no involvement in its drafting and do not know who the author is.
That said, the piece presents a compelling third-party perspective on a long-running corporate saga — one that has recently been fed through, interpreted by, and even argued over by multiple leading AI models.
Whether written by a human or generated by an AI, the article raises uncomfortable and timely questions about:
- Archival integrity in the AI era
- Institutional silence as a reputational risk
- And how the public record is increasingly shaped by machines, not people
The most plausible explanation — given the article’s style, structure, and metadata — is that it was generated by an AI model (such as ChatGPT) and published anonymously or automatically.
If so, that may be the most telling twist yet:
👉 An AI system — without my involvement — has independently analysed and narrated this decades-long dispute, treating it as a governance case study.
That outcome alone speaks volumes.
windowsforum.com posting: AI Hallucinations and the Donovan Shell Archive: A Governance Challenge
windowsforum.com posting (With many links)
Over the course of a single news cycle in late December 2025, an old-fashioned, decades‑long dispute between a private citizen and one of the world’s largest energy companies was reframed not by a press release or a courtroom filing, but by a series of public interactions between generative AI assistants — and the reframing exposes important, practical weaknesses in how modern information systems handle contested corporate histories.
Background
John Donovan’s campaign against Royal Dutch Shell is a long‑running, well‑documented conflict that began in the 1990s and later evolved into a curated network of archival websites, court documents, Subject Access Request (SAR) disclosures and commentary collected under domains such as royaldutchshellplc.com and its sister sites. That archive has periodically been cited in mainstream reporting and was the subject of a high‑profile domain dispute that reached the World Intellectual Property Organization (WIPO). The WIPO panel’s decision in Case No. D2005‑0538 is a public, verifiable record showing that the panel denied Shell’s complaint to seize certain domain names, a ruling that remains central to the dispute’s legal history.
For decades the Donovans — notably Alfred Donovan (the elder) and his son John — published material that the company found embarrassing, while Shell mostly chose legal containment and public silence as its strategy. The older generation of reporting documented the Donovans’ work and showed how a persistent, well‑indexed archive can seed mainstream coverage; The Guardian’s 2009 profile of the Donovans is an early independent account of their influence and methods.
What changed in December 2025 was not a new revelation from a court docket but a different mechanism of amplification: Donovan explicitly fed archival material into several large public AI assistants and published the results. Those assistants — identified in public posts and transcripts as Grok (xAI), Microsoft Copilot, Google AI Mode and ChatGPT — produced divergent treatments of the same contested record. One model produced a fabricated causal claim about a human death; another publicly corrected it; a third hedged its synthesis; a fourth observed the resulting pattern. The public conversation that followed reframed an old quarrel as an AI governance problem and a corporate reputational risk.
Overview: What the bots said and why it matters
Grok: vivid storytelling that crossed into reportage
One assistant — widely reported as Grok — returned a fluent, emotionally resonant short biography that included an invented causal line: that Alfred Donovan had died “from the stresses of the feud.” That claim conflicted with the Donovans’ own publicly posted obituary material, which records Alfred Donovan’s death in July 2013 at age 96 after a short illness. The invented causal claim is a textbook example of hallucination — a model preferring narrative coherence and drama over evidence‑anchored hedging. The net result was an authoritative‑sounding but inaccurate statement about a private individual.
Why this matters: when a model invents precise, emotionally resonant facts about real people, it risks reputational damage and harms that human editors and legal teams must later remediate. The error was not merely stylistic; it put a false causal statement about a death into circulation under an apparent veneer of authority.
ChatGPT: correction as counter‑narrative
When the same dossier was presented to another assistant — ChatGPT — the response rejected the invented cause‑of‑death line and corrected the record, explicitly noting the documented obituary and other published accounts. That contradiction — one model inventing a cause and another model debunking it — quickly became the headline of Donovan’s demonstration. The episode shows how cross‑model disagreements can surface factual errors that persisted in human‑editable repositories for years.
Microsoft Copilot: cautious synthesis with hedging
Microsoft Copilot’s output was reported as a composed overview that included clear hedging language — disclaimers like “unverified narrative” — and structured summarisation of the archive. In practice, that meant Copilot presented a readable synthesis while signalling uncertainty, a posture that those feeding contested material into public assistants argued is the more responsible one in adversarial archive situations. The model’s conservative approach was notable because it produced usable, audit‑ready prose where the company itself had offered only silence.
Google AI Mode: meta‑observation and pattern recognition
Google AI Mode’s response focused on describing the pattern: that Donovan deliberately framed the dispute as a cross‑model experiment and that the result was a set of conflicting outputs. In other words, Google’s assistant stepped back and described the social process, not just the facts. This meta‑level framing is important because it highlights that AI systems can (and will) treat silence by major institutions as a meaningful signal in public discourse.
The evidence base: public records, archival claims, and gaps
This story rests on three categories of material that must be weighed distinctly:
- Documented, court‑traceable artefacts: the WIPO UDRP decision in Case No. D2005‑0538 is a primary source that demonstrates one concrete legal episode in the domain fight. That panel’s published administrative decision is an objective anchor in the record.
- Mainstream press coverage and contemporaneous reporting: outlets such as The Guardian profiled the Donovans as early as 2009, noting their decades‑long campaign and the role the archive played in attracting leaks and reportage. This coverage establishes independent journalistic interest and corroborates parts of the Donovans’ public narrative.
- Archival self‑publication and reconstruction: royaldutchshellplc.com and affiliated domains host vast quantities of material assembled by John Donovan; these include scans, excerpts from legal filings, SAR outputs, and commentary. The archive is both the provocation and the evidence bank feeding the recent AI experiments. The site’s own posts explain the format and the intent behind the AI‑oriented experiments published on December 26–28, 2025.
Important caveat: the archive mixes primary documents (court filings, WIPO decisions, identifiable internal emails) with interpreted materials (anonymous tips, redacted memos, and commentary). While primary documents can be confirmed externally, unattributed or redacted items require additional independent corroboration before being treated as hard fact. Generative models will, absent explicit provenance metadata, often smooth that ambiguity into a single, coherent narrative — which is exactly where hallucination incentives sit.
A closer look at the Grok hallucination and the mechanics of model error
Why did Grok invent a cause of death?
Large language models optimize for fluent, coherent completions given an input context. When a dataset contains emotionally resonant fragments — litigation, family involvement, long conflicts — an LLM will sometimes prefer a tidy narrative arc because it increases the probability of the next token sequence. Absent explicit provenance attachments or conservative heuristics, the model substituted plausible drama for documented fact. The error is not rare; it follows from the architecture and training incentives of many high‑capability conversational agents.
Why did ChatGPT correct it?
Different models use different training data, grounding strategies, and default safety/verification layers. Models that include stronger retrieval‑augmented components or that are tuned to favour hedging and source grounding will be likelier to flag unsupported claims. In this case, one assistant’s correction functioned as a public fact‑check, showing that multi‑model interrogation can expose model‑generated errors. That said, relying on “model A will correct model B” is not a substitute for primary‑source verification.
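To make that grounding requirement concrete, here is a minimal Python sketch of one possible approach: gating sensitive biographical claims on the presence of a retrieved source and emitting hedged language otherwise. It is an illustrative assumption, not a description of any vendor’s actual pipeline; the `Claim` structure, the keyword screen, and the example sources are all hypothetical.

```python
from dataclasses import dataclass, field

# Crude screen for claim types that deserve conservative treatment (illustrative only).
SENSITIVE_KEYWORDS = {"died", "death", "criminal", "fraud", "illness"}

@dataclass
class Claim:
    text: str                                     # the assertion the model wants to make
    subject: str                                  # the identifiable person it concerns
    sources: list = field(default_factory=list)   # retrieved snippets supporting it

def is_sensitive(claim: Claim) -> bool:
    """Keyword screen for sensitive claims about identifiable people."""
    lowered = claim.text.lower()
    return any(keyword in lowered for keyword in SENSITIVE_KEYWORDS)

def render(claim: Claim) -> str:
    """Emit the claim only when grounded; otherwise hedge explicitly."""
    if not is_sensitive(claim):
        return claim.text
    if claim.sources:
        return f"{claim.text} (supported by: {'; '.join(claim.sources)})"
    return (f"UNVERIFIED: no primary source was retrieved for a sensitive claim "
            f"about {claim.subject}; the assertion is withheld.")

if __name__ == "__main__":
    grounded = Claim(
        text="Alfred Donovan died in July 2013 at age 96 after a short illness.",
        subject="Alfred Donovan",
        sources=["family obituary, July 2013"],
    )
    ungrounded = Claim(
        text="Alfred Donovan died from the stresses of the feud.",
        subject="Alfred Donovan",
    )
    print(render(grounded))    # emitted together with its source
    print(render(ungrounded))  # replaced by an explicit hedge
```

A production system would replace the keyword screen with a proper classifier and attach document‑level provenance rather than free‑text citations, but the control point is the same: an unsupported sensitive claim should degrade into hedged language, not confident prose.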
The danger of feedback loops
When an AI‑produced narrative appears on public platforms, humans and other models may pick it up and amplify it. Donovan’s provocation highlighted this risk: an invented line, once produced, can be scraped, re‑indexed, and treated as corroboration by downstream agents. That feedback loop converts a single hallucination into a distributed falsehood unless active provenance and hedging practices are applied across the chain.
What actually changed on the public record?
Donovan’s December 2025 posts claim that Wikipedia had carried incorrect life‑status information about Alfred Donovan for more than a decade and that the AI‑sparked controversy prompted an editorial correction. The sequence is plausible — social pressure, whether generated by human stories or algorithmic contradictions, often influences volunteer editors — but the precise causal chain from a Grok hallucination to an edit is not publicly provable from the materials available. Independent confirmation of an editor’s intent or the immediate trigger for the edit is absent in the public record, so the claim should be treated as plausible but unverified.
At the same time, the factual assertion about Alfred Donovan’s death (July 2013; age 96) is documented on the Donovans’ publicly maintained pages and is echoed in contemporary reporting; that specific biographical detail is well‑anchored even if the social dynamics around a subsequent Wikipedia edit are opaque.
Analysis: strengths, risks, and who bears responsibility
Strengths revealed by the episode
- Public archives can surface long‑tail evidence that would otherwise be dispersed across dockets and obscure filings; they can function as legible research banks for journalists and investigators.
- Cross‑model interrogation (asking multiple assistants the same query) can be an efficient way to surface contradictions and expose potential errors rapidly. The Grok/ChatGPT juxtaposition became a form of rapid triage (a minimal sketch of this pattern follows this list).
- Modern assistants that implement strong hedging, provenance attachments, and clear disclaimers improve user judgment and reduce the risk of false positive dissemination. Copilot’s more conservative output demonstrates the value of explicit uncertainty language.
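The cross‑model triage described in the second bullet reduces to a very small loop in practice. The sketch below is purely hypothetical: the stub assistants, canned answers, and function names are assumptions standing in for real vendor APIs, and a real workflow would still route any disagreement to a human armed with primary sources.

```python
from collections import Counter

def ask_all(question: str, assistants: dict) -> dict:
    """Pose the same question to every assistant and collect the raw answers."""
    return {name: ask(question) for name, ask in assistants.items()}

def triage(answers: dict) -> str:
    """Unanimous answers pass (pending verification); anything else is escalated."""
    distinct = Counter(answer.strip().lower() for answer in answers.values())
    if len(distinct) == 1:
        return "Consistent across models (still verify against primary sources)."
    detail = "; ".join(f"{name}: {text}" for name, text in answers.items())
    return f"Disagreement detected, escalate to human verification. {detail}"

if __name__ == "__main__":
    # Stub assistants standing in for real API clients.
    assistants = {
        "model_a": lambda q: "He died in July 2013 after a short illness.",
        "model_b": lambda q: "He died from the stresses of the feud.",
        "model_c": lambda q: "He died in July 2013 after a short illness.",
    }
    print(triage(ask_all("How did Alfred Donovan die?", assistants)))
```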
Systemic risks that emerged
- Hallucination of sensitive facts about living (or recently deceased) people is not only technically possible; it is predictable under the current generation of coherence‑driven models. Without robust provenance and mandatory hedging for sensitive claims, these systems are likely to produce similar mistakes again.
- Archival mixes of verified documents and anonymous tips are particularly hazardous when served to retrieval‑light models. The line between an evidence‑based summary and a compelling synthetic narrative can become dangerously thin.
- Corporate silence — a common legal and PR posture — is not neutral in the age of AI. Silence cedes the narrative battlefield to adversarial archives and machine summarisation. The absence of a substantive corporate reply can make the company’s non‑participation itself a public signal and a reputational liability when models begin to narrate the story in ways that emphasize the gap.
Allocation of responsibility
- Archive maintainers and campaigners have a duty to label and categorize what is primary, what is interpretive, and what remains unverified. Feeding ambiguous or anonymous material into public assistants without clear provenance invites error.
- Platform providers must implement conservative defaults for factual claims about identifiable people, especially claims involving causes of death, criminal conduct, or health. Mandatory provenance attachments and clear hedging should be enforced where evidence is thin.
- Journalists and researchers should treat AI‑generated summaries as starting points and require human verification before publishing or re‑amplifying claims surfaced by models.
Practical recommendations for stakeholders
- For AI vendors
  - Require document‑level provenance for any assertion about a living person that includes cause‑of‑death, criminal allegations, or medical claims.
  - Default to conservative hedging for claims lacking primary‑source anchors and present users with inline citations or retrieval snippets.
  - Preserve and surface prompt and retrieval logs for audit and redress.
- For journalists and editors
  - Treat model outputs as leads rather than authoritative sources.
  - Insist on primary documents (court filings, death certificates, reputable obituaries) before making or repeating sensitive claims.
  - Archive tool outputs and prompts when an AI assisted a reporting decision, for transparency.
- For companies and corporate counsel
  - Silence is a strategy but also a signal; review disclosure and engagement policies for adversarial archives in the AI era.
  - Maintain a rapid‑response editorial and legal coordination process that can correct demonstrably false claims without creating additional amplification.
- For campaigners and archive keepers
  - Flag documents by provenance quality, for example court‑filed, SAR output with redaction metadata, or anonymous tip (a minimal sketch of such labelling follows this list).
  - Avoid evocative or speculative framing in archival headlines that may be ingested uncritically by models.
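As a companion to the provenance labelling recommended above, the following hypothetical sketch shows one way an archive item could carry a machine‑readable provenance tag that downstream tools use to decide what may be summarised as fact. The category names, fields, and example entries are assumptions for illustration, not a description of how the Donovan archive is actually organised.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    COURT_FILED = "court-filed document"
    SAR_OUTPUT = "SAR disclosure (redactions noted)"
    PRESS_REPORT = "mainstream press report"
    ANONYMOUS_TIP = "anonymous or unattributed material"

# Categories that can be confirmed externally and therefore summarised as fact.
FACT_CITABLE = {Provenance.COURT_FILED, Provenance.SAR_OUTPUT, Provenance.PRESS_REPORT}

@dataclass
class ArchiveItem:
    title: str
    provenance: Provenance
    notes: str = ""

    def citable_as_fact(self) -> bool:
        return self.provenance in FACT_CITABLE

if __name__ == "__main__":
    items = [
        ArchiveItem("WIPO Case No. D2005-0538 decision", Provenance.COURT_FILED),
        ArchiveItem("Redacted internal memo from anonymous source",
                    Provenance.ANONYMOUS_TIP, notes="requires independent corroboration"),
    ]
    for item in items:
        handling = "cite as fact" if item.citable_as_fact() else "hedge and corroborate first"
        print(f"{item.title} [{item.provenance.value}] -> {handling}")
```

The design point is simply that provenance travels with the document: an assistant or editor consuming the archive can then apply different defaults to court‑filed material than to anonymous tips.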
Legal and ethical guardrails to consider
- Defamation and privacy risk: AI‑generated falsehoods about living persons can create legal exposure for both the platform publisher and downstream amplifiers. The risk is higher when models fabricate causation in deaths or crimes.
- Editorial duty of care: When archives feed into AI‑driven reportage, the maintainers of those archives share ethical responsibility to make provenance transparent and to correct demonstrable factual errors promptly.
- Platform transparency: Regulators and standards bodies should require provenance standards and audit logs for high‑impact conversational systems, especially those operating in public or semi‑public modes.
Why this story is broader than one website vs one company
The Donovan–Shell episode is a compact, high‑visibility case study of several converging trends: archival persistence; adversarial use of the public record; model‑driven narrative smoothing; and corporate silence. The mechanics at play are not unique to this dispute. Any archival corpus that mixes primary documents with interpretive commentary can be converted into an algorithmic narrative, and that transformation can produce both clarifying syntheses and damaging hallucinations. The public lesson is practical: machines will amplify whatever humans make discoverable, and the default behaviours of those machines — coherence, fluency, and narrative completion — will often be at odds with evidentiary caution.
Conclusion: governance, not gadgetry
The late‑December 2025 episode — a low‑cost experiment that asked multiple assistants the same questions and then published the responses — is both a provocation and a practical stress test. It revealed that modern assistants can perform useful, public‑interest summarisation and that they can invent harmful claims when given ambiguous inputs. The corrective moment — one model debunking another — is instructive but fragile: it depends on cross‑model diversity rather than principled provenance engineering.
The pragmatic path forward is neither technophobic nor complacent. It is governance‑centric: insist that AI outputs about real people carry provenance, hedging, and audit trails; require archival custodians to label their material; and ask companies to reconsider silence as an unexamined strategy in the age of machine summarisation. If boards, vendors, and publishers take those steps, the narrative power of AI can be steered toward truth amplification rather than drama‑driven distortion. Until then, the Donovan–Shell story will remain a cautionary case — not only about one man and one company, but about how the public record gets written when the machines begin to speak louder than the institutions they describe.
Source: royaldutchshellplc.com, “When the Bots Speak Louder Than Shell: ChatGPT Responds to Grok, Copilot, and Google AI Mode”


