The Internet Is Eating Itself

AI isn’t an intelligence but a giant stomach. Agree or disagree?

__________

Stephen Klein

The Internet Is Now 52% AI-Generated Content
And It’s Training On Itself

New research just dropped numbers that should terrify anyone who cares about truth:

**52% of all internet content is now AI-generated.**¹
In 2022, it was 10%.

But here’s where it gets insane (actually it’s all insane TBH):

**74% of ALL NEW web pages contain AI-generated content.**²
The internet added 3-5 billion pages monthly in 2024,
most of them synthetic.³

The internet isn’t just being eaten.
It’s being mass-produced by the thing that’s eating it.

Why This Matters

Large Language Models aren’t brains.
They’re giant stomachs.
They consume everything.
Digest nothing.
Excrete more content, which gets consumed again:
an infinite feedback loop of synthetic regurgitation.

Here’s what happens when AI trains on AI:

→ Model collapse: Recursive training causes “irreversible performance decay.”⁴
→ Narrowing of knowledge: Models reflect themselves, not reality
→ Death of originality: A hall of mirrors, each reflection dimmer than the last
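The collapse dynamic is easy to see in miniature. Below is a toy simulation, not the methodology of the Shumailov et al. paper cited in footnote 4, just an illustrative Gaussian analogue: a "model" that fits a mean and spread to its data, then generates the next generation's training data from that fit. The spread shrinks generation after generation, the statistical version of each mirror reflection coming out dimmer.

```python
import random
import statistics

def fit(samples):
    # "Train" a toy model: estimate the mean and spread of its training data.
    return statistics.mean(samples), statistics.pstdev(samples)

def generate(mu, sigma, n):
    # "Generate" synthetic data by sampling from the fitted model.
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(42)
data = [random.gauss(0, 1) for _ in range(10)]  # generation 0: "human" data

sigmas = []
for generation in range(300):
    mu, sigma = fit(data)
    sigmas.append(sigma)
    data = generate(mu, sigma, 10)  # each generation trains only on the previous one's output

# The estimated spread decays toward zero: diversity is lost, generation by generation.
print(f"first-generation spread: {sigmas[0]:.3f}, final spread: {sigmas[-1]:.2e}")
```

The small sample size (10 points per generation) exaggerates the effect for illustration, but the direction is the point: estimation error compounds, and the tails of the original distribution disappear first.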

We’re replacing:

Human nuance
Cultural context
Real expertise
Original thought
The truth

With statistically probable simulations.

The Economics

Licensing real content costs billions.
Synthetic data? Almost nothing.
So AI companies choose cheap scale over real knowledge.
No regulation. No transparency. No tracking.

OpenAI doesn’t ask permission to train on Anthropic’s outputs.
They just scrape the web.

Competition accelerates the collapse.
Every AI company races to build bigger models.

They need more data.
Synthetic data looks like a shortcut.
Collectively, they’re destroying the foundation their business depends on: real human knowledge.

We’ve Already Passed the Tipping Point

What happens when:

→ Medical information trains on synthetic medical papers?
→ Children learn history from recursive AI summaries?
→ Scientific research builds on fabricated datasets?

We don’t just lose quality.
We lose the ability to know what’s real.
The internet was humanity’s collective memory.
Now it’s becoming humanity’s collective hallucination.

The Bottom Line

LLMs are giant stomachs, not brains.
They consume.
They excrete.
They consume again.
__________
The trick with technology is to avoid spreading darkness at the speed of light.
Stephen Klein | Founder & CEO, Curiouser.AI | Teaches AI Ethics at UC Berkeley | Raising on WeFunder | Would love your support (LINK IN COMMENTS)

Footnotes:

¹ Graphite Research (2025). Analysis of 65,000 URLs from Common Crawl.
² Ahrefs (2025). Analysis of 900,000 newly created web pages in April 2025.
³ Common Crawl Foundation. Database adds 3-5 billion pages monthly.
⁴ Shumailov et al. (2024). “AI models collapse when trained on recursively generated data.” Nature, 631, 755-759.

See post on LinkedIn