Do You Know? The Crisis of “Model Collapse” When AI Keeps Consuming AI-Generated Information

Published: 2025-04-09

What happens when one AI learns from information generated by another AI? This article explains the frightening future of 'model collapse' where knowledge becomes increasingly contaminated.

When AI Keeps Consuming Information… Something Terrible Happens!

Let’s imagine something for a moment.

An AI reads your delicious curry recipe, slightly modifies it, and publishes it as “AI’s Curry Recipe.” Then another AI reads that, changes it further, and publishes it again… What would the recipe look like after this process repeats 100 times?

It would probably end up tasting strange, completely different from your original curry.

This same phenomenon is currently happening with all kinds of information on the internet. Researchers call this “model collapse.”

What is “Model Collapse”? A Simple Explanation

To explain model collapse in the simplest terms:

“It’s the phenomenon in which information progressively degrades as one AI creates information, a second AI learns from it and creates more, a third AI learns from that… and the cycle keeps repeating.”

Why is this a problem? Because information created by AI contains “errors” and “biases.” When this process repeats, these errors and biases snowball and become increasingly significant.
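
If it helps to see that cycle as a toy simulation, here is a deliberately simple sketch in Python. The “model” here is nothing more than a normal distribution described by its mean and spread, and the sample size (10) and number of generations (50) are made-up illustrative values, not anything from real AI training:

```python
import numpy as np

rng = np.random.default_rng(0)

# The "real" knowledge: a small, imperfect sample from the true distribution.
original_sample = rng.normal(loc=0.0, scale=1.0, size=10)
mu, sigma = original_sample.mean(), original_sample.std()

for generation in range(1, 51):
    # Each new generation never sees the real data -- only output produced by
    # the previous generation's model -- so its estimation errors are inherited
    # and fresh ones pile on top.
    synthetic = rng.normal(loc=mu, scale=sigma, size=10)
    mu, sigma = synthetic.mean(), synthetic.std()
    if generation % 10 == 0:
        print(f"generation {generation}: mean={mu:+.3f}, spread={sigma:.3f}")
```

Run it and you should see the “spread” value shrivel toward zero over the generations while the mean drifts away from the original 0.0: variety is lost and small errors pile up, which is the statistical heart of model collapse.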

An Easy-to-Understand Example: “All Cats Become Yellow”

A research study presented this example:

Suppose an AI is trained on photos of 90 yellow cats and 10 blue cats. This AI learns that “cats are primarily yellow.”

When this AI tries to draw a “blue cat,” it will likely draw a “bluish-yellow cat” rather than a purely blue one.

When the next AI learns from this “bluish-yellow cat” image, it learns that “blue cats are somewhat yellow.”

As this process repeats many times… eventually the concept of “blue cats” disappears, and all cats end up being depicted as yellow!
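
Here is the same story as a small sketch in code. Only the 90/10 starting split comes from the example above; the other numbers (1,000 images per generation, and the assumption that 10% of blue cats come out “bluish-yellow” enough to be counted as yellow) are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

p_blue = 0.10                  # from the example: 10% blue cats, 90% yellow
images_per_generation = 1000   # assumed size of each generation's training set
blue_stays_blue = 0.90         # assumed: 10% of blue cats are drawn "bluish-yellow"
                               # and end up counted as yellow by the next model

for generation in range(1, 31):
    # The current model draws cats; blue ones show up at roughly p_blue...
    blue_drawn = rng.binomial(images_per_generation, p_blue)
    # ...but some drift toward the majority color, so the next model is trained
    # on even fewer genuinely blue cats.
    clearly_blue = rng.binomial(blue_drawn, blue_stays_blue)
    p_blue = clearly_blue / images_per_generation
    if generation % 5 == 0:
        print(f"generation {generation}: blue cats = {p_blue:.2%} of the data")
```

The blue share keeps shrinking generation after generation and heads toward zero. Rare categories, the “tails” of the data, are exactly what model collapse erases first.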

The Terror of AI Falsehoods Becoming “Facts”

AI suffers from a problem called “hallucination,” where it generates information that isn’t factual.

For example, if an AI writes about a non-existent event like “Mount Everest erupted in 2023” and this gets published online, the next AI that learns from it will treat the “Everest eruption” as fact.

Even the latest AI models generate this kind of misinformation with a probability that is not zero; OpenAI’s newest model reportedly has a “hallucination rate” of around 1.4%.

This may seem like a small number, but imagine if 1.4% of the vast information on the internet becomes misinformation and continues to spread… It’s frightening, isn’t it?
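
As a rough back-of-the-envelope exercise (not a prediction), suppose every piece of content gets rewritten by an AI once per “cycle,” each rewrite introduces new fabrications at that 1.4% rate, and nothing is ever fact-checked or corrected. Those are strong simplifying assumptions, but they show how quickly a small error rate compounds:

```python
hallucination_rate = 0.014  # the reported ~1.4% figure, taken at face value

for cycles in (1, 10, 25, 50, 100):
    # Share of content untouched by any fabrication after n rewrites, assuming
    # errors are independent and are never cleaned up.
    clean = (1 - hallucination_rate) ** cycles
    print(f"after {cycles:3d} cycles: ~{1 - clean:.1%} of the content is contaminated")
```

Under those (admittedly crude) assumptions, roughly half of all content has been touched by some fabrication after about 50 cycles.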

This Isn’t Science Fiction! It’s Already Happening

This problem is already occurring around us:

  • When you search for something on Google, thin AI-written articles often appear at the top of the results
  • During the 2024 Noto Peninsula earthquake, fake rescue requests with non-existent addresses spread on social media
  • Cases of AI-generated “non-existent papers” being cited in academic papers are increasing

When the internet becomes filled with “information by AI for AI,” truth and facts become increasingly difficult for us humans to discern.

Are AI Companies Aware of This Problem?

Major AI companies like OpenAI, Google, and Anthropic are aware of this issue. They are working to reduce AI “hallucinations,” but a fundamental solution has not yet been found.

That’s because this isn’t simply a question of AI accuracy; it’s a problem involving the entire ecosystem of information on the internet. The fundamental issue is the quality of the information sources AI learns from: a model cannot become better than its training data (the old principle of “garbage in, garbage out”). AI companies alone cannot solve that.

What We Can Do → Develop “Immunity” to Misinformation

There are things each of us can do:

  • Develop a habit of evaluating the quality and reliability of information (regardless of the source type)
  • Support people who create high-quality original content
  • Expose ourselves to diverse information sources and cultivate critical thinking
  • Not just consume information, but also produce high-quality information ourselves

Information literacy will likely become the most important skill for living in the coming era.