
Study Reveals AI Degradation Due to Social Media


Poor-quality content leads to the degradation of large language models (LLMs). This is the conclusion reached by researchers from the University of Texas and Purdue University.

The researchers fed four popular AI models a selection of viral posts collected from X over one month and recorded the resulting changes.

The effect intensified in proportion to the volume of low-quality data. Notably, even subsequent retraining on clean, high-quality content could not completely eliminate the cognitive distortions.

How was the study conducted? 

In the experiment, the authors proposed and tested the “brain rot hypothesis” for AI models, which posits that continual exposure to “junk” information leads to persistent degradation of large language models.

To identify low-grade content, the researchers created two metrics: M1, based on engagement (flagging short but highly viral posts), and M2, based on semantic quality (flagging sensationalist, clickbait-style text).
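For illustration only, here is a minimal sketch of how an engagement-based filter in the spirit of M1 might look. The field names, thresholds, and scoring rule are assumptions for the example, not the study's published definition.

```python
# Illustrative sketch of an engagement-based "junk" filter (in the spirit
# of M1). Field names and thresholds are assumptions, not the study's
# published definition.

def is_junk_by_engagement(post: dict,
                          max_words: int = 30,
                          min_likes: int = 10_000) -> bool:
    """Flag short but highly viral posts as low-quality training data."""
    word_count = len(post["text"].split())
    return word_count <= max_words and post["likes"] >= min_likes

posts = [
    {"text": "you won't BELIEVE what happened next", "likes": 250_000},
    {"text": "A long, careful thread on transformer attention "
             "that walks through the math step by step.", "likes": 1_200},
]

junk = [p for p in posts if is_junk_by_engagement(p)]
print(f"{len(junk)} of {len(posts)} posts flagged as junk")  # 1 of 2
```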

With the number of tokens and training operations held constant, continual pretraining of four LLMs on the low-quality dataset led, compared to the control group, to a deterioration in reasoning, long-context comprehension, and safety metrics.

Gradually mixing the “junk” set into the control data also caused a decline in cognitive abilities. For instance, under the M1 metric, as the proportion of poor-quality data increased from 0% to 100%, the score on ARC-Challenge fell from 74.9 to 57.2, and the score on RULER-CWE fell from 84.4 to 52.3.
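This dose-response design can be pictured as assembling training sets at fixed junk ratios and re-evaluating after each run. The sketch below shows only the mixing step; the corpora, sizes, and the training and evaluation hooks are placeholders, not the study's actual pipeline.

```python
import random

# Placeholder corpora standing in for the "junk" and control datasets.
junk_docs = [f"viral post {i}" for i in range(20_000)]
clean_docs = [f"quality doc {i}" for i in range(20_000)]

def mix_dataset(junk: list[str], clean: list[str],
                junk_ratio: float, size: int, seed: int = 0) -> list[str]:
    """Sample a fixed-size training set with a given share of junk documents."""
    rng = random.Random(seed)
    n_junk = round(size * junk_ratio)
    mixed = rng.sample(junk, n_junk) + rng.sample(clean, size - n_junk)
    rng.shuffle(mixed)
    return mixed

# Sweep the junk share from 0% to 100%, as in the dose-response comparison.
for ratio in (0.0, 0.2, 0.5, 0.8, 1.0):
    train_set = mix_dataset(junk_docs, clean_docs, junk_ratio=ratio, size=10_000)
    # ... continually pretrain a model on train_set, then score it on
    # ARC-Challenge and RULER-CWE to trace the decline.
```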

The models also suffered a decline in ethical consistency. The researchers noted that AI exposed to low-quality data became less reliable and more confident in incorrect answers.

The LLMs began skipping logical steps in their reasoning, producing superficial answers instead of detailed explanations.

What can be done? 

Researchers urged AI developers to systematically monitor the cognitive health of models and recommended three key mitigation steps.
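As one illustration of what such monitoring could look like in practice, the sketch below re-checks fixed benchmarks against stored baselines after each data refresh. The baseline figures reuse the scores cited above, while the evaluate() hook and the tolerance value are assumptions, not anything the study prescribes.

```python
# Hypothetical "cognitive health check": re-run fixed benchmarks after each
# training-data refresh and alert on regressions. Baseline scores reuse the
# figures cited above; the evaluate() hook and tolerance are assumptions.

BASELINE = {"ARC-Challenge": 74.9, "RULER-CWE": 84.4}
TOLERANCE = 2.0  # allowed drop, in points, before raising an alert

def evaluate(model, benchmark: str) -> float:
    """Placeholder for whatever evaluation harness a team actually runs."""
    raise NotImplementedError

def health_check(model) -> list[str]:
    """Return alerts for benchmarks that regressed past the tolerance."""
    alerts = []
    for bench, baseline in BASELINE.items():
        score = evaluate(model, bench)
        if baseline - score > TOLERANCE:
            alerts.append(f"{bench} regressed: {score:.1f} vs baseline {baseline:.1f}")
    return alerts
```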

The researchers stated that such measures are needed to prevent significant damage: models currently continue to learn from internet data, and without appropriate controls, AI risks inheriting distortions from AI-generated content, setting off a cycle of degradation.

Earlier, NewsGuard experts identified a tendency of OpenAI’s Sora 2 to create deepfakes. 
