10% and Rising: Measuring ChatGPT's Quiet Influence on Research

A new study posted on arXiv reports an unprecedented shift in scientific writing driven by large language models (LLMs) like ChatGPT. The research, conducted by a team from the University of Tübingen and Northwestern University, analyzed over 14 million biomedical abstracts from PubMed to track how academic writing style changed before and after the release of ChatGPT.

Key Findings

At least 10% of scientific abstracts published in early 2024 were likely processed using LLMs, with the true number potentially much higher.
The impact varied widely across fields and countries, reaching up to 30% in some areas like computational biology.
LLM usage was detected through an increase in certain style words and phrases favored by AI models.
The scale of this change surpassed even the dramatic shift in scientific vocabulary seen during the COVID-19 pandemic.

Methodology

The researchers developed a novel “excess word usage” approach, inspired by excess mortality studies during the pandemic. By comparing word frequencies before and after ChatGPT’s release, they identified words and phrases that showed abnormal increases in usage – a linguistic fingerprint of LLM involvement.
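
To make the idea concrete, here is a minimal Python sketch of an excess-usage calculation on a toy corpus. It is not the authors' code. The function names (`word_fractions`, `excess_usage`), the simple regex tokenizer, and the choice to project the expected frequency linearly from two pre-ChatGPT years are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_fractions(abstracts):
    """Return, for each word, the fraction of abstracts that contain it."""
    counts = Counter()
    for text in abstracts:
        # Count each word once per abstract (assumed tokenizer: letters only).
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(abstracts)
    return {word: n / total for word, n in counts.items()}


def excess_usage(by_year, baseline_years, target_year):
    """Compare observed word frequencies in target_year with an expected
    value projected from two baseline years.

    The linear projection is an assumed counterfactual, used here only to
    illustrate the "expected vs. observed" logic of an excess-usage analysis.
    """
    y0, y1 = baseline_years
    f0 = word_fractions(by_year[y0])
    f1 = word_fractions(by_year[y1])
    observed_fracs = word_fractions(by_year[target_year])

    results = {}
    for word, observed in observed_fracs.items():
        p0, p1 = f0.get(word, 0.0), f1.get(word, 0.0)
        slope = (p1 - p0) / (y1 - y0)
        # Keep the projected frequency inside the valid range [0, 1].
        expected = min(max(p1 + slope * (target_year - y1), 0.0), 1.0)
        results[word] = {
            "observed": observed,
            "expected": expected,
            "gap": observed - expected,  # absolute excess
            "ratio": observed / expected if expected > 0 else float("inf"),
        }
    return results


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    toy = {
        2021: ["we delve into cells", "cells divide", "mice sleep", "mice eat"],
        2022: ["we delve into mice", "cells grow", "cells die", "mice run"],
        2024: ["we delve into cells", "we delve into mice",
               "we delve into cells again", "mice run"],
    }
    scores = excess_usage(toy, baseline_years=(2021, 2022), target_year=2024)
    for word, s in sorted(scores.items(), key=lambda kv: -kv[1]["gap"])[:3]:
        print(word, s)
```

Words whose observed frequency sits far above the projected one are candidates for the "style word" fingerprint described above. Reporting both the gap and the ratio helps because an absolute gap favors common words while a ratio favors rare ones.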

Implications

While LLMs can improve readability and help non-native English speakers, the study raises concerns about potential negative impacts:

Homogenization of scientific writing styles
Propagation of biases present in LLM training data
Risk of factual errors or hallucinated content slipping into papers
Potential misuse by paper mills to generate fake research

Lead author Dmitry Kobak commented: “Our work shows that the effect of LLM usage on scientific writing is truly unprecedented and outshines even the drastic changes induced by the Covid-19 pandemic.”

The Future of Academic Publishing

This study provides hard data on a trend many have suspected – AI is rapidly transforming how scientific research is communicated. It highlights the urgent need for clear policies and guidelines around LLM use in academia. As these tools become more powerful and widespread, maintaining the integrity and diversity of scientific discourse will be a key challenge for the research community.

The authors suggest their methodology could be applied to track LLM usage in other domains like journalism, grant applications, and even creative writing. As AI continues to reshape how we create and consume written content, studies like this will be crucial for understanding its true impact on human communication and knowledge production.
