Millions of researchers could be affected by a “dramatic distortion of citation counts” likely caused by flaws in how the academic publishing giant Springer Nature handles article metadata, according to a new preprint.
The bug means a large number of citations are automatically attributed to the first paper in a given journal volume, instead of to whichever paper in that volume they were intended for. The issue appears to affect many of the publisher’s online-only titles, such as Nature Communications, Scientific Reports and several BMC journals.
“It seems that millions of scientists lost a few citations, while tens of thousands, the authors of Article 1s, gained all these, leading to insane citation counts,” Tamás Kriváchy of the Barcelona Institute of Science and Technology, in Spain, told us. His findings appeared earlier this month on arXiv.org. And those citation losses and gains are through no fault (or intention) of the authors themselves. In fact, one author we spoke with has tried, without success, to get mistaken citations removed from her paper.
A spokesperson for Springer Nature questioned the new data and said the preprint’s conclusions “could be misleading.”
According to the analysis, the distorted statistics appear on the journals’ own websites and in free citation databases such as Crossref, OpenCitations and Google Scholar. The problem could make it harder for scientists to find out which studies cite which and could give some authors unfair advantages in winning grants, promotions and jobs, Kriváchy said.
Whether the effects carry over to the two major commercial citation databases, which the researcher could not access, is unclear. But one expert told us they might.
“My analyses confirm that the study’s main concern is valid: citation-linking errors appear to be significant, systemic, and spill over even into curated databases such as Scopus and Web of Science,” said Lokman Meho, a bibliometrician at the American University of Beirut in Lebanon.
“If such mistakes are high, the implications could be profound,” Meho added. “Inflated citation counts distort measures of scholarly influence, misrank universities, mislead funding decisions, and compromise evidence-based science policy. They also challenge one of the field’s core assumptions: that curated citation databases are insulated from the problems encountered in open systems such as Crossref or OpenCitations.”
The preprint highlights a paper designated as “Article number 1” in the 2018 volume of Nature Communications, “Structural absorption by barbule microstructures of super black bird of paradise feathers.” The work has garnered more than 7,000 citations, according to the journal’s website, and Crossref, OpenCitations and Semantic Scholar provide comparable numbers. Meanwhile, Google Scholar lists 584 citing papers as of this writing, Clarivate’s Web of Science 582, and Scopus 1,323.
According to emails we have seen, the corresponding author of that paper, Dakota McCoy of the University of Chicago, contacted Nature Communications in April of this year, stating her paper was “frequently cited spuriously.” An editorial assistant supervisor for the journal replied: “I am afraid that we are unable to determine any steps we are able to take to resolve the issue on our side as it looks as if no errors occurred on the original publication of the paper.” They speculated that the issue may have come from the citing journal, may have been a citation error that kept getting propagated, or may somehow have been influenced by AI.
“This is a bizarre and annoying problem that we first noticed back in 2023 and haven’t been able to solve, despite emailing editors and Googling for hours. Even worse, the articles that are meant to receive those >400 citations aren’t receiving them!” McCoy told us by email. “It is unfortunate because it makes it difficult to track the true impact of our paper.”
“I’m so happy to see that this preprint may have identified the source issue,” she added.
McCoy’s coauthor Richard Prum, an ornithologist at Yale University, told us: “Many of the articles in Google Scholar that are counted as citations of us actually make no mention whatsoever of any research related to us or our paper! So, the problem is compounding!!”
Meho said he had confirmed the problems in an analysis of Scopus data for Prum and McCoy’s article, as well as three other papers in Nature Communications that also have more than 1,000 citations each, according to the database.
“When I extracted and examined the actual cited references in those citing papers, I found that fewer than 250 references in each case actually cited the target article. In other words, roughly three out of four citation links were erroneous, a discrepancy far too large to attribute to chance or isolated database glitches,” he told us. In at least one case, the errors did not seem to be explained by the bugs the preprint described, he said.
“The study also raises a larger question: Is this problem confined to Springer Nature, or is it an early warning of hidden structural vulnerabilities in how citation data are exchanged and standardized across all major publishers?” Meho said. “If millions of citation links can silently go astray in a system as central as Springer Nature’s, then research evaluation itself needs urgent scrutiny. The credibility of metrics, rankings, and even funding depends on the reliability of these invisible networks of data.”
According to Kriváchy, the problems appear to have originated with the advent of online-only journals several years ago. These publications typically reference articles using an article number instead of the page numbers traditional print journals use.
“Based on our analysis, the mis-citations happen primarily due to the above adaptation from a page-based numbering to an article number-based one; more specifically, from the improper technical handling of the change,” the preprint states. “The problem seems to stem from the absence of the Article Number in most formats of the article metadata obtained through the SpringerLink Application Programming Interface (API), or possibly from the handling of the fields in RIS file format provided by the publisher on Springer Nature Link websites.”
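To illustrate the kind of failure the preprint describes, here is a minimal sketch in Python. It is not Springer Nature's code or the preprint's method. It assumes a hypothetical citation matcher that looks up a reference by journal, volume and article number, and that substitutes "1" when the article number is missing from the metadata. The journal name, index contents and function names are all invented for the example.

```python
from typing import Optional

# Hypothetical index of one journal volume, keyed by
# (journal, volume, article number). All entries are invented.
INDEX = {
    ("Example Journal", "9", "1"): "Article 1",
    ("Example Journal", "9", "2041"): "Article 2041",
    ("Example Journal", "9", "3310"): "Article 3310",
}


def resolve(journal: str, volume: str, article_number: Optional[str]) -> Optional[str]:
    """Naive matcher. ASSUMPTION: a missing article number defaults to "1",
    the way an empty start-page field might in a page-based system."""
    locator = article_number if article_number else "1"
    return INDEX.get((journal, volume, locator))


def resolve_strict(journal: str, volume: str, article_number: Optional[str]) -> Optional[str]:
    """Safer variant: refuse to match when the article number is absent."""
    if not article_number:
        return None
    return INDEX.get((journal, volume, article_number))


# A reference intended for Article 2041 whose metadata lost the article number
# is silently credited to Article 1 by the naive matcher.
misattributed = resolve("Example Journal", "9", None)
unmatched = resolve_strict("Example Journal", "9", None)
```

Under these assumptions, every reference in the volume that arrives without an article number lands on Article 1, which is consistent with the pattern the preprint reports. The strict variant leaves such references unmatched instead of crediting the wrong paper.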
Springer Nature emphasized that, as a preprint, the new work “has not yet undergone peer review or independent validation.”
“Looking at the conclusions we suspect they could be misleading due to incomplete data,” a spokesperson said. “In the meantime, we are looking at all of the data ourselves as we are always open to feedback and to ensure that we continue to do the best for our authors.”
Kriváchy told us that, in addition to fixing the technical issues, the publisher “should put together a thorough report” addressing the cause of the problem as well as which journals have been affected and for how long.
Some of the damage won’t be fixable, however, said Alberto Baccini of the University of Siena, who studies publication metrics.
“The well-known Matthew Effect in bibliometrics indicates that highly cited papers become even more cited simply because they are perceived as important. Therefore, the initial metadata error has probably influenced researchers’ citation behavior, leading them to cite these ‘false’ highly cited papers precisely because of their high citation count,” Baccini told us. “After the data is corrected, how many of the remaining citations were received solely thanks to this mechanism? This is an unfixable problem.”
“We are all aware of the pollution infecting contemporary science and the mechanisms that have corrupted citation counts such as citation mills,” he added. “I hope that this ‘genuine’ error, originating from one of the major players in scientific publishing, will serve as a turning point. It should compel us to abandon our blind faith in quantitative metrics – a faith that has contributed so significantly to the corruption of contemporary science.”
Like Retraction Watch? You can make a tax-deductible contribution to support our work, follow us on X or Bluesky, like us on Facebook, follow us on LinkedIn, add us to your RSS reader, or subscribe to our daily digest. If you find a retraction that’s not in our database, you can let us know here. For comments or feedback, email us at [email protected].