The Paper Was Published

In 2023, over ten thousand papers were retracted from the scientific literature. The system that caught them is the same system that published them.

Cedric Atkinson

A headline appears in your newsfeed. The headline cites a study. The study was published in a peer-reviewed journal. The journal charged the researcher a fee to publish it. The researcher needed the publication to keep a job. The journal needed the fee to stay in business. The university needed the publication count to climb a ranking. The ranking needed the journal to validate the count.

Nobody in the chain was optimizing for whether the finding was true.

The fee, if you are curious, starts at roughly two thousand dollars. Nature, the most prestigious scientific journal in the world, charges $12,850 to publish a single open-access paper.1 The researcher pays. The journal publishes. The university counts. The ranking updates.

The finding may or may not be real. The process does not check.

If the researcher cannot afford $12,850, there are alternatives. A paper mill in China or Russia will sell first authorship on a fabricated paper for as little as twenty dollars.2 For a few hundred, the paper comes with a guarantee: acceptance in a journal with an editor or reviewer on the mill's payroll. The author's name appears on a study the author did not write, analyzing data the author did not collect, reaching conclusions nobody tested.

The price, at the low end, is $395.

The factory floor

In 2021, three researchers at French universities noticed something strange in the scientific literature. Guillaume Cabanac and Cyril Labbé, working with Alexander Magazinov, found papers in established journals containing phrases no scientist would write. "Antidotal" in place of antiviral. "Sham pondering" in place of artificial intelligence. "Profound learning" in place of deep learning.3

The phrases were not errors. They were artifacts of paraphrasing software designed to evade plagiarism detection. The software took existing papers, ran them through a synonym generator, and produced new text that passed automated screening. The technical terms, treated as ordinary words, became nonsense. The nonsense was published.

Cabanac built a tool called the Problematic Paper Screener. It filters 130 million scholarly papers each week for nine types of textual abnormality. It has been instrumental in more than one thousand retractions.4

The retractions are a fraction of the output. In 2026, a team led by Jennifer Byrne published the first large-scale estimate of paper mill contamination in a single field. They screened 2.6 million cancer research papers published between 1999 and 2024. Their machine learning model, with 91% accuracy, flagged 9.87% of them as probable paper mill products.5

One in ten cancer papers.

For specific cancer types, the figures were higher. Gastric cancer: 22%. Bone cancer: 21%. Liver cancer: 20%.6

Readers found the contamination. The system that published it is still counting.

The papers are not random noise. They use real data. After the release of ChatGPT in November 2022, a specific pattern emerged. Researchers, or the software working on their behalf, began downloading public health datasets and running automated analyses. The National Health and Nutrition Examination Survey, a database maintained by the U.S. Centers for Disease Control, became the raw material. Pick any two variables. Run an automated statistical test. Produce a finding that looks real because the data is real. The science is not.
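The dredging loop is mechanical enough to simulate. A minimal sketch (hypothetical variables, not NHANES data): forty mutually independent random "health variables," every pairwise association tested, the nominally significant ones kept as "findings." Even with zero true associations, the p < 0.05 threshold alone manufactures dozens of publishable results.

```python
import math
import random
import statistics

random.seed(1)

def correlation_pvalue(x, y):
    """Approximate two-sided p-value for a Pearson correlation, using a
    normal approximation to the t statistic (adequate for n = 500)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

# Forty independent "health variables" measured on 500 people:
# every true association is zero by construction.
n_people, n_vars = 500, 40
data = [[random.gauss(0, 1) for _ in range(n_people)] for _ in range(n_vars)]

# Dredge: test every pair, keep the "significant" ones as findings.
hits = sum(
    1
    for i in range(n_vars)
    for j in range(i + 1, n_vars)
    if correlation_pvalue(data[i], data[j]) < 0.05
)
pairs = n_vars * (n_vars - 1) // 2  # 780 tests in total
# Roughly 5% of the 780 tests come back "significant" by chance alone.
print(f"{hits} 'findings' out of {pairs} tests on pure noise")
```

Each of those hits, written up alone, looks exactly like a real single-factor result: real data, standard test, p < 0.05.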

Matt Spick, an associate editor at Scientific Reports, documented the explosion. Before ChatGPT, an average of four papers per year analyzed single-factor relationships in NHANES data. In 2024, the number was 190. The association between oxidative balance score and chronic kidney disease was published six separate times by six different groups.7

"I was getting so many nearly identical papers," Spick said. "One a day, sometimes even two a day."8

The retraction count traces the arc. In 2002, 119 papers were retracted from the scientific literature worldwide. In 2010, when Retraction Watch began tracking, the number was roughly 400. In 2022, it passed 4,600. In 2023, it crossed ten thousand. The database now holds more than 63,000 entries.9

One researcher alone accounts for 236 of them. Joachim Boldt, a German anesthesiologist formerly at Ludwigshafen Hospital, fabricated data on fluid management during surgery for years. His work influenced treatment decisions for real patients.10

Steven Zielske, a researcher at Wayne State University, applied for a grant to study a molecule called SNHG1, which had been linked to prostate cancer. He was denied. A reviewer wrote that the field was "crowded." Zielske went back and read the roughly 150 papers on SNHG1 and cancer, nearly all from Chinese hospitals. He concluded that a majority looked fake.11

The following year, he explained in his application that most of the literature probably came from paper mills. He received the grant.

"You can't just read an abstract and have any faith in it," Zielske said. "I kind of assume everything's wrong."12

The industry that produces this output is estimated to be worth hundreds of millions of dollars annually. One documented operation, International Publisher LLC in Russia, collected $6.5 million from co-authorship slots between 2019 and 2021 alone.13 The journals that publish the output collected $2.538 billion in article processing charges in 2023. MDPI, a single open-access publisher, took in $681.6 million.14

The demand funds the supply. The metric sits between them, counting.

The penthouse

The paper mill operates at the industrial end. At the other end, the currency is not cash. It is a career.

In June 2023, a group of three researchers who run a blog called Data Colada published a four-part series titled "Data Falsificada." The target was Francesca Gino, one of the most cited behavioral scientists in the world and the Tandon Family Professor of Business Administration at Harvard Business School.15

Her research area was dishonesty.

Data Colada identified fabricated data in at least four of Gino's published papers. The evidence included statistical anomalies, impossible distributions, and patterns consistent with manual alteration of Excel spreadsheets. They had sent their findings to Harvard privately in 2021. Harvard conducted its own investigation and produced a 1,300-page report concluding that Gino "committed research misconduct intentionally, knowingly, or recklessly."16

Harvard placed Gino on unpaid administrative leave, stripped her named professorship, and barred her from campus. In May 2025, Harvard revoked her tenure. It was the first tenure revocation at the university since the 1940s.17

Gino filed a $25 million defamation lawsuit against Harvard and Data Colada. Harvard counter-sued in August 2025, alleging she had submitted a falsified dataset in her own defense.18

The researcher who built her career studying the gap between what people say and what people do was demonstrating the gap.

One of the four papers Data Colada flagged was co-authored with Dan Ariely, a professor at Duke and the author of "Predictably Irrational," a New York Times bestseller on the psychology of dishonesty. The study claimed that signing an honesty declaration at the top of a form reduced dishonest behavior. The insurance company that provided the underlying data confirmed that the data in the published paper did not match what it had provided. File metadata showed Ariely was the creator and last modifier of the spreadsheet. The company's data covered roughly six thousand vehicles. The published study reported over thirteen thousand.19

Two dishonesty researchers. One shared paper. Both fabricated.

The same month Data Colada's series went live, Marc Tessier-Lavigne resigned as president of Stanford University. The Stanford Daily, the student newspaper, had reported in November 2022 that images in his neuroscience papers showed signs of manipulation. A Stanford Board investigation found manipulated data in five papers spanning a decade. Elisabeth Bik, an image integrity specialist, had flagged problems on PubPeer as early as 2015.20

The investigation concluded Tessier-Lavigne had not personally manipulated the images. It also concluded that he had failed to correct or retract the papers once problems were flagged. He agreed to retract three and correct two.

The institutional brand, whether Harvard or Stanford, was doing the same work as "peer-reviewed." The signal was the product.21

Andrew Wakefield published a twelve-author paper in The Lancet on February 28, 1998, claiming a link between the MMR vaccine and autism in a case series of twelve children. The Lancet retracted the paper on February 2, 2010. Twelve years.22

In the interval, MMR vaccination rates in the United Kingdom dropped from 92% to 80% nationally. In parts of London, they fell to 58%. The threshold for herd immunity is 95%. Measles outbreaks followed. In 2013, more than 1,200 cases were reported in Swansea, Wales alone. One person died. Emergency distribution of 50,000 vaccines was required.23

An investigation by the journalist Brian Deer, published in the BMJ in 2011, revealed that Wakefield had been hired by a lawyer at £150 per hour, plus expenses, to manufacture evidence for a lawsuit against MMR vaccine manufacturers. The arrangement began two years before the 1998 paper. Wakefield received more than £435,000 in fees and expenses. The BMJ declared the study "an elaborate fraud."24

Wakefield was struck off the UK Medical Register in May 2010. He moved to the United States and directed an anti-vaccine film.

The most consequential fraud in the dataset did not involve vaccines. In March 2006, Nature published a paper by Sylvain Lesné, a researcher at the University of Minnesota, identifying a specific protein called Aβ*56 that appeared to cause memory impairment in mice. The paper provided what seemed to be strong experimental evidence for the amyloid cascade hypothesis, the dominant theory guiding Alzheimer's research and drug development for two decades.25

In July 2022, investigative journalist Charles Piller published a six-month investigation in Science. Matthew Schrag, a neuroscientist at Vanderbilt, had discovered suspicious images on PubPeer. Western blot images in twenty of Lesné's papers appeared to have been spliced, duplicated, and digitally altered to fabricate or enhance the presence of the protein.26

Nature retracted the paper in June 2024. It had been cited approximately 2,500 times. It is the second most cited retraction in history.27

The National Institutes of Health spent roughly $1.6 billion on amyloid-related Alzheimer's research in fiscal year 2022, approximately half of all federal Alzheimer's funding. An estimated $42 billion was spent on more than one thousand Alzheimer's clinical trials between 1995 and 2021, the majority targeting amyloid. The failure rate of amyloid-targeting drugs was near 100%. Pfizer shut down its Alzheimer's drug-discovery program in 2018.28

One manipulated paper. Eighteen years. $42 billion in clinical trials directed at a mechanism built on fabricated evidence.

Researchers who challenged the amyloid orthodoxy during those eighteen years were marginalized in funding, publishing, and tenure decisions. The field has been estimated to be fifteen to thirty years behind where it could have been.29

The metric

The paper mill is the pathological version. The legitimate system is the chronic version. The difference is one of degree, not kind.

In August 2015, the Open Science Collaboration published the largest systematic attempt to replicate published research in history. Two hundred and seventy researchers repeated one hundred psychology experiments that had been published in three leading journals. In the original publications, 97% of the studies had found statistically significant results. In the replications, 36% did.30

Sixty-four percent of published findings, from the field's top journals, did not hold up when someone ran the experiment again.

The effect sizes, when studies did replicate, were half the magnitude of the originals.31

Psychology was not an outlier. When Amgen scientists tried to replicate 53 "landmark" cancer biology studies, 6 succeeded. An 89% failure rate.32 Bayer attempted 67 preclinical studies in oncology, cardiovascular, and women's health. Only 20 to 25% fully reproduced.33 The Reproducibility Project: Cancer Biology planned to replicate 193 experiments from 53 high-impact papers. They completed 50. Not because the replications failed, but because the original methods were insufficiently described, the authors did not respond to requests for materials, or the reagents were no longer available.34

Of the 50 they completed, effect sizes were smaller than the originals in 92% of cases. Eighty-five percent smaller on average.35

Nobody fabricated these studies. They were conducted by researchers at leading institutions, reviewed by peers, and published in respected journals. The replication failures came from the system working exactly as intended.

The design works like this. A researcher runs an experiment. If the result is positive and statistically significant, it can be published. If the result is null, it usually cannot. Robert Rosenthal named this the "file drawer problem" in 1979. The null results go in the drawer.36
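The filter's effect on the literature can be simulated directly. A minimal sketch (all numbers illustrative, not drawn from the cited studies): a thousand labs study one small real effect with underpowered samples, and a journal that publishes only significant results keeps the lucky overestimates.

```python
import math
import random
import statistics

random.seed(2)

def one_study(true_effect, n=20):
    """Two-group experiment; returns (observed effect, significant?)."""
    control = [random.gauss(0, 1) for _ in range(n)]
    treated = [random.gauss(true_effect, 1) for _ in range(n)]
    d = statistics.fmean(treated) - statistics.fmean(control)
    se = math.sqrt(statistics.variance(control) / n
                   + statistics.variance(treated) / n)
    return d, abs(d / se) > 1.96  # roughly p < .05

# 1,000 labs study the same small real effect (0.2) with small samples.
true_effect = 0.2
results = [one_study(true_effect) for _ in range(1000)]

published = [d for d, sig in results if sig]   # the journal's filter
drawer = [d for d, sig in results if not sig]  # Rosenthal's file drawer

print(f"published: {len(published)} studies, "
      f"mean effect {statistics.fmean(published):.2f}")
print(f"file drawer: {len(drawer)} studies, "
      f"mean effect {statistics.fmean(drawer):.2f}")
# With n = 20 the test is underpowered, so only estimates that happen to
# overshoot cross the significance bar: the published mean lands well
# above the true 0.2, and most of the honest results sit in the drawer.
```

Nobody in the simulation fabricates anything. The published literature still reports an effect several times larger than the real one, which is the pattern the replication projects kept measuring.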

The consequences are measurable. In a study that tracked 221 research projects funded by the National Science Foundation, strong results were 40 percentage points more likely to be published than null results. Only 21% of null findings made it to a journal. Sixty-two percent of strong findings did.37

In 2021, a team of researchers compared 71 Registered Reports, where the study design is reviewed and accepted before data collection, with 152 standard reports in psychology. Standard reports showed positive results 96% of the time. Registered Reports showed positive results 44% of the time.38

The 52-point gap measures how much the standard system suppresses findings that do not confirm the hypothesis.

The paper mill sells fabricated results. The legitimate system buries real ones. The output is the same: a literature that overstates what is true.

The positive result rate has been climbing. Across 4,600 papers in all disciplines from 1990 to 2007, the share reporting positive findings grew from 70% to 86%.39 Not because science was getting better at finding things. Because the filter was getting tighter.

In 2005, John Ioannidis, a Stanford epidemiologist, published a paper titled "Why Most Published Research Findings Are False." He demonstrated mathematically that for most common research configurations, the probability that a published finding is false exceeds 50%. The paper has been cited more than ten thousand times. It is the most accessed article in the history of PLOS.40

The title was not an accusation. It was a calculation.
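The calculation fits in a few lines. A significant result is true when it is a true positive rather than a false positive, so the share of true findings follows from three numbers: the prior probability that a tested hypothesis is real, the statistical power, and the significance threshold. The parameter values below are illustrative, not Ioannidis's exact tables.

```python
def prob_finding_true(prior, power, alpha=0.05):
    """Share of statistically significant findings that are real:
    true positives / (true positives + false positives)."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# A field testing mostly long-shot hypotheses (1 in 10 true)
# at a typical power of 0.35:
ppv = prob_finding_true(prior=0.10, power=0.35)
print(f"P(finding is true | significant) = {ppv:.2f}")  # prints 0.44
```

Under those assumptions, 56% of the significant findings are false before any bias, fraud, or selective publication enters the picture. The mechanisms above only push the number further.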

The chain

Trace the chain forward from the published paper. A researcher produces a finding. A journal publishes it. Another researcher cites it. A systematic review collects it alongside fifty similar papers. A meta-analysis pools the data. A clinical guideline committee reads the meta-analysis. A guideline is updated. A doctor follows the guideline. A patient receives the recommendation.

The finding accumulates authority at every step. Nobody re-runs the experiment.

The doctor in the room with the patient did not read the original paper.41 The guideline committee that produced the recommendation may not have checked whether the underlying studies replicated. The meta-analysis included the published results, because the unpublished null results were in file drawers. The systematic review included papers from journals whose contamination rate in some fields is one in five.

The patient hears two words: "Studies show."

The two words point backward through the chain. They point to a meta-analysis. The meta-analysis points to twenty studies, of which the null results were filtered out before publication. The published studies point to journals whose business model charges the researcher to appear. The researcher points to a tenure committee that counted the publications. The tenure committee points to a metric that measures volume.

The recommendation arrives in the examination room with the weight of the entire chain behind it. The chain is real. The links are documented. Whether the finding at the origin is true is a separate question, and the chain does not carry the answer.

The inherited belief

"Peer-reviewed and published" was a reasonable proxy for quality when it was formed. When the phrase entered common use, the barriers to publication were high enough to make the label a genuine signal.

Journals were printed on paper. Press runs were expensive. Page counts were limited. Editors rejected the vast majority of submissions because there was no room for them. The cost of publishing imposed a selection pressure that was not designed to filter for quality but achieved it as a side effect. The scarcity of the medium did the work the label now claims to do.

The conditions changed. Open-access publishing removed the scarcity. Article processing charges created a revenue incentive to publish more, not less. Digital distribution made page counts unlimited. The journal that once rejected 95% of submissions could now accept every paying customer.

The label stayed. "Peer-reviewed and published" still points to the old system in the reader's mind. The reader who hears it imagines the barrier. The barrier is lower than it has ever been. In some corners, it is for sale.42

The belief was correct when it was given. The conditions that made it correct were removed. Nobody went back to check.

The mortgage advice was correct when price-to-income was three to four times annual earnings.43 The school rating measured income, not quality.44 The brand measured recognition, not the relationship.45 The credential measured completion, not learning.46 Each inherited belief survived the conditions that made it true.

"Peer-reviewed and published" is the version that applies to the system that produces knowledge itself.

The credential

The paper mill sells the signal of research. The degree mill sells the same signal for education.

Axact, a company based in Karachi, operated at least 370 fake university websites. Between 2009 and 2015, it sold fake degrees to more than 200,000 people in 197 countries. A New York Times investigation in 2015 found that the operation collected at least $140 million. Staff numbering roughly two thousand, some posing as American educational officials, operated around the clock.47

In 2004, the U.S. Government Accountability Office reported that 463 federal employees had obtained degrees from three unaccredited schools examined in the investigation. Twenty-eight senior officials at eight agencies listed those degrees on personnel records. Three employees at the National Nuclear Security Administration held diploma mill degrees and had security clearances.48

In 2023, the FBI uncovered a scheme that had produced more than 7,600 fake nursing diplomas from three schools in South Florida. Buyers paid roughly $15,000 each. They used the diplomas to obtain state nursing licenses. Twenty-five defendants were charged. The FBI warned that over 7,600 people with fraudulent credentials were "potentially in critical healthcare roles treating patients."49

The degree mill is the paper mill with the product reversed. One sells evidence of research that did not happen. The other, of learning.

The customer needs the credential the way the researcher needs the publication. The degree mill charges $15,000 for a nursing diploma. The paper mill charges $14,800 for co-first authorship on a cancer paper. The prices converge because the products are the same: a label that the system recognizes, attached to a process that did not occur.

The counter-case

A civil engineer publishes a paper on bridge load tolerances. The bridge is built. The bridge holds or it does not. The published finding meets a test that has nothing to do with peer review.

In particle physics, two independent detectors at the Large Hadron Collider, ATLAS and CMS, had to observe the Higgs boson separately before the discovery was announced.50 The replication preceded the publication.

Both fields run on the same publication metrics. Researchers need papers. Journals charge fees. Tenure committees count. The incentive structure is identical. The output is different.

The difference is an external test.21 The bridge collapses. The detector contradicts. The published finding meets something outside the system that published it.

Psychology, cancer biology, and social science have no bridge. The finding is checked by other findings, reviewed by other reviewers, cited by other papers. The system validates itself. The replication crisis emerged in the fields where this loop was tightest.

Registered Reports work because they introduce an artificial version of that test. Pre-committing to the study design before data collection forces the process to be visible before the outcome is known. The journal agrees to publish regardless of the result. The positive result rate drops from 96% to 44%.51 The 44% is what a healthy literature looks like. The 96% was the distortion.

More than 300 journals now accept Registered Reports, up from 11 in 2015.52 ClinicalTrials.gov holds more than 530,000 registered studies from 226 countries.53 The National Institutes of Health, as of January 2023, requires funded researchers to share their data by publication.54

Retractions, the number that seems most alarming, are where the external test is arriving. Ten thousand retractions in 2023 are ten thousand corrections. Zero retractions means nobody checked.55

Richard Horton, editor of The Lancet, wrote in 2015: "Much of the scientific literature, perhaps half, may simply be untrue."56 He was describing the cost of a system with no bridge.

This piece relies on footnotes. The footnotes point to published papers, government reports, investigative journalism, and databases. They point to the same system the piece just traced.

The data survives not because the label "peer-reviewed" protects it. The retraction data comes from Retraction Watch, which is independently verifiable. The replication data comes from experiments that were re-run and reported in full. The fraud cases come from investigations whose evidence, the manipulated images, the altered spreadsheets, the mismatched datasets, is publicly documented. The Alzheimer's funding figures come from NIH budget records. The degree mill convictions come from court proceedings.

The trust is not in "published." The trust is in "checkable."

A headline will appear in your newsfeed tomorrow. It will cite a study. The study will have been published in a peer-reviewed journal. The journal will have charged the fee. The researcher will have needed the line. The university will have counted it.

The finding may or may not be real. The process does not check. Someone has to.

Sources

  1. Nature's article processing charge for open access as of 2025: $12,850. Source: Nature portfolio pricing. Median gold OA APC across journals: ~$2,000 (Delta Think Market Sizing, 2025).
  2. Nature investigation by Christine Ro and Jack Leeming, June 9, 2025, documented a range of $20–$700 for adding names to accepted papers. Russian site 123mi.ru offered slots from several hundred to $5,000 (Science, April 6, 2022, based on Anna Abalkina research).
  3. Guillaume Cabanac, Cyril Labbé, and Alexander Magazinov, "Tortured phrases: A dubious writing style emerging in science," arXiv:2107.06751, July 12, 2021.
  4. Cabanac's Problematic Paper Screener filters 130 million papers weekly. Source: CNRS News profile; The Conversation, January 2025.
  5. Baptiste Scancar, Jennifer A. Byrne, David Causeur, and Adrian G. Barnett, "Machine learning based screening of potential paper mill publications in cancer research: methodological and cross sectional study," BMJ, 2026; 392:e087581.
  6. Gastric cancer: 22% flagged. Bone cancers/osteosarcoma: 21%. Liver cancer: 20%. Source: same BMJ study.
  7. Matt Spick et al., "Explosion of formulaic research articles, including inappropriate study designs and false discoveries, based on the NHANES US national health database," PLOS Biology, May 8, 2025.
  8. Matt Spick, quoted in Science, May 14, 2025.
  9. Retraction Watch database and year-end reviews (2022, 2023, 2024). 2002 count: 119 (RW year-end, Dec 27, 2022). 2023 count: over 10,000 (Nature, Dec 12, 2023). Database total as of March 2026: over 63,000.
  10. Joachim Boldt: 236 retractions. Retraction Watch Leaderboard. Runner-up: Yoshitaka Fujii, 172 retractions.
  11. Steven Zielske, Wayne State University, and Frank Cackowski, Karmanos Cancer Institute. Source: The Conversation, January 29, 2025, six-month investigation by Ivan Oransky, Guillaume Cabanac, and Cyril Labbé.
  12. Zielske, quoted in same investigation.
  13. International Publisher LLC revenue: Anna Abalkina, cited in Nature, June 9, 2025. Industry estimated at "hundreds of millions of dollars a year" (conservative, Abalkina).
  14. Global APC revenue, 2023, six major publishers: $2.538 billion. MDPI: $681.6 million. Source: OPUS Project/arXiv study, July 2024.
  15. Data Colada, "Data Falsificada" series, Parts 1–4, June 17–30, 2023. Findings first sent to Harvard in 2021.
  16. Harvard investigation: 1,300-page report. Quote: "committed research misconduct intentionally, knowingly, or recklessly." Source: Science; Harvard Crimson.
  17. Harvard revoked Gino's tenure May 27, 2025. First tenure revocation since the 1940s. Source: Harvard Crimson; NBC News.
  18. Gino filed $25 million lawsuit August 2, 2023. Defamation claims against Data Colada dismissed. Harvard counter-sued August 18, 2025. Source: Harvard Crimson.
  19. Dan Ariely and the 2012 PNAS "signing at the top" paper. File metadata, Hartford insurance data discrepancy. Source: Data Colada Post #98, August 2021; BuzzFeed News; Science.
  20. Marc Tessier-Lavigne: Stanford Daily investigation, November 29, 2022. Resignation July 19, 2023. Stanford Board investigation: five papers with manipulated data, 1999–2009. Elisabeth Bik flagged issues on PubPeer as early as 2015. Source: Stanford Daily; Science; NPR.
  21. The authentication process in academic publishing is internal: peers evaluate peers. There is no external test equivalent to the bridge that collapses or the company that goes bankrupt. See Thomas Sowell, "Intellectuals and Society" (2009) and "Knowledge and Decisions" (1980).
  22. Andrew Wakefield et al., The Lancet, February 28, 1998. Retracted February 2, 2010.
  23. UK MMR uptake: 92% (1995–96) to 80% (2003), 58% in parts of London. Swansea outbreak 2013: 1,200+ cases, one death. Sources: PMC; NPR; CIDRAP.
  24. Brian Deer investigation, BMJ, January 2011. £150/hour arrangement with lawyer Richard Barr, beginning two years before the 1998 paper. BMJ editorial declared the study "an elaborate fraud."
  25. Sylvain Lesné et al., "A specific amyloid-beta protein assembly in the brain impairs memory," Nature, March 16, 2006.
  26. Charles Piller, investigative report, Science, July 21, 2022. Matthew Schrag (Vanderbilt) discovered image manipulation. Twenty Lesné papers flagged.
  27. Nature retracted the Lesné paper June 24, 2024. ~2,500 citations. Second most cited retraction in history. Lesné resigned from University of Minnesota effective March 1, 2025.
  28. NIH amyloid spending: ~$1.6 billion in FY2022. $42 billion estimated for 1,000+ Alzheimer's clinical trials (majority amyloid-targeting), 1995–2021. Near 100% failure rate. Pfizer closed Alzheimer's drug discovery 2018. Sources: Science; STAT; Scientific American.
  29. Researchers challenging the amyloid orthodoxy were marginalized. Estimated 15–30 years behind. Sources: STAT (June 25, 2019); Scientific American.
  30. Open Science Collaboration, "Estimating the Reproducibility of Psychological Science," Science, Vol. 349, August 28, 2015. 100 studies. 36% replicated.
  31. Same study. 97% of originals reported significant results. 36% of replications did. Effect sizes half the originals.
  32. C. Glenn Begley and Lee M. Ellis, "Raise Standards for Preclinical Cancer Research," Nature, March 2012. Amgen: 6 of 53 replicated (11%).
  33. Florian Prinz, Thomas Schlange, and Khusru Asadullah, "Believe it or not," Nature Reviews Drug Discovery, 2011. Bayer: 20–25% fully reproduced.
  34. Reproducibility Project: Cancer Biology, eLife, December 2021. Planned 193 experiments from 53 papers. Completed 50 from 23.
  35. Of 50 completed experiments: effect sizes smaller than originals in 92% of cases, 85% smaller on average. Source: same eLife collection.
  36. Robert Rosenthal, "The File Drawer Problem and Tolerance for Null Results," Psychological Bulletin, Vol. 86, No. 3, 1979.
  37. Annie Franco, Neil Malhotra, and Gabor Simonovits, "Publication bias in the social sciences," Science, 2014. 221 studies tracked through NSF-funded TESS program.
  38. Anne M. Scheel, Mitchell R.M.J. Schijen, and Daniel Lakens, "An Excess of Positive Results," Advances in Methods and Practices in Psychological Science, 2021.
  39. Daniele Fanelli, "Negative results are disappearing from most disciplines and countries," Scientometrics, Vol. 90, No. 3, 2012. Positive results: 70% (1990) to 86% (2007).
  40. John P.A. Ioannidis, "Why Most Published Research Findings Are False," PLOS Medicine, Vol. 2, Issue 8, August 30, 2005. Over 10,000 citations. Most accessed PLOS article.
  41. For the structural analysis of how incentives shape medical recommendations, see /doctor on this site. For James C. Scott's analysis of how the Science Citation Index became "a force in the world, capable of generating its own observations," see "Two Cheers for Anarchism" (2012), Fragment 24. Scott quotes Goodhart's law: "when a measure becomes a target it ceases to be a good measure." For Bastiat's framework of the seen vs. the unseen applied to publication metrics, see "That Which Is Seen and That Which Is Not Seen" (1850).
  42. See Hayek on the knowledge problem applied to distributed review systems: "The Use of Knowledge in Society" (1945). Peer review compresses contextual knowledge about a study's validity into a binary accept/reject signal.
  43. The mortgage advice was correct when price-to-income was 3–4x. See /mortgage-inherited.
  44. School ratings measure income, not quality. See /school.
  45. Brand measures recognition, not the relationship. See /brand.
  46. The degree measures completion, not learning. 80% signaling. See Bryan Caplan, "The Case Against Education" (2018), and /degree.
  47. Declan Walsh, "The Lucrative Business of Fake Diplomas," New York Times, May 17, 2015. Axact: 370+ fake websites, 200,000+ customers, 197 countries, at least $140 million.
  48. GAO-04-771T, "Diploma Mills," testimony May 11, 2004. 463 federal employees, 28 senior officials, 3 at National Nuclear Security Administration with security clearances.
  49. DOJ/FBI/HHS-OIG, Operation Nightingale, 2023. 7,600 fake nursing diplomas. 25 defendants charged. Over $100 million in revenue.
  50. ATLAS Collaboration, "Observation of a new particle in the search for the Standard Model Higgs boson," Physics Letters B, Vol. 716, September 2012. CMS Collaboration, "Observation of a new boson at a mass of 125 GeV," Physics Letters B, Vol. 716, September 2012. Both detectors independently observed the signal at 5-sigma significance; joint announcement July 4, 2012.
  51. Scheel et al. (2021), same study as footnote 38.
  52. Registered Reports journal adoption: 11 in 2015, 300+ by 2025. Source: Center for Open Science.
  53. ClinicalTrials.gov: 530,000+ registered studies as of March 2025. Source: NLM Director's blog, April 2, 2025.
  54. NIH Data Management and Sharing Policy, effective January 25, 2023.
  55. Daniele Fanelli, "Why Growing Retractions Are (Mostly) a Good Sign," PLOS Medicine, 2013.
  56. Richard Horton, "Offline: What is medicine's 5 sigma?" The Lancet, April 11, 2015.