
All science papers must state how confident we should be in them

In this age of widespread mis/disinformation, non-scientists need help to better grasp which claims bear rigorous scrutiny, says Gary Atkinson

October 10, 2024
Image: montage featuring the collapse of the Tacoma Narrows Bridge in Washington state, illustrating "All science papers must declare how confident we should be in them"
Source: Getty Images (edited)

Two engineers are considering a design for an upgrade to a bridge to allow it to withstand stronger winds. They analyse the design independently using the same method and simultaneously estimate that it will enable the bridge to withstand wind speeds of up to 130mph (210km/h): adequate for the local area, where the wind never exceeds 100mph.

However, the more astute of the engineers considers the level of uncertainty of the findings and determines that the value of 130 is accurate only to ±40. This means that we can’t be certain that the bridge would withstand a wind above 90mph (although it might manage up to 170mph). The astute engineer, therefore, advises further analysis, while their more naive colleague approves the design, unknowingly risking calamity.
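The logic of the bridge example can be made concrete in a few lines of code. This is a minimal illustrative sketch, not engineering software: it assumes the ±40 figure is a symmetric error bound, and the function name is invented for this example.

```python
# Sketch of the article's bridge example: a point estimate is only
# meaningful alongside its error bound. Numbers are from the article;
# the function and its name are illustrative assumptions.

def safety_verdict(estimate_mph, uncertainty_mph, max_local_wind_mph):
    """Compare the worst-case wind tolerance against local conditions."""
    worst_case = estimate_mph - uncertainty_mph  # lower bound of the interval
    if worst_case >= max_local_wind_mph:
        return "approve"           # safe even at the bottom of the interval
    return "further analysis"      # interval dips below the requirement

# The naive engineer compares only the point estimate (130 > 100);
# the astute one sees that the interval [90, 170] dips below 100.
print(safety_verdict(130, 40, 100))  # -> further analysis
print(safety_verdict(130, 0, 100))   # -> approve (uncertainty ignored)
```

The same estimate yields opposite verdicts depending on whether the uncertainty is taken into account, which is precisely the point of the anecdote.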

I use a similar story to teach my students about the significance of uncertainty analysis, the scientific process for handling imprecise data. In truth, however, even science and engineering students don’t find the concept especially exciting. Unsurprisingly, then, the public, politicians and the press usually overlook uncertainty completely.

It doesn’t help that when findings are disseminated outside academia, margins of error are often omitted entirely for the sake of brevity and impact. The result is that laypeople might not realise that a paper is reporting only preliminary findings. For example, without scientific training, it might not be apparent that a new medication has a large range of possible efficacies or is useful only to a given demographic.

This state of affairs can’t be allowed to continue. It is true that just as politicians did not enter politics to study uncertainties, most scientists and engineers did not enter their profession to engage in policy formulation or public engagement. However, it is becoming ever more critical in this age of widespread mis/disinformation for academics to help outsiders to better grasp which claims bear rigorous scrutiny.

My proposal is that scientists and science communicators should include a simple, prominent statement about the level of confidence for each public-domain output (paper, press release, oral presentation) they produce.

This might have minimal impact on those deliberately spreading misinformation, who will conveniently overlook the confidence statement if it fails to support their position, but at least it might stop some of the more reputable media from making gross extrapolations from papers that are provisional and exploratory.

The stated confidence level should not be nestled deep inside the core text of a paper. It should be presented at the start and be specifically aimed at general readership – including guidance on how far it is reasonable to extrapolate the findings.

One option is to allow for a free-text statement up to a strict word limit. For example: “The findings of this research attain a high degree of confidence within the remit of the study. However, there are highly restricting assumptions, so widespread adoption of the technique and findings require substantial further research and independent corroboration.” Alternatively, the author might select from a list of options, perhaps akin to the nine levels of proof recognised in US legal standards, ranging from “some evidence” to “beyond reasonable doubt”.

It would be important to stress that lower-confidence papers are still potentially highly valuable. Such contributions would be seen as perfectly valid within the scientific community, and further research would hopefully build on them, either increasing the confidence level or finding pitfalls.

Part of the challenge is that, aside from the inability and unwillingness of politicians and others to respect uncertainty, scientists are, themselves, flawed individuals. As an entity, science has evolved for optimal handling of uncertainty. But in practice, scientists are human with selfish needs like everyone else.

When someone boldly claims high confidence, they are inviting greater scrutiny, provoking others to repeat and corroborate or disprove their findings. Even so, they may be tempted to fraudulently claim high confidence to get a paper published and entice press attention, thus attracting many followers. Past events such as the false claim that the MMR (measles, mumps and rubella) vaccine causes autism suggest that fortune favours those who attract followers irrespective of whether their research is later discredited.

At the same time, even high-confidence papers should not be communicated as “pure fact”, any more than “beyond reasonable doubt” entails that miscarriages of justice are impossible. There should be no shame in perfectly honest research that initially had high confidence being later disproven. Science operates that way.

A contrary problem is that some authors might be too nervous to claim that their work has a high degree of confidence. So perhaps paper reviewers should be involved in the classification of confidence levels. Alternatively, expert panels could label some of the more publicised papers.

None of this is easy, and there are other pitfalls to confidence labelling that I don’t have space to address here. This article is just a conversation starter: we might need several iterations to get the solution right. But if the pandemic taught us anything, it is that it is vital for the confidence of scientific findings to be better articulated to the public. Prominent statements on uncertainty might be at least part of the solution.

Gary A. Atkinson is associate professor and BEng robotics programme leader at the University of the West of England.


Readers' comments (9)

Fascinating thought piece. The idea of a simple rating system, subject to peer review akin to the paper content, feels like a dramatic step forward. Particularly in an era where putting an unreviewed paper into the public sphere on arXiv is common practice. My question would be... Does the lay person understand the difference in reliability between a double-blind, peer-reviewed paper in a leading journal, and a professional-looking but unreviewed paper on arXiv?
Excellent idea to be more honest and open about a paper being published. No shame at all in authors noting their findings are tentative or provisional. Much better than mistakenly claiming gold standard.
I consider this to be an excellent article. What it is essentially saying is that science-related technical papers should adopt a more scientific approach to quantitative reporting of researchers' findings - which seems completely reasonable. The bridge example near the top is a useful one. In engineering, products tend to work, and fail, in performance ranges, or modes of operation. Therefore, specifying their expected performance, or failure, in terms of target values, and expected variation (e.g. a normal distribution) around those values, is the more scientific approach and one that would be expected to lead to more effective utilisation and avoidance of failure. The author's suggested approach should lead to more confidence in claims made in technical papers - which would ultimately be to the benefit of everyone.
In biology it is almost impossible to give a confidence assessment of the reliability of the observations made in a biological study. But in a good paper, conscientious scientists write a critical discussion that weighs the pros and cons of the paper and sets it within the context of previous studies. It is not possible for a lay person to read and understand this type of paper. However, good journalists and university press officers with scientific training should be able to read, understand and transmit the information in a way a lay person can understand. Adding more work to unpaid reviewers is an unacceptable solution. Attempting to boil down a thoughtful and considered discussion to a boilerplate statement seems lazy. Read the discussion, evaluate the data and come to a conclusion about the work. That's what journal clubs are supposed to teach people.
I think Dr. Atkinson is onto something here. The findings in highly technical academic papers can be difficult for the average person to interpret. One example that comes to mind is the significance of advancements in medical research, such as breakthroughs in cancer treatment. A problem arises when the mainstream media extrapolates from early, tentative findings or research with limited patient application, perhaps only relevant to a specific type of cancer. The situation is made worse when media reports are based on unreviewed papers that contain exaggerated claims. This can have a hidden impact on more cautious researchers, whose papers and grant applications may be rejected for appearing 'behind the curve' in the eyes of reviewers and assessors misled by these overblown claims—whether from the researchers themselves or through media spin. It might help if journal reviewers, especially associate editors, received some form of compensation for their work, as they often seem to be working for free. I’ve been doing that for years myself. Recently, an industrial entrepreneur asked me why I do such work for free, and I didn’t really have a good answer.
This is a really interesting idea. I think one area where this would be particularly useful is in medicine. With overstretched medical systems and an increase in patients taking charge of their own health, a lot of chronically ill people do their own research into managing their conditions. A simple confidence rating could support those people in knowing which research to pay more attention to. It would also allow researchers to publish suggestive results under a low confidence rating in order to get those ideas out into the academic sphere for others to expand upon, without fear of ridicule. I wonder what other ways we could formalise and modernise academic research publishing.
In my experience policy makers are short of time and can be prone to turning evidence-led policy into policy-led evidence. A short statement, or a RAG rating, on the confidence of the authors of a paper might help policy makers engage with the evidence in a more efficient way or, if they choose not to, allow their political opponents to challenge them in an effective way.
I'm with you, Gary! Even beyond science, so many important discussions degrade into polarised debates because we're bad at articulating uncertainty.
Good idea, although as one of the commenters has pointed out, this might not be easy in biological/psychological research - or perhaps they will more frequently have a low confidence score. Perhaps, instead of a "confidence" score, it could be something more akin to technology readiness levels, that gives an indication of how well developed/robust the experiments and data are?