Ambiguities or the issues we face with AI in medicine

Abstract

With AI gradually becoming more relevant to healthcare, we are running into a diverse set of issues related to ambiguous medical data, expert disagreement, and biased outcomes. For AI to make accurate medical predictions, significant improvements in data collection, standardization, and ethical oversight are necessary, and these come with their own set of additional challenges. In this thought piece I will lay out what ambiguities are, how they arise, and why they are so problematic in the medical AI context.

Preface

This is one text in a collection of essays and thought pieces that lie at the intersection of AI and other topics.

I’d greatly appreciate receiving ideas, feedback, and opinions, as well as engaging in meaningful discussions about the topics I cover in this collection of essays, which I plan to publish over the next weeks and months.

I am aware that they might contain some typos, punctuation mistakes, or grammatical errors, as I am just an individual writing this in my free time outside of work.

I hope to engage with a community of people who share my passion for AI, and I’d greatly appreciate other perspectives on these topics, as these texts are based on my own understanding of AI and the world, along with some additional thoughts and notions. I might not explain everything in full, or I might explain things in a way that makes my chain of thought hard to follow if you come from a different domain, so if any section is not explained sufficiently, feel free to reach out. To avoid burdening you with too much to read, these texts also contain some oversimplifications.

Please don’t see this as something set in stone. I am open to hearing different perspectives and broadening my view on this matter, as I am sure that there are points of view out there that are so far out of my field of view that I am unable to consider them as of now.

Introduction—Ambiguities in the real world

We primarily define our world through various forms of communication, whether verbal, visual, or written. These forms of communication are how we express ourselves and our feelings, and how we share our experiences with others. Without words, it would be incredibly difficult to convey concepts or share our thoughts. For instance, the term “ghosting” didn’t exist just a few years ago, but the behavior it describes certainly did. Even though the phenomenon was happening, it wasn’t widely recognized or labeled, which made it harder to talk about. Without words, we lack both the awareness and the means to effectively express many things that happen in our daily lives.

However, words are inherently ambiguous. They often represent concepts that people interpret differently because we all perceive the world uniquely. Many of the things we try to describe are abstract representations of reality.

Let’s start with an easy-to-grasp concept: take the color “blue,” for example. Even concepts that seem simple, like basic colors, can be fluid. Most people might agree that a certain shade is blue, but as we move toward different shades, fewer people will agree on whether those shades can still be classified as “blue.” This is where ambiguity begins.

[Image: a grid of different shades of blue merging into green]

Image (AI-generated) depicting different shades of blue. For some hues in this image, 95 out of 100 people would agree that they are blue; for other hues, just 23 out of 100 people would claim the same.

Ambiguities in the medical context

Ambiguity becomes problematic when dealing with AI. Ideally, we want the input to be as clear and unambiguous as possible to ensure accurate processing. However, some fields—like medicine—are full of ambiguous definitions. For example, where exactly does sickness begin, and health end? Simplifying these distinctions is difficult. In medicine, we rely on panels of experts to establish what is known as the “gold standard” for specific definitions. But even these experts may disagree when faced with ambiguous cases. Additionally, the gold standard itself can change over time as new diagnostic methods are developed, requiring constant updates and improvements.

Ambiguity in data and definitions is a central problem when dealing with AI in the medical domain. Medicine is a high-stakes environment where accuracy is critical because, at the end of the day, it’s real human lives we are dealing with. When input data is ambiguous, there is room for multiple interpretations, which makes AI models prone to errors on new data and can make their outcomes unreliable in general.

Let’s consider a more concrete example: A very typical use case of AI in medicine lies in diagnosing radiographs. For the sake of simplicity my example will be focusing on dental radiographs. Most people are somewhat familiar with dentistry, where dentists use X-rays to determine the presence of cavities. Even among experts, there can be disagreements—some may argue about the size of the cavity, while others may debate whether there is a cavity at all. To make matters worse, different dentists might recommend entirely different treatments for the same patient based on the same X-ray. We run into further ambiguities at the pixel level when segmenting lesions on radiographs to train AI models. To train an AI for medical diagnostics, we need to define a “ground truth.” But how do we do this when experts themselves disagree on where a disease begins, where it ends, and what the best treatment is? How can we identify the most accurate answer if there is no consensus among the experts?
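To make the pixel-level disagreement a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three annotators, their tiny 4x4 masks, and the choice of the Dice coefficient and a simple majority vote are not taken from any real dataset or from a specific annotation protocol. It only shows one common way disagreement can be quantified and one possible way a “ground truth” could be derived from it.

```python
from itertools import combinations

# Hypothetical 4x4 binary masks (1 = pixel marked as cavity) drawn by three
# imaginary dentists on the same radiograph. The values are made up.
masks = {
    "dentist_a": [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ],
    "dentist_b": [
        [0, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 0, 0],
    ],
    "dentist_c": [
        [0, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ],
}


def flatten(mask):
    """Turn a 2D mask into a flat list of pixels."""
    return [pixel for row in mask for pixel in row]


def dice(mask_a, mask_b):
    """Dice coefficient: 2 * overlap / (size of A + size of B)."""
    a, b = flatten(mask_a), flatten(mask_b)
    overlap = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * overlap / total


def majority_vote(all_masks):
    """Mark a pixel as lesion if more than half of the annotators marked it."""
    n = len(all_masks)
    rows, cols = len(all_masks[0]), len(all_masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in all_masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Agreement for every pair of annotators.
pairwise = {
    (a, b): dice(masks[a], masks[b]) for a, b in combinations(masks, 2)
}
# One possible "ground truth": the majority opinion per pixel.
consensus = majority_vote(list(masks.values()))

for pair, score in pairwise.items():
    print(pair, round(score, 2))
print(consensus)
```

In this toy case the pairwise scores range from roughly 0.29 to 0.8, and the majority vote happens to reproduce one dentist’s mask exactly. That is the uncomfortable part: a consensus label can hide how much the experts actually disagreed, and a different aggregation rule would produce a different “truth.”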

This raises the question: Can we build reliable AI models for medicine under these circumstances, and if so, how do we go about doing this?

Further Implications of AI in Medicine

When thinking more deeply about the implications of AI within medicine, additional complexities arise. While we aim to create highly accurate models, we also want to cross-link patient data with other patients’ data to identify broader patterns. The more data we have, the better we can use AI to find trends in larger populations. For example, we might notice a rise in certain diseases within a specific demographic or region, or we could better predict the causes of specific cancers. AI might even help us link seemingly unrelated observations, such as a specific drug taken for a bowel disease and a lowered risk of osteoporosis or Alzheimer’s later in life. These connections could go unnoticed without AI, as they span different medical domains.

However, the prerequisites are clear. We need data, and massive amounts of it, which comes with serious ethical implications. How do we collect the data while ensuring patient privacy? How do we prevent biases from creeping into the data, which could lead to maladaptive algorithms that negatively impact patient outcomes? What will happen if the patterns that emerge from the data can be used to make extremely accurate predictions? Who might be interested in using these? Insurers? What happens when an AI model predicts that a patient’s hospital stay will be longer than average, and this prediction is used by insurance companies to increase costs for the patient? Will the patients of the future potentially need to foot a higher bill due to AI’s predictive algorithms?

Another limitation is that AI can only process the data it is given; it cannot take into account what we don’t feed it. If there’s a link between a behavior (B) and a pathology (P) but we don’t properly define what P looks like, the AI will run into issues. Similar ambiguities arise when trying to define behavior B. For instance, if we’re tracking sugar consumption, how do we define “excessive”? Should we correlate sugar intake with the development of certain diseases to find out how much is too much? Is anything above 30 grams per day the threshold for concern? If so, how do we measure this accurately? Patients are unlikely to track their sugar intake down to the gram. The data we have in medicine is riddled with inaccuracies and gaps, and the data we do have might still be insufficient, as it is either too sparse or too specific to draw broad conclusions. Even in the medical literature, contradictions are common, with experts arriving at vastly different conclusions, sometimes due to funding biases or personal beliefs. I am aware that meta-analyses exist to address some of these issues, but there is still plenty of ambiguity within this field. It’s hard to train models to make reliable predictions if we don’t know what is actually true.
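The sugar example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 30 gram cutoff is the example figure from the paragraph above and not a clinical recommendation, the reported intakes are invented, and the assumed reporting error of 5 grams is a made-up number. The sketch checks which patients would receive a different label (“excessive” or not) if their true intake sat anywhere within that error range.

```python
# Example cutoff from the text above; not a clinical recommendation.
THRESHOLD_G = 30.0
# Assumed (hypothetical) uncertainty of a self-reported value, in grams per day.
ASSUMED_ERROR_G = 5.0


def is_excessive(grams_per_day, threshold=THRESHOLD_G):
    """Label an intake as excessive if it lies above the threshold."""
    return grams_per_day > threshold


def label_is_unstable(reported, error, threshold=THRESHOLD_G):
    """True if the range [reported - error, reported + error] straddles the threshold."""
    low_label = is_excessive(reported - error, threshold)
    high_label = is_excessive(reported + error, threshold)
    return low_label != high_label


# Invented self-reported intakes in grams per day.
reported_intakes = [12.0, 27.0, 31.0, 34.0, 55.0]

unstable = [
    grams for grams in reported_intakes
    if label_is_unstable(grams, ASSUMED_ERROR_G)
]
print(unstable)
```

With these made-up numbers, three of the five patients (27, 31, and 34 grams) could land on either side of the cutoff. A model trained on such labels would be learning from a definition of behavior B that is unstable for a large share of its training examples.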

Moving Forward

Taking all this into consideration, AI in medicine must be used in a careful and considered manner. AI in healthcare is an evolving field, and only a select group of experts currently possesses both the technical and the medical knowledge to fully grasp its utility and limitations. However, the field is growing, and there is hope that, in the coming years, more interdisciplinary teams will form to solve these problems. Personally, I believe there are many untapped areas where medical AI applications remain to be explored, and I’m looking forward to what is to come in the next decade.
