the medical records should automatically include Bayesian probability data on symptoms to help nurses recognize when the diagnosis doesn’t fit
Medical expert systems are getting pretty good; I don’t see why you wouldn’t just jump straight to an auto-updated list of most likely diagnoses (generated by a narrow AI) given the current list of symptoms and test results.
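For concreteness, here is a minimal sketch (in Python, with made-up prevalence, sensitivity, and specificity) of the single-finding Bayesian update that the quoted suggestion and this comment are gesturing at:

    # One Bayesian update: how much should a single positive finding shift the diagnosis?
    # All three numbers below are invented for illustration.
    prevalence = 0.01      # P(disease) before any findings
    sensitivity = 0.90     # P(finding present | disease)
    specificity = 0.95     # P(finding absent | no disease)

    p_finding = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    posterior = sensitivity * prevalence / p_finding
    print(f"P(disease | finding) = {posterior:.2f}")   # about 0.15

Even a fairly specific finding leaves the posterior low when the prior is low, which is exactly the sort of number that could flag a diagnosis that doesn’t fit.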
Most patient cases are so easy and common that filling forms for an AI would greatly slow the system down. AI could be useful when the diagnosis isn’t clear, however. A sufficiently smart AI could pick up the relevant data from the notes, but usually the picture the diagnostician has in their mind is much more complete than any notes they make.
Note that I’m looking at this from a perspective where implementing theoretically smart systems has usually done nothing but increase my workload.
Most patient cases are so easy and common that filling forms for an AI would greatly slow the system down.
I am assuming you’re not filling out any forms specially for the AI—just that the record-keeping system is computerized and the AI has access to it. In trivial cases the AI won’t have much data (e.g. no fever, normal blood pressure, complains of a runny nose and cough, that’s it) and its diagnoses will be low-credence, but that’s fine, you as a doctor won’t need its assistance in those cases.
The AI would need to know natural language to be of any use, or else it will miss most of the relevant data. I suppose Watson is pretty close to that, and I have read that it’s being tested in some hospitals. I wonder how this is implemented. I suspect doctors carry a lot more data in their heads than is readily apparent and much of this data will never make it to their notes and thus to the computerized records.
Taking a history, evaluating its reliability, and using the senses to observe patients are things machines won’t be able to do for quite some time. On top of this I roughly know hundreds of patients now that I will see time and again and this helps immensely when judging their most acute presentations. By this I don’t mean I know them as lists of symptoms; I know their personalities too, and how this affects how they tell their stories and how seriously they take their symptoms, from minor complaints to major problems. I could never take the approach of jumping from hospital to hospital now that I’ve experienced this firsthand.
The AI would need to know natural language to be of any use, or else it will miss most of the relevant data. I suppose Watson is pretty close to that, and I have read that it’s being tested in some hospitals. I wonder how this is implemented. I suspect doctors carry a lot more data in their heads than is readily apparent and much of this data will never make it to their notes and thus to the computerized records.
This is the reason Watson is a game-changer, despite expert prediction systems (using linear regression!) performing at the level of expert humans for ~50 years. Doctors may carry a lot of information in their heads, but I’ve yet to meet a person that’s able to mentally invert matrices of non-trivial size, which helps quite a bit with determining the underlying structure of the data and how best to use it.
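As a rough illustration of the matrix point, fitting a linear prediction rule over a handful of measurements is a one-liner for a computer and hopeless to do mentally; the data here are synthetic, not from any real clinical system:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))                      # 200 patients, 4 measurements each
    true_w = np.array([0.5, -1.0, 2.0, 0.0])
    y = X @ true_w + rng.normal(scale=0.5, size=200)   # synthetic outcome score

    # Ordinary least squares via the normal equations: solve (X'X) w = X'y
    w_hat = np.linalg.solve(X.T @ X, X.T @ y)
    print(np.round(w_hat, 2))                          # roughly recovers [0.5, -1.0, 2.0, 0.0]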
Taking a history, evaluating its reliability, and using the senses to observe patients are things machines won’t be able to do for quite some time.
I think machines have several comparative advantages here. An AI with basic conversational functions can take a history, and is better at evaluating some parts of the reliability and worse at others. It can compare with ‘other physicians’ more easily, or check public records, but probably can’t determine whether or not it’s a coherent narrative as easily (“What is Toronto?”). A webcam can measure pulse rate just by looking, and so I suspect it’ll be about as good at detecting deflection and lying as the average doctor. (I don’t remember seeing doctors as being particularly good at lie-detection, but it’s been a while since I’ve read any of the lie-detection literature.)
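The webcam trick is usually remote photoplethysmography: average skin brightness fluctuates slightly with each heartbeat, and the dominant frequency of that fluctuation is the pulse. A toy sketch with a synthetic brightness signal standing in for real video frames:

    import numpy as np

    fps = 30.0
    t = np.arange(0, 20, 1 / fps)                      # 20 seconds of "video"
    pulse_hz = 72 / 60.0                               # true rate: 72 bpm
    brightness = 0.02 * np.sin(2 * np.pi * pulse_hz * t)                    # pulse component
    brightness += np.random.default_rng(1).normal(scale=0.01, size=t.size)  # sensor noise

    spectrum = np.abs(np.fft.rfft(brightness - brightness.mean()))
    freqs = np.fft.rfftfreq(brightness.size, d=1 / fps)
    band = (freqs > 0.7) & (freqs < 3.0)               # plausible heart rates: 42-180 bpm
    bpm = 60 * freqs[band][np.argmax(spectrum[band])]
    print(round(float(bpm)))                           # prints about 72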
I could never take the approach of jumping from hospital to hospital now that I’ve experienced this firsthand.
Note that if the AI is sufficiently broadly used (here I’m imagining, say, the NHS in the UK using just one) then everyone will always have access to a doctor that’s known them as long as they’ve been in the system.
despite expert prediction systems (using linear regression!) performing at the level of expert humans for ~50 years.
Is this because using them is incredibly slow or something else?
A webcam can measure pulse rate just by looking, and so I suspect it’ll be about as good at detecting deflection and lying as the average doctor. (I don’t remember seeing doctors as being particularly good at lie-detection, but it’s been a while since I’ve read any of the lie-detection literature.)
Lies make no sense medically, or make too much sense. Once I’ve spotted a few lies, I find that many of them fit a stereotypical pattern many patients use, even when there aren’t any other clues. I don’t need to rely on body language much.
People also misremember things, or have a helpful relative misremember things for them, or have home care providers feed them their clueless preliminary diagnoses. People who don’t remember fill in the gap with something they think is plausible. Some people are also psychotic, or don’t even remember what year it is or why they came in the first place. Some people treat every little ache like it’s the end of the world, and some don’t seem to care if their leg’s missing.
I think even an independent AI could make up for many of its faults simply by being more accurate at interpreting the records and current test results.
I hope that when an AI can do my job I don’t need a job anymore :)
Is this because using them is incredibly slow or something else?
My understanding is that the ~4 measurements the system would use as inputs were typically measured by the doctor, and by the time the doctor had collected the data they had simultaneously come up with their own diagnosis. Typing the observations into the computer to get the same level of accuracy (or a few extra percentage points) rarely seemed worth it, and turning the doctor from a diagnostician to a tech was, to put it lightly, not popular with doctors. :P
There are other arguments which would take a long time to go into. One is “but what about X?”, where the linear regression wouldn’t take into account some other variable that the human could take into account, and so the human would want an override option. But, as one might expect, the only way for the regression to outperform the human is for the regression to be right more often than not when the two of them disagree, and humans are unfortunately not very good at determining whether or not the case in front of them is a special case where an override will increase accuracy or a normal case where an override will decrease accuracy. Here’s probably the best place to start if you’re interested in reading more.
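The disagreement point can be made concrete with a tiny simulation; the accuracies are made up, and human and model errors are treated as independent, which real cases won’t be:

    import random

    random.seed(0)
    N = 100_000
    model_right = [random.random() < 0.80 for _ in range(N)]   # model: 80% accurate
    human_right = [random.random() < 0.70 for _ in range(N)]   # human: 70% accurate

    disagreements = [i for i in range(N) if model_right[i] != human_right[i]]
    human_wins = sum(human_right[i] for i in disagreements) / len(disagreements)

    print(f"human right on disagreements:       {human_wins:.2f}")            # about 0.37
    print(f"accuracy if human always overrides: {sum(human_right) / N:.2f}")  # about 0.70
    print(f"accuracy if model is always used:   {sum(model_right) / N:.2f}")  # about 0.80

Overrides only pay when the human is right more often than not in precisely the cases where the two disagree; in this toy setup they are not, so overriding costs accuracy.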
A rather limited subset of natural language; I think it’s a surmountable problem.
I suspect doctors carry a lot more data in their heads than is readily apparent … I roughly know hundreds of patients now that I will see time and again and this helps immensely when judging their most acute presentations.
All true, which is why I think a well-designed diagnostic AI will work in partnership with a doctor instead of replacing him.
I agree with you, but I fear that makes for a boring conversation :)
The language is already relatively standardized and I suppose you could standardize it more to make it easier for the AI. I suspect any attempt to mold the system for an AI would meet heavy resistance however.
Largely for the same reasons that weather forecasting still involves human meteorologists and the draft in baseball still includes human scouts: a system that integrates both human and automated reasoning produces better outcomes, because human beings can see patterns a lot better than computers can.
Also, we would be well-advised to avoid repeating the mistake made by the commercial-aviation industry, which seems to have fostered such extreme dependence on the automated system that many ‘pilots’ don’t know how to fly a plane. A system which automates almost all diagnoses would do that.
I am not saying this narrow AI should be given direct control of IV drips :-/
I am saying that a doctor, when looking at a patient’s chart, should be able to see what the expert system considers to be the most likely diagnoses and then the doctor can accept one, or ignore them all, or order more tests, or do whatever she wants.
A system which automates almost all diagnoses would do that.
No, I don’t think so because even if you rely on an automated diagnosis you still have to treat the patient.
Even assuming that the machine would not be modified to give treatment recommendations, that wouldn’t change the effect I’m concerned about. If the doctor is accustomed to the machine giving the correct diagnosis for every patient, they’ll stop remembering how to diagnose disease and instead remember how to use the machine. It’s called “transactive memory”.
I’m not arguing against a machine with a button on it that says, “Search for conditions matching recorded symptoms”. I’m not arguing against a machine that has automated alerts about certain low-probability risks—if there had been a box that noted the conjunction of “from Liberia” and “temperature spiking to 103 Fahrenheit” in Thomas Eric Duncan during his first hospital visit, there’d probably only be one confirmed case of Ebola in the US instead of three, and Duncan might be alive today. But no automated system can be perfectly reliable, and I want doctors who are accustomed to doing the job themselves on the case whenever the system spits out, “No diagnosis found”.
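The Duncan-style alert is the easy part to automate: a hard-coded conjunction rule checked against the chart rather than a learned model. A sketch (the field names and threshold here are hypothetical):

    def check_alerts(chart):
        alerts = []
        travel = chart.get("travel_history", "")
        temp_f = chart.get("temp_f", 0.0)
        # Conjunction rule: recent travel from Liberia AND high fever.
        if "Liberia" in travel and temp_f >= 103.0:
            alerts.append("Possible viral hemorrhagic fever exposure: isolate and escalate.")
        return alerts

    print(check_alerts({"travel_history": "recently arrived from Liberia", "temp_f": 103.0}))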
You are using the wrong yardstick. Ain’t no thing is perfectly reliable. What matters is whether an automated system will be more reliable than the alternative—human doctors.
Commercial aviation has a pretty good safety record while relying on autopilots. Are you quite sure that without the autopilot the safety record would be better?
whenever the system spits out, “No diagnosis found”.
And why do you think a doctor will do better in this case?
I was going to say “doctors don’t have the option of not picking the diagnosis”, but that’s actually not true; they just don’t have the option of not picking a treatment. I’ve had plenty of patients who were “symptom X not yet diagnosed” and the treatment is basically supportive, “don’t let them die and try to notice if they get worse, while we figure this out.” I suspect that often it never gets figured out; the patient gets better and they go home. (Less so in the ICU, because it’s higher stakes and there’s more of an attitude of “do ALL the tests!”)
they just don’t have the option of not picking a treatment.
They do: they call the problem “psychosomatic” and send you to therapy, or give you some echinacea “to support your immune system”, or prescribe “something homeopathic”, or whatever… And in very rare cases, especially honest doctors may even admit that they do not have any idea what to do.
Because the cases where the doctor is stumped are not uniformly the cases where the computer is stumped. The computer might be stumped because a programmer made a typo three weeks ago entering the list of symptoms for diphtheria, because a nurse recorded the patient’s hiccups as coughs, because the patient is a professional athlete whose resting pulse should be three standard deviations slower than the mean … a doctor won’t be perfectly reliable either, but like a professional scout who can say, “His college batting average is .400 because there aren’t many good curveball pitchers in the league this year”, a doctor can detect low-prior confounding factors a lot faster than a computer can.
Well, let’s imagine a system which actually is—and that might be a stretch—intelligently designed.
This means it doesn’t say “I diagnose this patient with X”. It says “Here is a list of conditions along with their probabilities”. It also doesn’t say “No diagnosis found”—it says “Here’s a list of conditions along with their probabilities, it’s just that the top 20 conditions all have probabilities between 2% and 6%”.
It also says things like “The best way to make the diagnosis more specific would be to run test A, then test B, and if it came back in this particular range, then test C”.
A doctor might ask it “What about disease Y?” and the expert system will answer “Its probability is such-and-such; it’s not zero because of symptoms Q and P, but it’s not high because test A came back negative and test B showed results in this range. If you want to get more certain with respect to disease Y, use test C.”
And there would probably be a button which says “Explain”; pressing it will show precisely what leads to the probability of disease X being what it is, and the doctor should be able to poke around and ask things like “What happens if we change these coughs to hiccups?”
An intelligently designed expert system often does not replace the specialist—it supports her, allows her to interact with it, ask questions, refine queries, etc.
If you have a patient with multiple nonspecific symptoms who takes a dozen different medications every day, a doctor cannot properly evaluate all the probabilities and interactions in her head. But an expert system can. It works best as a teammate of a human, not as something which just tells her what to do.
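A minimal sketch of that kind of interaction, assuming a naive-Bayes-style model with invented conditions and probabilities: a ranked list, an “explain” view showing which findings drive a given condition, and a what-if query.

    from math import prod

    PRIOR = {"condition X": 0.03, "condition Y": 0.01, "viral infection": 0.30}
    LIK = {   # P(finding | condition), hypothetical numbers
        "condition X":     {"cough": 0.6, "fever": 0.7, "hiccups": 0.05},
        "condition Y":     {"cough": 0.2, "fever": 0.4, "hiccups": 0.60},
        "viral infection": {"cough": 0.7, "fever": 0.5, "hiccups": 0.05},
    }

    def posteriors(findings):
        raw = {c: PRIOR[c] * prod(LIK[c].get(f, 0.1) for f in findings) for c in PRIOR}
        z = sum(raw.values())
        return sorted(((c, p / z) for c, p in raw.items()), key=lambda cp: -cp[1])

    def explain(condition, findings):
        # Which recorded findings push this condition's probability up or down.
        return {f: LIK[condition].get(f, 0.1) for f in findings}

    print(posteriors(["cough", "fever"]))               # ranked conditions with probabilities
    print(explain("condition Y", ["cough", "fever"]))   # "What about disease Y?"
    print(posteriors(["hiccups", "fever"]))             # "What if the coughs were hiccups?"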
Well, let’s imagine a system which actually is—and that might be a stretch—intelligently designed.
Us? I’m a mechanical engineer. I haven’t even read The Checklist Manifesto. I am manifestly unqualified either to design a user interface or to design a system for automated diagnosis of disease—and, as decades of professional failure have shown, neither of these is a task to be lightly ventured upon by dilettantes. The possible errors are simply too numerous and subtle for me to be assured of avoiding them. Case in point: prior to reading that article about Air France Flight 447, it never occurred to me that automation had allowed some pilots to completely forget how to fly a plane.
The details of automation are much less important to me than the ability of people like Swimmer963 to be a part of the decision-making process. Their position grants them a much better view of what’s going on with one particular patient than a doctor who reads a chart once a day or a computer programmer who writes software intended to read billions of charts over its operational lifespan. The system they are incorporated in should take advantage of that.