And many of the solutions EMS uses can be adapted for AI alignment.
This is certainly true for AIs which are not as smart as professional humans, except within their narrow application area. For that situation, the EMT analogy is great.
But when we think about the coming superintelligence and extend your analogy, we are talking about systems which understand medicine (and everything else) much better than human medical directors. The system knows how to cure everything that ails the patient, it can rejuvenate the patient, and… it can try all kinds of interesting medical interventions and enhancements on the patient… and those interventions and enhancements can change the patient in all kinds of interesting ways… and they can have all kinds of interesting side-effects in the environment the patient shares with everyone else...
That’s, roughly speaking, the magnitude of AI safety problems we need to be ready to deal with.
This is definitely a spot where the comparison breaks down a bit. However, it still holds somewhat in the human context, and maybe that generalizes.
I worked as a lifeguard for a number of years (even lower on the totem pole than EMTs, with a more limited scope of practice). I am, to put it bluntly, pretty damn smart, and could easily have found optimizations: areas where I could have exceeded my scope of practice with positive outcomes, if I'd had access to the tools, just by reading the manuals for EMTs, paramedics, or nurses. For example, I learned how to intubate someone, and how to do an emergency tracheotomy, from a friend who had more training. I'm also only really inclined to follow rules to the extent that they make sense to me and the enforcing authorities in question can impose meaningful consequences on me. But I essentially never went outside SOP, certainly not at work. Why?
Well, one reason was legal risk, as mentioned. If something went wrong at work and someone was (further) injured under my care, my legal protection was entirely dependent on my operating within that defined scope of practice. For a smart young adult without a lot of money for lawyers, that's a fairly good incentive not to push boundaries too much, especially given the gravity of emergency situations and the consequences for guessing wrong, even if you are smart and confident in your ability to outdo SOP.
Second, the limits were soft-enforced by access to equipment and medicine. The tools I had access to at my workplace were the ones I could officially use, and I did not have easy access to any tools or medicines which would have been outside SOP to administer (or to advise someone else to administer). This was deliberate.
Third, emergency situations sharply limit your effective context window and your ability to deviate. Someone is dying in front of you, large chunks of you are likely trying to panic (especially if you haven't been put in this sort of situation before), and you need to act immediately, calmly, and correctly. What comes to mind most readily? The series of if-then statements that got drilled into you during training. It's been most of a decade since my last recert, and I can still basically autopilot my way through a medical emergency based on that training. It saved a friend and coworker's life when he had a heart attack and I was the only one in the immediate area.
So how do we apply that? I think the first two have obvious analogies in terms of measures that actually impose hardware or software limits on behaviour. Obviously, for a smart enough system the ability to enforce such restrictions is limited, and even existing LLMs can be pushed outside of training parameters by clever prompting, but it's clear that such means can alter model behaviour up to a point. Modifying the training dataset is perhaps another analogous option, and arguably a more powerful one if it can be done well, because the pathways developed at that stage will always have an impact on the outputs, no matter the restrictions, RLHF, or other means of guiding a mostly-trained model. Not giving the system tools that let it easily go outside the set scope will, again, work up to a point.

The third one, I think, might be most useful. Outside of the hardest of hard takeoff scenarios, it will be difficult for any intelligence to do a great deal of damage if it is only given a short lifetime in which to do it, while also being asked to do the thing it was very carefully trained for. LLMs already effectively work this way, but this suggests that as things advance we should be more and more wary of allowing long-running potential agents with anything like run-to-run memory. This obviously greatly restricts what can be done with artificial intelligence (and has somewhat horrifying moral implications if we do instantiate sapient intelligences in that manner), but absent a more complete solution to the alignment problem, it would go a long way toward reducing the scope of possible negative outcomes.
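To make the "scope of practice" analogy concrete, here is a minimal sketch of what those two enforcement ideas could look like around an LLM-based agent: a tool allowlist (the equipment you're issued) plus a short, bounded run with no memory carried between runs. All the names here (BoundedAgent, TOOL_REGISTRY, the stub tools) are hypothetical illustrations, not any real framework's API.

```python
# Hypothetical sketch: "scope of practice" enforcement for an LLM-based agent.
# Names (BoundedAgent, TOOL_REGISTRY) are illustrative, not a real library API.

from dataclasses import dataclass, field

# Stub tools standing in for whatever capabilities the system could reach.
TOOL_REGISTRY = {
    "search_docs": lambda query: f"(stub) results for {query!r}",
    "summarize":   lambda text: f"(stub) summary of {len(text)} chars",
}

@dataclass
class BoundedAgent:
    allowed_tools: set[str]          # analogue of "the equipment you're issued"
    max_steps: int = 10              # analogue of a short, bounded lifetime
    _history: list[str] = field(default_factory=list)  # wiped every run

    def use_tool(self, name: str, arg: str) -> str:
        # Hard refusal for anything outside the defined scope.
        if name not in self.allowed_tools:
            return f"REFUSED: '{name}' is outside this agent's scope of practice"
        return TOOL_REGISTRY[name](arg)

    def run(self, task: str) -> list[str]:
        self._history.clear()        # no run-to-run memory
        for step in range(self.max_steps):
            # A real system would call the model here; this stub stops at once.
            self._history.append(f"step {step}: working on {task!r}")
            break
        return list(self._history)

agent = BoundedAgent(allowed_tools={"search_docs"})
print(agent.use_tool("summarize", "some text"))  # refused: not on the allowlist
print(agent.run("triage incoming reports"))      # bounded run, history discarded next time
```

None of this restrains a sufficiently capable system on its own, of course; the point is just that the same "don't hand out equipment beyond the defined scope, and keep each episode short and memoryless" pattern translates directly into code.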
Absolutely! I have less experience on the “figuring out what interventions are appropriate” side of the medical system, but I know of several safety measures they employ that we can adapt for AI safety.
For example, no actor is unilaterally permitted to think up a novel intervention and start implementing it. They need to convince an institutional review board that the intervention has merit, and that a clinical trial can be performed safely and ethically. Then the intervention needs to be approved by a bunch of bureaucracies like the FDA. And then medical directors can start incorporating that intervention into their protocols.
The AI design paradigm that I'm currently most in favor of, and that I think is compatible with the EMS Agenda 2050, is Drexler's Comprehensive AI Services (CAIS), in which a bunch of narrow AI systems are safely employed to do specific, bounded tasks. A superintelligent system might come up with amazing novel interventions, and collaborate with humans and other superintelligent systems to design a clinical trial for testing them. Every party along the path from invention to deployment can benefit from AI systems helping them perform their roles more safely and effectively.
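As a rough illustration of that CAIS-style division of labour, here is a hedged sketch: narrow services with explicitly bounded remits, and an IRB/FDA-style human gate between "propose a novel intervention" and "change the protocols." Everything here (NarrowService, review_board_approves, the service names) is hypothetical and only meant to show the shape of the pattern, not how any actual system is built.

```python
# Hypothetical sketch of the CAIS-style pattern described above: narrow services
# with bounded task types, plus a human approval gate before any novel
# "intervention" reaches deployment. All names are illustrative.

from typing import Callable

class NarrowService:
    def __init__(self, name: str, task_types: set[str], fn: Callable[[str], str]):
        self.name = name
        self.task_types = task_types   # the service's bounded remit
        self.fn = fn

    def handle(self, task_type: str, payload: str) -> str:
        # Refuse anything outside the service's declared scope.
        if task_type not in self.task_types:
            raise ValueError(f"{self.name} does not handle '{task_type}' tasks")
        return self.fn(payload)

def review_board_approves(proposal: str) -> bool:
    # Stand-in for the IRB/FDA-style human gate; always defers to humans here.
    print(f"Proposal sent for human review: {proposal}")
    return False   # nothing deploys without explicit human sign-off

proposer = NarrowService("protocol_designer", {"propose_trial"},
                         lambda p: f"draft trial design for {p}")
deployer = NarrowService("protocol_updater", {"update_protocols"},
                         lambda p: f"protocols updated with {p}")

draft = proposer.handle("propose_trial", "novel intervention X")
if review_board_approves(draft):
    deployer.handle("update_protocols", draft)
```

The design choice the sketch is trying to surface is that no single service both invents and deploys: the path from proposal to protocol change runs through a separate, human-controlled step, mirroring the review-board-then-FDA-then-medical-director pipeline above.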