To clarify, I was not critiquing the idea that we need to get “superintelligence unleashed on the world” right on the first try; that, of course, I agree with. I was critiquing the more specific idea that we need to get AGI morality/safety right on the first try.
One could compare to ICBM defense systems. The US (and other nations) have developed that tech, and it’s a case where you have to get the deployed product right on the first try. You can’t test it in the real world, but you absolutely can do iterative development in simulation, and that really is the only sensible way to develop such tech. Formal verification is about as useful for AGI safety as it is for testing ICBM defense: not much use at all.
I’m not sure how much we are disagreeing here. I’m not proposing anything like formal verification. I think development in simulation is likely to be an important tool in getting it right the first time you go “live”, but I also think there may be other useful general techniques/tools, and that it could be worth investigating them well in advance of need.
Agreed. In particular I think IRL (Inverse Reinforcement Learning) is likely to turn out to be very important. Also, it is likely that the brain has some clever mechanisms for things like value acquisition or IRL, as well as empathy/altruism, and figuring out those mechanisms could be useful.
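As a rough illustration of what IRL refers to here, below is a minimal sketch of maximum-entropy IRL on a toy chain MDP: recovering a per-state reward from expert demonstrations by matching expected state-visitation counts. Everything in it (the five-state world, the one-hot features, the hyperparameters) is made up for illustration and is not from the discussion above.

```python
import numpy as np

# Toy chain MDP: 5 states in a line, actions 0 = left, 1 = right, 8-step episodes.
n_states, n_actions, horizon = 5, 2, 8
start = 0

# Deterministic transitions: T[s, a] is the next state.
T = np.array([[max(s - 1, 0), min(s + 1, n_states - 1)] for s in range(n_states)])

# "Expert" demonstrations: the expert walks to state 4 (its goal) and stays there.
expert_trajs = [[0, 1, 2, 3, 4, 4, 4, 4] for _ in range(20)]

# Expert feature expectations: with one-hot state features these are just
# average per-state visitation counts.
f_expert = np.zeros(n_states)
for traj in expert_trajs:
    for s in traj:
        f_expert[s] += 1.0
f_expert /= len(expert_trajs)

def soft_policies(r):
    """Finite-horizon soft-optimal (MaxEnt) policies under reward r."""
    V = np.zeros(n_states)
    policies = []
    for _ in range(horizon):                   # backward recursion
        Q = r[:, None] + V[T]                  # Q[s, a] = r(s) + V(next state)
        V = np.log(np.exp(Q).sum(axis=1))      # soft maximum over actions
        policies.append(np.exp(Q - V[:, None]))
    return policies[::-1]                      # policies[t] is for timestep t

def expected_visitations(policies):
    """Expected per-state visitation counts over the horizon."""
    d = np.zeros(n_states)
    d[start] = 1.0
    counts = d.copy()
    for pi in policies[:-1]:                   # forward pass over timesteps
        d_next = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                d_next[T[s, a]] += d[s] * pi[s, a]
        d = d_next
        counts += d
    return counts

# MaxEnt IRL: gradient ascent pushing the learner's visitation counts
# toward the expert's, which recovers a reward that explains the demos.
w = np.zeros(n_states)                         # learned per-state reward
for _ in range(200):
    f_learner = expected_visitations(soft_policies(w))
    w += 0.05 * (f_expert - f_learner)

print("recovered reward (up to shift/scale):", np.round(w, 2))
# State 4 ends up with the largest reward, matching the expert's goal.
```

The point of the sketch is only that IRL inverts the usual RL direction: instead of optimizing behavior given a reward, it infers a reward that makes the observed (here, human) behavior look near-optimal, which is why it comes up in discussions of value acquisition.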