Great, let’s talk about whether proposed problems are on their way towards being solved. I much prefer that framing, and I would not have objected so strongly if that’s what you had said. E.g. if you had said “Hey, why don’t we just prompt AutoGPT-5 with lots of corrigibility instructions?”, we could have had a more technical conversation about whether or not that’ll work. The answer is probably no, BUT I do agree that this is looking promising relative to e.g. the alternative world where we train powerful alien agents in various video games and simulations and then try to teach them English. (I say more about this elsewhere in this conversation, for those just tuning in!)