Indeed, if this were guaranteed to be the case for all agents… then we wouldn’t have to worry about humans building unaligned agents more powerful than themselves. We’d realize that was a bad idea and simply not do it. Is that… what you’d like to gamble everything on? Or maybe… agents can do foolish things sometimes.