Hot take: I would say that most optimization failures I’ve observed in myself and in others (in alignment and elsewhere) boil down to psychological problems.
This is likely true. Note that there is an asymmetry between type 1 and type 2 errors in cooperative optimization, though. The difference between “ok” and “great” is often smaller than the difference between “ok” and “get defected against when I cooperated”. In other words, some things that seem like a prisoners’ dilemma actually ARE risky.
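To make the asymmetry concrete, here is a minimal sketch with invented payoff numbers (purely illustrative, not from the discussion): the gap between "ok" and "great" can be much smaller than the gap between "ok" and being the sucker.

```python
# Hypothetical one-shot prisoners' dilemma payoffs (first player's payoff).
# All numbers are made up to illustrate the asymmetry.
payoffs = {
    ("cooperate", "cooperate"): 3,   # "great": mutual cooperation
    ("defect", "defect"): 2,         # "ok": mutual defection
    ("cooperate", "defect"): -5,     # "sucker": cooperated, got defected against
    ("defect", "cooperate"): 4,      # temptation payoff
}

# Upside of moving from "ok" to "great":
upside = payoffs[("cooperate", "cooperate")] - payoffs[("defect", "defect")]

# Downside of cooperating and getting defected against, relative to "ok":
downside = payoffs[("defect", "defect")] - payoffs[("cooperate", "defect")]

print(upside, downside)  # with these numbers, the downside dwarfs the upside
```

Under these made-up numbers the potential loss from misplaced cooperation (7) is several times the potential gain from successful cooperation (1), which is the sense in which cooperating can genuinely be risky rather than merely suboptimal.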
Cool! I like this as an example of how difficult-to-notice problems generally go unsolved. I’m not sure how serious this one is, though.