One is decisions involved in designing/improving AI systems, like in the scenario above. The other, which I talked about in an earlier comment, is ethical disasters directly caused by people who are not uncertain, but just wrong. You didn’t reply to that comment, so I’m not sure why you’re unconcerned about this category either.
I replied to your earlier comment.
My overall feeling is still that these are separate problems. We can evaluate a solution to AI control, and we can evaluate philosophical work that improves our understanding of potentially-relevant issues (or metaphilosophical work to automate philosophy).
I am both less pessimistic about philosophical errors doing damage, and more optimistic about my scheme’s ability to do philosophy, but it’s not clear to me that either of those is the real disagreement (since if I imagine caring a lot about philosophy and thinking this scheme didn’t help automate philosophy, I would still feel like we were facing two distinct problems).