Hmm, I’m not sure how I would distinguish “working backwards from AI doom” from “working forwards from our current situation”. The first part rests on the fact that ML has a particular comparative advantage relative to humans at optimizing for things that are easily measured. That feels very much like working forwards from our current situation.
I guess one aspect that makes it look more like working backwards is that the post is not showing all of the reasoning: there are a lot of other considerations that go into my thinking on this point, and that I’m fairly confident also go into Paul’s thinking, but they aren’t in the post because it would get far too long. For example, one reason for optimism is that whatever would cause an existential failure would also likely cause a non-existential failure before that, which we can learn from. (Though it is controversial both whether we’d get these “warning shots” and whether humanity would actually learn from them.) This reasoning doesn’t explicitly make it into the post, but Paul does respond to this general argument in parts of the post.