I first want to signal-boost Mauricio’s comment.
My experience reading the post was that I kinda nodded along without explicitly identifying and interrogating cruxes. I’m glad that Mauricio has pointed out the crux of “how likely is human civilization to value suffering/torture?” Other cruxes include: “assuming some expectation about how much humans value suffering, how likely are we to get a world with lots of suffering, conditional on aligned AI?”, “who is in control if we get aligned AI?”, and “how good is the good that could come from aligned AI, and how likely is it?”
In effect, this post seems to argue: “because humans have a history of producing lots of suffering, getting an AI aligned to human intent would produce an immense amount of suffering, so much that rolling the dice is worse than certain extinction.”
It matters what the probabilities are, and it matters what the goods and bads are, but this post doesn’t argue very convincingly that the extremely bad outcomes are all that likely (see Mauricio’s bullet points).