Some philosophical problems are specific to AI, though, or at least to specific alignment approaches: for example, decision theory and logical uncertainty for MIRI’s approach, and corrigibility and universality (a small core of corrigible and universal reasoning) for Paul’s.
Would you agree that AI accelerates the problem and makes it more urgent, but isn’t the primary source of the problem you’ve identified?
That sounds reasonable, but I’m not totally sure what you mean by “primary source”. What would you say is the primary source of the problem?
How would you feel about our chances for a good future if AI didn’t exist (but we still go forward with technological development, presumably reaching space exploration eventually)? Are human safety problems an issue then?
Yeah, sure. I think if AI didn’t exist, we’d have a better chance that moral/philosophical progress could keep up with scientific/technological progress, but I would still be quite concerned about human safety problems. I’m not sure why you ask this, though. What do you think the implications of this are?
What would you say is the primary source of the problem?
The fact that humans don’t generalize well out of distribution, especially on moral questions; and the fact that progress can cause distribution shifts that lead us to fail to achieve our “true values”.
What do you think the implications of this are?
Um, nothing in particular.
I’m not sure why you ask this though.
It’s very hard to understand what people actually mean when they say things, and a good way to check is to formulate an implication of (your model of) their model that they haven’t said explicitly, and then see whether you were correct about that implication.
Ah, I think that all makes sense, but next time I suggest saying something like “to check my understanding” so that I don’t end up wondering what conclusions you might be leading me to. :)