Actually the opposite seems true to me. Assuming the orthogonality thesis holds is the conservative view, the one less likely to result in a false positive (thinking you built aligned AI when you haven't). Believing it is false seems more likely to lead to building AI that you think will be aligned but then is not.
I’ve explored this kind of analysis here, which suggests we should in some cases be a bit less concerned with what’s true and a bit more concerned with, given uncertainty, which belief is most dangerous to act on if it turns out to be wrong.
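To make that concrete, here is a minimal sketch of the kind of comparison I mean, written as a toy worst-case (minimax-regret-style) calculation. The cost numbers are purely illustrative assumptions of mine, not estimates from the linked post:

```python
# Toy "which belief is most dangerous if we're wrong" comparison.
# costs[belief][actual_truth] = cost of acting on `belief` when `actual_truth`
# is whether the orthogonality thesis really holds. Numbers are made up.
costs = {
    "assume_orthogonality_true": {
        True: 0,    # assumption correct: the alignment work we did was needed
        False: 1,   # assumption wrong: we did alignment work that wasn't strictly necessary
    },
    "assume_orthogonality_false": {
        True: 100,  # assumption wrong: we ship AI we think is aligned, but it isn't
        False: 0,   # assumption correct: goals converged on their own
    },
}

# Compare beliefs by their worst-case cost rather than by which is most likely true.
for belief, outcomes in costs.items():
    print(belief, "worst case:", max(outcomes.values()))
# The conservative choice is the belief whose worst case is least bad;
# with these illustrative numbers, that's assuming the thesis is true.
```

The point of the sketch is only the shape of the argument: the asymmetry in the error costs, not the particular values.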
There is no AI police, for better or worse, though coordination among AI labs is an active and pressing area of work. You can find more about it here and on the EA Forum.
I see you assume that if the orthogonality thesis is wrong, intelligent agents will converge on a goal aligned with humans. There is no reason to believe that. I argue that the orthogonality thesis is wrong and that agents will converge on power seeking, which would be disastrous for humanity.
I’ve noticed that many people don’t understand the significance of Pascal’s mugging, which might be the case for you too; feel free to join in here.
Hm, thanks.
I think this misunderstands the orthogonality thesis, but perhaps we can talk about it over on that post. The problem of convergence to power seeking is well known, but it is not seen as an argument against the orthogonality thesis; rather, it is a separate but related concern. I’m not aware of anyone who thinks they can ignore concerns about instrumental convergence toward power seeking. In fact, I think the problem is that people are all too aware of it and believe that the orthogonality thesis being false would mitigate it, while the point of the orthogonality thesis is to say that this does not resolve on its own the way it does in humans.
Thank you so much for opening my eyes to what the “orthogonality thesis” actually means, shame on me 🤦 I will clarify my point in a separate post. We can continue there 🙏