There are plenty of good posts that contradict a “strict” orthogonality thesis by showing correlation between capabilities and various values-related properties (scaling laws / inverse scaling laws).
What really gets you downvoted is the claim that super-intelligent AI cannot want things that are bad for humanity, or even arguing that we should give that idea serious weight.
What also gets you downvoted is the in-between claim that all the scaling laws tend toward superhuman morality, so everything will work out fine and there's no need to worry or put in lots of hours.
How do you write a successful post in the latter categories? Simple: just be right, for communicable reasons. Simple, but maybe not possible.