I don’t see anything that looks strong enough to defeat the orthogonality thesis, which I take to be the claim that it should be possible to design minds in such a way that the part holding the utility function is separate from the part that optimizes.
There are a number of versions of the OT floating around. That version has little bearing on (u)FAI. We already know it is possible to do dumb and dangerous things. The uFAI argument requires certain failure modes to be inevitable, or at least likely, even in the absence of malevolence and incompetence.