I am aware that Yudkowsky considers the two main aspects of FAI theory to be a) formalising the maths required for an agent to self-modify without losing its values, and b) being able to correctly infer optimal values from humans. These two aims seem quite separate from most work in narrow AI, which involves optimising for a single task.