It seems like people disagree about how to think about the reasoning process within an advanced AI/AGI, and this is a crux for which research to focus on (please lmk if you disagree). E.g. one may argue AIs are (1) implicitly optimising a utility function (over states of the world), (2) explicitly optimising such a utility function, or (3) just following a bunch of heuristics, or something else entirely.
What can I do to form better views on this? By default, I would read about Shard Theory and “consequentialism” (the AI safety term, not the moral philosophy one).
My impression is that basically no one knows how reasoning works, so people either make vague statements (I don’t know what shard theory is supposed to be, but when I’ve looked at it briefly it’s either vague or obvious), or retreat to functional descriptions like “the AI follows a policy that achieves high reward”, “the AI is efficient relative to humans”, or “the AI pumps outcomes” (see e.g. here: https://www.lesswrong.com/posts/7im8at9PmhbT4JHsW/ngo-and-yudkowsky-on-alignment-difficulty?commentId=5LsHYuXzyKuK3Fbtv ).