Hi, I’m the author of that post. My best guess at the moment is that we need a way to calculate “if do(action), then universe pays x”, where “do” notation encapsulates relevant things we don’t know yet about logical uncertainty, like how an AI can separate itself from its logical parent nodes (or its output from its computation) so that it temporarily forgets that its computation maximizes expected utility.
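To make the shape of "if do(action), then universe pays x" a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the names `universe`, `payoff_under_do`, and `choose` are hypothetical, and the payoff table is made up. It only shows the easy part, where the action is plugged in directly instead of being derived from the agent's own computation; it does not address the open questions about logical uncertainty.

```python
from typing import Callable, Dict, Iterable

Action = str


def universe(agent_output: Action) -> float:
    """Hypothetical universe: pays a fixed amount for each possible output."""
    payoffs: Dict[Action, float] = {"a": 5.0, "b": 10.0}  # made-up numbers
    return payoffs[agent_output]


def payoff_under_do(action: Action, world: Callable[[Action], float]) -> float:
    """Evaluate 'if do(action), then universe pays x'.

    The action is supplied from outside, so the output is cut off from the
    computation (the parent node) that would normally produce it.
    """
    return world(action)


def choose(actions: Iterable[Action], world: Callable[[Action], float]) -> Action:
    """Pick the action whose intervention pays the most.

    While comparing, nothing here assumes that the agent's real output
    is already the utility-maximizing one.
    """
    return max(actions, key=lambda a: payoff_under_do(a, world))
```

In this toy, `choose(["a", "b"], universe)` compares the two interventions and returns the better-paying one. The real difficulty is that an actual universe contains the agent's computation, so severing the output from it is not a simple function call.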