This point seems like an argument in favor of the relevance of the problem laid out in this post. I have other complaints with this framing of the problem, which I expect you would share.
The key distinction between this and contemporary AI is not self-modification, but the desire to have the kind of agent which can look at itself and say, “I know that as new evidence comes in I will change my beliefs. Fortunately, it looks like I’m going to make better decisions as a result,” or, even more optimistically, “But it looks like I’m not changing them in quite the right way, and I should make this slight change.”
The usual route is to build agents which don’t reason about their own evolution over time. But for sufficiently sophisticated agents, I would expect them to have some understanding of how they will behave in the future, and to e.g. pursue more information based on the explicit belief that acquiring that information will enable them to make better decisions. This seems like a more robust approach to getting the “right” behavior than having an agent which e.g. takes “Information is good” as a brute fact, or has a rule for action that bakes in an ad hoc approach to estimating the value of information (VOI). I think we can all agree that it would not be good to build an AI which calculated the right thing to do, and then did that with probability 99% and took a random action with probability 1%.
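To make the contrast concrete, here is a minimal sketch of what “explicit belief that acquiring information will enable better decisions” can look like, as opposed to a hardwired “information is good” rule. It is my own toy example, not anything from the post: the two-state/two-action payoff table, the noisy binary test, and the `best_expected_utility` helper are all illustrative assumptions. The agent computes VOI as the difference between the expected utility of acting on its prior and the expected utility of first observing the test and then acting on the posterior.

```python
# A toy sketch, assuming a two-state, two-action decision problem and a noisy
# binary test; the specific numbers and names are illustrative, not canonical.

prior = {"good": 0.6, "bad": 0.4}          # P(state)
utility = {                                 # utility[action][state]
    "act":  {"good": 10.0, "bad": -20.0},
    "pass": {"good": 0.0,  "bad": 0.0},
}
likelihood = {                              # P(signal | state)
    "good": {"pos": 0.9, "neg": 0.1},
    "bad":  {"pos": 0.2, "neg": 0.8},
}

def best_expected_utility(belief):
    """Expected utility of the best available action under the given belief."""
    return max(
        sum(belief[s] * utility[a][s] for s in belief)
        for a in utility
    )

# Expected utility of acting right now, on the prior alone.
eu_without = best_expected_utility(prior)

# Expected utility of first observing the test, then acting on the posterior.
eu_with = 0.0
for signal in ("pos", "neg"):
    p_signal = sum(prior[s] * likelihood[s][signal] for s in prior)
    posterior = {s: prior[s] * likelihood[s][signal] / p_signal for s in prior}
    eu_with += p_signal * best_expected_utility(posterior)

voi = eu_with - eu_without  # how much the observation is expected to help
print(f"EU without observing: {eu_without:.2f}")
print(f"EU after observing:   {eu_with:.2f}")
print(f"Value of information: {voi:.2f}")
```

In this toy setup the VOI comes out positive, so the agent seeks the test precisely because it expects the observation to improve its decision, not because gathering information is baked into its rule for action.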
That said, even if you are a very sophisticated reasoner, having in hand some heuristics about VOI is likely to be helpful, and if you think that those heuristics are effective you may continue to use them. I just hope that you are using them because you believe they work (e.g. because of empirical observations of them working, the belief that you were intelligently designed to make good decisions, or whatever), not because they are built into your nature.