What’s going on when you try to model yourself thinking about the answer to this question?
If a system is analyzing (itself analyzing (itself analyzing (...))) without realizing it’s doing so, I suspect that it will come up with some best-guess answer, but that answer will be ill-determined and dependent on implementation details. Thus a better approach would be to avoid asking self-unaware systems any question that requires that type of analysis!
For example, you can ask “Please output the least improbable scenario, according to your predictive world-model, wherein a cure for Alzheimer’s is invented by a group of people with no access to any AI oracles!” Or you can even ask it to do counterfactual reasoning about what might happen in a world in which there are no AIs whatsoever (copied from my post here). This type of question is nice for other reasons too: we’re asking the system to guess what normal humans might plausibly do in the natural course of events, and thus we’ll more typically get normal-human-type solutions to our problems, as opposed to bizarre, alien, human-unfriendly solutions.