Indeed, you could imagine extreme scenarios where the smallest circuit instantiates the agents in a blank environment with the message “you are inside a simulation; please provide outputs as you would in environment [X]”. If the agents are good at pretending, this could be quite an accurate predictor.
But can we just take whatever cognitive process the agent uses for pretending, and then leave the rest of it out?