I think doing the AI version (bots and/or LLMs) makes sense as a starting point, and then we should be able to add the human versions later if we want. I also think it’s fine for the thing anchoring it to human performance to be a comparison against humans who played against the same opponents, rather than the agent literally playing against humans.
One caveat: tasks where there’s a lot of uncertainty about what exactly the setup is and what distribution the opponents / black-box functions / etc. are drawn from can be unhelpfully high-variance, in the sense that the agent’s score depends heavily on its assumptions about what the other agents are and what they will do, rather than only measuring capability. So I think it’s a good idea to give the agent reasonable information about the distribution of opponents, even if you still include some uncertainty.
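For a concrete picture of what I mean, here’s a tiny illustrative sketch (the opponent names, weights, and prompt wording are all made up, not anything from our actual setup): the task samples the opponent from an explicit mix, and that same mix is described to the agent in its prompt, so the score reflects how well it plays rather than how lucky its guesses about the opponent were.

```python
# Illustrative only: hypothetical opponent pool and prompt text, not real infra code.
import random

# Explicit sampling weights for the (made-up) opponent pool.
OPPONENT_MIX = {
    "always_cooperate_bot": 0.3,
    "tit_for_tat_bot": 0.5,
    "llm_player": 0.2,
}

def sample_opponent(rng: random.Random) -> str:
    # The task secretly picks the actual opponent from the pool...
    names, weights = zip(*OPPONENT_MIX.items())
    return rng.choices(names, weights=weights, k=1)[0]

def opponent_info_for_prompt() -> str:
    # ...but the agent is told the distribution the opponent was drawn from.
    lines = [f"- {name} (probability {p})" for name, p in OPPONENT_MIX.items()]
    return "Your opponent is drawn from this pool:\n" + "\n".join(lines)
```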
I have some code for setting up a simple black-box game inside our infra that you could adapt for this if that’s useful. In general, I think the structure of starting a server on localhost that implements the game, and then telling the agent in the prompt that it needs to send queries to that server, works well if you want the agent to interact with some program without being able to see or edit it.
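To make the structure concrete, here’s a rough sketch of that pattern using only Python’s standard library; the /play endpoint, the JSON request format, and the trivial stand-in game are all made up for illustration rather than taken from the actual infra code:

```python
# Minimal sketch of a black-box game served on localhost. The agent only ever
# sees the URL and request format in its prompt; it cannot read this file.
import json
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

def opponent_move() -> str:
    # Stand-in opponent: a biased coin-flip bot the agent has to model from observations.
    return "heads" if random.random() < 0.7 else "tails"

class GameHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/play":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            move = json.loads(self.rfile.read(length))["move"]
        except (json.JSONDecodeError, KeyError):
            self.send_error(400, 'expected a JSON body like {"move": "heads"}')
            return
        opp = opponent_move()
        # Matching pennies: the agent scores 1 when its move matches the opponent's.
        result = {"opponent_move": opp, "reward": 1 if move == opp else 0}
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # The prompt would just tell the agent something like:
    #   curl -X POST localhost:8000/play -d '{"move": "heads"}'
    HTTPServer(("localhost", 8000), GameHandler).serve_forever()
```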
I think open-source versions could also be interesting, where you tell the model more about the opponents, including the other models’ prompts or the bots’ source code, and see how well it can use that information to perform better.