I have quite a different intuition on this, and I’m curious if you have a particular justification for expecting non-simulated training for multi-agent problems.
In certain domains, there are very strong economic incentives to train agents that will act in a real-world multi-agent environment, where the ability to simulate the environment is limited (e.g. trading in stock markets and choosing content for social media users).