they need to reward outcomes which only they can achieve,
Yep! But this didn’t seem so hard for me to happen, especially in the form of “I pick some easy task (that I can do perfectly), and of course others will also be able to do it perfectly, but since I already have most of the money, if I just keep investing my money in doing it I will reign forever”. You prevent this from happening through epsilon-exploration, or something equivalent like giving money randomly to other traders. These solutions feel bad, but I think they’re the only real solutions. Although I also think stuff about meta-learning (traders explicitly learn about how they should learn, etc.) probably pragmatically helps make these failures less likely.
it should be something which has diminishing marginal return to spending
Yep, that should help (also at the trade-off of making new good ideas slower to implement, but I’m happy to make that trade-off).
But actually I don’t think that this is a “dominant dynamic” because in fact we have a strong tendency to try to pull different ideas and beliefs together into a small set of worldviews
Yeah. To be clear, the dynamic I think is “dominant” is “learning to learn better”. Which I think is not equivalent to simplicity-weighing traders. It is instead equivalent to having some more hierarchichal structure on traders.
Yep! But this didn’t seem so hard for me to happen, especially in the form of “I pick some easy task (that I can do perfectly), and of course others will also be able to do it perfectly, but since I already have most of the money, if I just keep investing my money in doing it I will reign forever”. You prevent this from happening through epsilon-exploration, or something equivalent like giving money randomly to other traders. These solutions feel bad, but I think they’re the only real solutions. Although I also think stuff about meta-learning (traders explicitly learn about how they should learn, etc.) probably pragmatically helps make these failures less likely.
Yep, that should help (also at the trade-off of making new good ideas slower to implement, but I’m happy to make that trade-off).
Yeah. To be clear, the dynamic I think is “dominant” is “learning to learn better”. Which I think is not equivalent to simplicity-weighing traders. It is instead equivalent to having some more hierarchichal structure on traders.