“If a model trained on synthetic data is expected to have good performance out of distribution (on real-world problems) then I think that it would also be expected to have high performance at assessing whether it’s in a simulation.”
Noosphere89, you have marked this sentence with a “disagree” emoji. Would you mind expanding on that? I think it is a pretty important point and I’d love to see why you disagree with Ben.
I’m less confident in this position than when I put on the disagree emoji, but my reasoning is that it’s much easier to control an AI’s training data sources than a human’s. That means it’s quite easy in theory (though it might be difficult in practice, which worries me) to censor just enough data that the model doesn’t even think it’s likely to be in a simulation that doesn’t add up to normality.