By the way, this is a bit late, but if people are interested in reading my proposal, it’s here: https://docs.google.com/document/d/1kiFR7_iqvzmqtC_Bmb6jf7L1et0xVV1cCpD7GPOEle0/edit?usp=sharing
It fits into the “Strategy: train a reporter that is useful for another AI” category, and solves the counterexamples that were proposed in this post (unless I missed something and the steganography example is actually harder to defend against, but I think not). (It won $10,000.) It also discusses some other possible counterexamples, though not extensively, and I haven’t found a very convincing one. (That does not mean no very convincing one exists, and I’m also not sure how promising I find the method in practice.)
Overall, it is perhaps worth reading if you are interested in the “Strategy: train a reporter that is useful for another AI” category.