ETA: Please don’t spend time on this comment. I no longer think this setup deserves attention. Thanks again!
Thanks so much for the comment! I fully agree.
Suppose we first write the code for “agent X”, and then we wait for a year. We then invoke X with a dataset containing questions that humanity answered in the past year. Then we take “agent Y” (the output of X) and invoke it with a useful question that we don’t know the answer to. If it’s a question that we could find the answer to anyway in the very near future (or perhaps even WOULD have found the answer to in the past year, had we been luckier), then it’s plausible we’ll get a useful answer from Y. The larger the dataset is relative to the final n (the length of the code of the Y created at the last iteration), the less plausible it is that any code of that size can always correctly detect that a given useful question is not from our dataset (which is a necessary condition for it to avoid giving useful answers to any useful question outside the dataset).
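To make the setup concrete, here is a minimal illustrative sketch in Python. Everything in it is hypothetical: the function names (run_agent_x, run_agent_y), the dataset format, and the placeholder bodies are mine, not part of any actual implementation; it only shows the order of the steps described above.

```python
# Purely illustrative sketch of the setup; the two functions below are
# hypothetical placeholders, not real agents.

def run_agent_x(dataset: list[tuple[str, str]]) -> str:
    """Agent X: given (question, answer) pairs humanity answered in the
    past year, return the source code of agent Y (a string of length n)."""
    return "# source code of agent Y (placeholder)"

def run_agent_y(y_code: str, question: str) -> str:
    """Invoke the produced agent Y on a question we can't yet answer."""
    return "(Y's answer -- placeholder)"

# Step 1: write X, wait a year, then collect the past year's answered questions.
dataset = [("a question humanity answered this past year", "its answer")]

# Step 2: invoke X on the dataset; its output is the code of Y, of length n.
y_code = run_agent_x(dataset)
n = len(y_code)

# Step 3: invoke Y on a useful question whose answer we don't know.
# The claim: the larger len(dataset) is relative to n, the less plausible
# it is that a program of size n can always detect that a given useful
# question is NOT from the dataset.
answer = run_agent_y(y_code, "a useful question we don't know the answer to")
print(answer)
```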
P.S., I’ll fully digest your paper soon (I’m in an intense period of finishing my MSc...), but going over it a few months ago (along with other things you wrote on boxed AIs) had a huge and useful impact on me, so thanks for that too! :)