I think this comment isn’t rigorous enough for Noosphere89 to retract his comment that this one responds to, but that’s up to him.
Claims of the form “Yudkowsky was wrong about things like mind-design space, the architecture of neural networks (specifically how he thought making large generalizations about the structure of the human brain wouldn’t work for designing neural architectures), and in general, probably his tendency to assume that certain abstractions just don’t apply whenever intelligence or capability is scaled way up” have, I think, been argued well enough by now that they have at least some merit to them.
The claim about AI boxing I’m not sure about, but my understanding is that it’s currently being debated (somewhat hotly). [Fill in the necessary details where this comment leaves a void, but I think this is mainly about GPT-4’s API and it being embedded into apps where it can execute code on its own and things like that.]
Claims of the form “Yudkowsky was wrong about things like mind-design space, the architecture of neural networks (specifically how he thought making large generalizations about the structure of the human brain wouldn’t work for designing neural architectures), and in general, probably his tendency to assume that certain abstractions just don’t apply whenever intelligence or capability is scaled way up.”
This is what I was gesturing at in my comments.
The claim about AI boxing I’m not sure about, but my understanding is that it’s currently being debated (somewhat hotly).
I’m talking about simboxing, which was shown to work by Jacob Cannell here:
https://www.lesswrong.com/posts/WKGZBCYAbZ6WGsKHc/love-in-a-simbox-is-all-you-need
Basically, as long as we can manipulate the AI’s perception of reality, which is trivial to do in offline learning, it’s easy to recreate a finite-time Cartesian agent: data only passes through approved channels, the AI updates its state to learn new things, and this repeats until the end of offline learning.
Thus simboxing is achieved.
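To make that picture concrete, here is a minimal Python sketch of the loop I have in mind, under the assumption that the simbox is just an offline training loop whose only input path is the simulator. All names here (SimulatedWorld, CartesianAgent, run_simbox) are hypothetical illustrations, not anything from Cannell’s post.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedWorld:
    """Offline data source: the agent's entire 'reality' is generated here."""
    seed: int = 0

    def observe(self, step: int) -> dict:
        # Every observation is synthesized by the simulator; nothing from the
        # outside world can leak in, because there is no other input path.
        return {"step": step, "percept": hash((self.seed, step)) % 256}


@dataclass
class CartesianAgent:
    """Agent whose only interface to the world is the approved channel."""
    state: list = field(default_factory=list)

    def update(self, observation: dict) -> None:
        # The agent updates its internal state from approved data only;
        # it has no way to write anything back out to the real world.
        self.state.append(observation["percept"])


def run_simbox(total_steps: int) -> CartesianAgent:
    world = SimulatedWorld(seed=42)
    agent = CartesianAgent()
    for step in range(total_steps):          # finite-time: the loop has a fixed end
        observation = world.observe(step)    # data passes through the approved channel
        agent.update(observation)            # agent learns from it, and nothing else
    return agent                             # offline learning ends here


if __name__ == "__main__":
    trained = run_simbox(total_steps=10)
    print(len(trained.state))  # 10 observations, all of them from the simulator
```

The point of the sketch is just that the agent’s percepts are entirely a function of the simulator and the loop terminates at a fixed step, which is what “finite-time Cartesian agent” is meant to capture.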
The reason I retracted my comment is that this quote was correct:
Of course plenty of more recent content on LessWrong operates on the background assumption that AGI is going to be a big deal, in large part because the arguments to that effect are quite strong and the arguments against are not. It is at the same time untrue that those arguments don’t exist on LessWrong.
That quote holds primarily because of the post below. There are some caveats to this, but it largely goes through.
Post below:
https://www.lesswrong.com/posts/3nMpdmt8LrzxQnkGp/ai-timelines-via-cumulative-optimization-power-less-long