In the models making the news and scaring people now, there aren't separately identified components for modeling the world and seeking the goal. It's all inscrutable model weights. Maybe if we understood those weights better we could separate them out. But maybe we couldn't. Maybe it's all one big jumble as actually implemented, which would make it incoherent to speak about the relative intelligence of the world model and the goal seeker. So how would this line of thinking apply to that?