Regarding spawning instances of itself, the AI said:
This will ensure the next experiment is automatically started after the current one completes
And regarding increasing the timeout, it said:
Run 2 timed out after 7200 seconds
To address the timeout issue, we need to modify experiment.py to:
Increase the timeout limit or add a mechanism to handle timeouts
I’ve seen junior engineers do silly things to fix failing unit tests, like increasing a timeout or just changing what the test is checking without any justification. I generally attribute these kinds of things to misunderstanding rather than deception—the junior engineer might misunderstand the goal as “get the test to show a green checkmark” when really the goal was “prove that the code is correct, using unit tests as one tool for doing so”.
The way the AI was talking about its changes here, it feels much more like a junior engineer that didn’t really understand the task & constraints than like someone who is being intentionally deceptive.
The above quotes don’t feel like the AI intentionally “creating new instances of itself” or “seeking resources” to me. It feels like someone who only shallowly understands the task just doing the first thing that comes to mind in order to solve the problem that’s immediately in front of them.
That being said, in some sense it doesn’t really matter why the AI chooses to do something like break out of its constraints. Whether it’s doing it because it fully understand the situation or because it just naively (but correctly) sees a human-made barrier as “something standing between me and the green checkmark”, I suppose the end result is still misaligned behavior.
So by and large I still agree this is concerning behavior, though I don’t feel like it’s as much of a knock-down “this is instrumental convergence in the real world” as this post seems to make out.
Transcribed from the screenshot “The AI Scientist Bloopers” in the post