Yes, I just confirmed that even turning Code Interpreter on does not seem to help with recognizing a winning move at Tic-Tac-Toe (even when I tell it to play like a Tic-Tac-Toe expert). That said, it did not try to generate and run any Python (perhaps it needs an additional push towards doing that).
More sophisticated prompt engineering might fix that, but on its own the model does not handle this task well enough.
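For reference, here is a minimal sketch (my own illustration, assuming a flat 9-character board encoding) of the kind of Python one might hope Code Interpreter would write and run to detect an immediate winning move:

```python
# Brute-force detection of an immediate winning move at Tic-Tac-Toe.
# The board is a 9-character string in row-major order; ' ' marks an empty cell.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 'X' or 'O' if that player has three in a line, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def winning_move(board, player):
    """Return the index of an immediately winning move for `player`, or None."""
    for i, cell in enumerate(board):
        if cell == ' ':
            trial = board[:i] + player + board[i + 1:]
            if winner(trial) == player:
                return i
    return None

# Example position: X to move can complete the top row by playing square 2.
board = "XX OO    "
print(winning_move(board, 'X'))  # -> 2
```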
Returning to “artificial researchers based on LLMs”, I would expect the need for more sophisticated prompts: not just a reference to a person, but a set of technical texts and examples of reasoning to focus on. Learning to generate better long prompts of this kind would be part of self-improvement, although I would expect the bulk of self-improvement to come from designing smarter, relatively compact neural machines interfacing with LLMs, and from smarter schemes of connectivity between them and the LLMs. (I expect the LLM in question to be open rather than hidden behind an opaque API, so that one would be able to read from any layer and inject into any layer.)
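As a rough illustration of what "read from any layer / inject into any layer" could look like with an open model, here is a sketch of my own using PyTorch forward hooks on GPT-2 via Hugging Face Transformers; the model choice, the layer index, and the zero "steering" vector are placeholders, not anything proposed above:

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

LAYER = 6                                           # which transformer block to tap (arbitrary choice)
captured = {}                                       # the "read" side: activations exposed to external machinery
steering = torch.zeros(model.config.hidden_size)    # the "inject" side (a zero vector, i.e. a no-op placeholder)

def tap(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0]
    captured["hidden"] = hidden.detach()            # read this layer's activations out of the model
    return (hidden + steering,) + output[1:]        # inject a perturbation back into the forward pass

handle = model.h[LAYER].register_forward_hook(tap)

with torch.no_grad():
    ids = tokenizer("Hello, world", return_tensors="pt")
    model(**ids)

print(captured["hidden"].shape)   # (batch, seq_len, hidden_size)
handle.remove()
```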
One can make all sorts of guesses, but based on the evidence so far, AIs have a different skill profile than humans. This means that if we think of any job which requires a large set of skills, then for a long period of time, even if AIs beat the human average in some of them, they will perform worse than humans in others.
Yes, at least that’s the hope (that there will be a need for joint teams and for finding some mutual accommodation, and perhaps a long-term mutual interest, between them and us; basically, the hope that a Copilot-style architecture will remain essential for a long time to come)...