If GPT-4 has the raw capability to orient itself in reality and navigate it, it should be able to do so even with a bare-bones self-prompting/prompted-self-reflection scaffold.
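To make "bare-bones self-prompting" concrete, here is a minimal sketch of the kind of loop meant. The `llm` function is a hypothetical stand-in for any text-completion endpoint, assumed for illustration rather than a reference to a specific API:

```python
# Minimal self-prompt / self-reflection loop (illustrative sketch).
# `llm` is a hypothetical placeholder for a call to a language model,
# not a real library function.

def llm(prompt: str) -> str:
    """Placeholder: wire this to an actual model endpoint."""
    raise NotImplementedError

def self_reflection_loop(goal: str, steps: int = 5) -> str:
    scratchpad = f"Goal: {goal}\n"
    for _ in range(steps):
        # 1. The model proposes its next action given its own prior notes.
        action = llm(scratchpad + "\nWhat should be done next? Answer briefly.")
        # 2. The model critiques its own proposal (the self-reflection step).
        critique = llm(
            scratchpad
            + f"\nProposed next step: {action}"
            + "\nCritique this step: is it actually progress toward the goal?"
        )
        # 3. Both are appended, so later iterations condition on the reflection.
        scratchpad += f"\nStep: {action}\nReflection: {critique}\n"
    return scratchpad
```

Even this much gives the model a way to condition on its own earlier reasoning, which is the point: if the raw capability is there, not much scaffolding should be needed to surface it.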
GPT-4 by itself can’t learn: it can’t improve its intuitions and skills in response to new facts about the situations of its instances (facts that don’t fit in its context window). So the details of how the prosthetics that compensate for this are implemented (or how their implementation is guided) may well be crucial.
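One common shape such a prosthetic takes is an external memory that stores facts the context window can no longer hold and retrieves them back into the prompt. A minimal sketch, assuming a hypothetical `embed` function standing in for a real embedding model:

```python
# Sketch of a "memory prosthetic": since the weights are frozen, new facts
# are persisted in an external store and retrieved back into the context.
# `embed` is a hypothetical embedding function, assumed for illustration.

from math import sqrt

def embed(text: str) -> list[float]:
    """Placeholder for a real embedding model."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class Memory:
    def __init__(self) -> None:
        self.items: list[tuple[list[float], str]] = []

    def write(self, fact: str) -> None:
        # Facts that no longer fit in the context are persisted here.
        self.items.append((embed(fact), fact))

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Retrieve the k stored facts most similar to the query for
        # re-insertion into the prompt; this stands in for actual learning.
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [fact for _, fact in ranked[:k]]
```

Design choices here that look trivial (what gets written, how retrieval is ranked, how much is re-inserted) are exactly the kind of detail the point above says could be crucial.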
And also, at some point there will be open-source pre-trained and RLAIF'd models of sufficient scale that allow fine-tuning, and so can actually improve their intuitions. At that point, running them inside an improved Auto-GPT successor might be more effective than starting the process from scratch, lowering the minimum necessary scale of the pre-trained foundation model; a sketch of what that loop might look like follows.
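A minimal sketch of that variant, where an agent loop periodically fine-tunes the underlying open model on its own accumulated experience; `run_episode` and `finetune` are hypothetical stand-ins, assumed for illustration:

```python
# Sketch of an agent loop around a fine-tunable open model: accumulated
# trajectories are periodically folded back into the weights, so intuitions
# improve rather than being rebuilt from scratch each run.
# `run_episode` and `finetune` are hypothetical placeholders.

def run_episode(model, task: str) -> list[dict]:
    """Placeholder: run one agent episode, return (prompt, completion) pairs."""
    raise NotImplementedError

def finetune(model, examples: list[dict]):
    """Placeholder: one fine-tuning pass over collected trajectories."""
    raise NotImplementedError

def improving_agent(model, tasks: list[str], update_every: int = 10):
    buffer: list[dict] = []
    for i, task in enumerate(tasks, start=1):
        buffer.extend(run_episode(model, task))
        if i % update_every == 0:
            # Unlike a frozen API model, an open model's weights can absorb
            # the new experience, shrinking how much the scaffolding must carry.
            model = finetune(model, buffer)
            buffer.clear()
    return model
```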
This increases the chances that the first AGIs will be less intelligent than they would otherwise need to be, which is bad for their ability to do better than humans at not building intentionally misaligned AGIs the first chance they get.