Gianluca Calcagni comments on Instruction-following AGI is easier and more likely than value aligned AGI

Gianluca Calcagni 13 Jul 2024 9:44 UTC
3 points
Thanks Seth for your post! I believe I get your point, and in fact I made a post that describes exactly that approach. In detail, I recommend conditioning the model with an existing technique called control vectors (or steering vectors), which achieves a rough, incomplete form of safety: in my opinion, just enough partial safety to work on solving full safety with the help of AIs.
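
For readers unfamiliar with the technique, here is a minimal toy sketch of the difference-of-means flavor of control vectors. It is an illustration under assumptions, not necessarily the exact method from my post: the activations are made-up numbers rather than hidden states from a real model, and the names `control_vector` and `steer` are hypothetical.

```python
# Illustrative sketch only: toy activations stand in for a real model's hidden states.
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def control_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Difference of mean activations between two contrasting prompt sets."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def steer(hidden: Vector, direction: Vector, strength: float) -> Vector:
    """Shift a hidden state along the control direction, scaled by `strength`."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Hypothetical 3-dimensional activations for prompts that show / lack the desired trait.
safe_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
unsafe_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = control_vector(safe_acts, unsafe_acts)
steered = steer([0.5, 0.5, 0.5], direction, strength=0.5)
```

In a real setting the vectors would be read from, and added back to, a chosen layer of the model at inference time, and `strength` would need tuning: too small has no effect, too large degrades the output.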

Of course, I am happy to be challenged.