I mean “do something incoherent at any given moment” is also perfectly agent-y behavior. Babies are agents, too.
I think the problem is that modelling incoherent AI is even harder than modelling coherent AI, so most alignment researchers just hope that AI researchers will manage to build coherence in before there is a takeoff — that way they can base their own theories on the assumption that the AI is already coherent.
I find that view overly optimistic. I expect that AI is going to remain incoherent until long after it has become superintelligent.