I think that solving alignment for EV maximizers is a much stronger version of alignment than, e.g., prosaic alignment of LLM-type models.
Agents seem like they'll be more powerful than Tool AIs. We don't know how to make them, but if someone does, and capabilities timelines shorten drastically, it would be awesome even to have a theory of EV maximizer alignment before then.
Reinforcement learning does create agents; those agents just aren't expected utility maximisers.
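To make the contrast concrete, here is a minimal formal sketch of the distinction (the symbols $\mathcal{A}$, $P$, $U$, $\pi_\theta$, $\gamma$, and $r_t$ are introduced here for illustration, not taken from the comments above): an expected utility maximiser chooses actions by an explicit argmax over expected utility under some outcome model and utility function, whereas an RL-trained agent just executes a learned policy whose behaviour need not be representable as maximising any fixed utility function over outcomes.

```latex
% Expected utility maximiser: explicit argmax over expected utility,
% given an outcome model P(o | a) and a utility function U(o).
a^{*} \;=\; \arg\max_{a \in \mathcal{A}} \; \mathbb{E}_{o \sim P(\cdot \mid a)}\!\left[ U(o) \right]
      \;=\; \arg\max_{a \in \mathcal{A}} \sum_{o} P(o \mid a)\, U(o)

% RL-trained agent: acts through a learned policy \pi_\theta(a \mid s),
% trained to increase the expected discounted return
J(\theta) \;=\; \mathbb{E}_{\pi_\theta}\!\left[ \textstyle\sum_{t \ge 0} \gamma^{t} r_{t} \right]
% Nothing in this training setup forces the resulting policy's behaviour
% to be consistent with maximising a single fixed utility function U.
```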
Claims that expected utility maximisation is the ideal or limit of agency seem wrong.
I think expected utility maximisation is probably anti-natural to generally capable optimisers.