Agreed. Humans are constantly optimizing a reward function, but it sort of ‘changes’ from moment to moment in a near-focal (short-horizon) way, so their behaviour often looks irrational or self-defeating; once you know what the reward function is, though, the goal-directedness is easy to see.
Sune seems to think that humans are more intelligent than they are goal-directed. I’m not sure this is true; human truthseeking processes seem about as flawed and limited as human goal-pursuit. Maybe you can argue that humans are not generally intelligent or rational, but I don’t think you can justify setting the goalposts so that they’re one of those things and not the other.
You might be able to argue that human civilization is intelligent but not rational, and that functioning AGI will be more analogous to an ecosystem of agents than to one unified agent. If you can argue for that, that’s interesting, but I don’t know where to go from there. Civilizations tend towards increasing unity over time (a continuous reduction in the energy wasted on conflict), and I doubt that the goals they converge on together will be a form of human-favoring altruism. I haven’t seen anyone try to argue for that in a rigorous way.
Doesn’t this become tautological? If the reward function changes from moment to moment, then the reward function can just be whatever explains the behaviour.
Since everything can fit into the “agent with utility function” model given a sufficiently crumpled utility function, I guess I’d define “is an agent” as “goal-directed planning is useful for explaining a large enough part of its behavior.” This includes humans while excluding bacteria. (Hmm, unless, like me, one knows so little about bacteria that it’s better to just model them as weak agents. Puzzling.)
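To make the “sufficiently crumpled utility function” point concrete, here is a minimal sketch (my own notation, nothing beyond the usual policy/utility setup): for any behaviour at all there is a utility function that rationalizes it, which is why the definition has to lean on the goal-directed model being useful rather than merely existing.

Let $\pi : H \to A$ be any map from histories to actions (the observed behaviour, however erratic). Define

$$
u_\pi(h, a) =
\begin{cases}
1 & \text{if } a = \pi(h),\\
0 & \text{otherwise.}
\end{cases}
$$

Every action $\pi$ takes maximizes $u_\pi$ at its history, so $\pi$ is exactly an agent optimizing $u_\pi$. The construction is vacuous precisely because $u_\pi$ is as complicated as the behaviour it “explains”; a non-trivial claim of agency needs the utility function to be simpler than, or predictive beyond, the behaviour used to fit it.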
On the other hand, the development of religion, morality, and universal human rights also seems to be a product of civilization, driven by the need for many people to coordinate and coexist without conflict. More recently, these ideas have expanded to include laws that establish nature reserves and protect animal rights. I personally am beginning to think that taking an ecosystem/civilizational approach, with a mixture of intelligent agents (human, animal, and AGI), might be a way to solve the alignment problem.