In the OP I meant ‘agent foundations’ in a broad sense, as in any research aimed at establishing a foundational understanding of agency. So by “giving up on their current approach to agent foundations” I meant that MIRI was shifting away from the approach to agent foundations (broad sense) that they previously saw as most promising. I didn’t mean to imply that MIRI had totally abandoned their research on agent foundations (narrow sense), i.e., the set of research directions initiated in the Agent Foundations agenda.
Gotcha. Either way, I think this is a great idea for a thread, and I appreciate you making it. :)
To avoid confusion, when I say “agent foundations” I mean one of these things:
1. Work that’s oriented toward the original “Agent Foundations” agenda, which put a large focus on “highly reliable agent design” (usually broken up into logical uncertainty and naturalized induction, decision theory, and Vingean reflection), and also tends to apply an HRAD-informed perspective to understanding things like corrigibility and value learning.
2. Work that’s oriented toward the “Embedded Agency” confusions, which are mostly the same as the original “Agent Foundations” agenda plus subsystem alignment.
We originally introduced the term “agent foundations” because (a) some people (I think Stuart Russell?) thought it was a better way of signposting the kind of alignment research we were doing, and (b) we wanted to distinguish our original research agenda from the 2016 “Alignment for Advanced Machine Learning Systems” agenda (AAMLS).
A better term might have been “agency foundations,” since you almost certainly don’t want your first AGI systems to be “agentic” in every sense of the word, but you do want to fundamentally understand the components of agency (good reasoning, planning, self-modeling, optimization, etc.). The idea is to understand how agency works, but not to actually build a non-task-directed, open-ended optimizer (until you’ve gotten a lot of practice with more limited, easier-to-align AGI systems).
Thanks for the clarification!