“We are currently in a state of regrouping, weighing our options, and searching for plans that we believe may yet have a shot at working [...] Given our increased uncertainty about the best angle of attack, it may turn out to be valuable to house a more diverse portfolio of projects [...] We may commit to an entirely new approach”
To me, this makes it sound like most of their effort is not going towards research similar to the stuff they did prior to 2017, but rather coming up with something totally new, which may or may not address the same sort of problems.
(I work at MIRI.)
We’re still pursuing work related to agent foundations, embedded agency, etc. We shifted a large amount of our focus onto the “new research directions” in early 2017 (post), and then we wrote a longer explanation of what we were doing and why in 2018 (post). The 2020 strategy update says that MIRI is scaling back work on the “new research directions,” not scaling back work on the set of projects linked to agent foundations.
Thanks for the clarification!
In the OP I meant ‘agent foundations’ in a broad sense, as in any research aimed at establishing a foundational understanding of agency, so by “giving up on their current approach to agent foundations” I meant that MIRI was shifting away from the approach to agent foundations (broad sense) that they previously saw as most promising. I didn’t mean to imply that MIRI had totally abandoned their research on agent foundations (narrow sense), as in the set of research directions initiated in the agent foundations agenda.
Gotcha. Either way, I think this is a great idea for a thread, and I appreciate you making it. :)
To avoid confusion, when I say “agent foundations” I mean one of these things:
Work that’s oriented toward the original “Agent Foundations” agenda, which put a large focus on “highly reliable agent design” (HRAD, usually broken up into logical uncertainty and naturalized induction, decision theory, and Vingean reflection), and also tends to apply an HRAD-informed perspective to understanding things like corrigibility and value learning.
Work that’s oriented toward the “Embedded Agency” confusions, which are mostly the same as the original “Agent Foundations” agenda plus subsystem alignment.
We originally introduced the term “agent foundations” because (a) some people (I think Stuart Russell?) thought it was a better way of signposting the kind of alignment research we were doing, and (b) we wanted to distinguish our original research agenda from the 2016 “Alignment for Advanced Machine Learning Systems” agenda (AAMLS).
A better term might have been “agency foundations,” since you almost certainly don’t want your first AGI systems to be “agentic” in every sense of the word, but you do want to fundamentally understand the components of agency (good reasoning, planning, self-modeling, optimization, etc.). The idea is to understand how agency works, but not to actually build a non-task-directed, open-ended optimizer (until you’ve gotten a lot of practice with more limited, easier-to-align AGI systems).