I do think MIRI “at least temporarily gave up” on personally executing on technical research agendas, or something like that, but that’s not the only type of output.
So, I’m sure various people have probably thought about this a lot, but just to ask the obvious dumb question: Are we sure that this is even a good idea?
Let’s say the hope is that at some time in the future, we’ll stumble across an Amazing Insight that unblocks progress on AI alignment. At that point, it’s probably good to be able to execute quickly on turning that insight into actual mathematics (and then later actual corrigible AI designs, and then later actual code). It’s very easy for knowledge of “how to do things” to be lost, particularly technical knowledge. [1] Humanity loses this knowledge on a generational timescale, as people die, but it’s possible for institutions to lose knowledge much more quickly due to turnover. All that just to say: Maybe MIRI should keep doing some amount of technical research, just to “stay in practice”.
My general impression here is that there’s plenty of unfinished work in agent foundations and decision theory, things like: How do we actually write a bounded program that implements something like UDT? How do we actually do calculations with logical decision theories such that we can get answers out for basic game-theory scenarios (even something as simple as the ultimatum game is unsolved, IIRC)? What are some common-sense constraints the program-value-functions should obey (e.g., how should we value a program that simulates multiple other programs)? These all seem likely to be relevant to alignment, and also intrinsically worth doing.

[1] This talk is relevant: https://www.youtube.com/watch?v=ZSRHeXYDLko
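To make the “getting answers out” flavor of this concrete, here is a minimal toy sketch of a UDT-style policy-selection calculation for the one-shot ultimatum game. The payoff model, the tie-breaking rule, and the assumption that the proposer can see the responder’s committed policy are all simplifying choices of mine, not anything proposed in this thread.

```python
# Toy "policy selection" calculation in the spirit of UDT, applied to the
# one-shot ultimatum game. The payoff model, the tie-breaking rule, and the
# assumption that the proposer can read the responder's committed policy
# are illustrative simplifications.

PIE = 10  # total units to split

def proposer_offer(threshold: int) -> int:
    """A best-responding proposer who knows the responder accepts any offer
    >= threshold. Ties (where the proposer nets zero either way) are broken
    in favor of making the deal -- an assumption, not a theorem."""
    return threshold

def responder_payoff(threshold: int) -> int:
    offer = proposer_offer(threshold)
    return offer if offer >= threshold else 0  # a rejection leaves both with 0

# Choose the acceptance threshold the responder would want to commit to
# *before* seeing any offer, given how the proposer reacts to that commitment.
best = max(range(PIE + 1), key=responder_payoff)
print(f"commit to threshold {best}, payoff {responder_payoff(best)}")
```

Against this toy model the answer degenerates to “demand everything”, which is exactly the kind of unsatisfying output you get before the symmetric case (two bounded logical agents reasoning about each other) is actually worked out.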
Agent Foundations research has stuttered a bit (the team going remote, membership shifting, and various other logistical hurdles), but it has been continuous throughout.
There’s also at least one other team (the one I provide ops support to) that has been continuous since 2017.
I think the thing Raemon is pointing at is something like “in 2020, both Nate and Eliezer would’ve answered ‘yes’ if asked whether they were regularly spending work hours every day on a direct, technical research agenda; in 2023 they would both answer ‘no.’”
What Duncan said. “MIRI at least temporarily gave up on personally executing on technical research agendas” is false, though a related claim is true: “Nate and Eliezer (who are collectively a major part of MIRI’s research leadership and play a huge role in the org’s strategy-setting) don’t currently see a technical research agenda that’s promising enough for them to want to personally focus on it, or for them to want the organization to make it an overriding priority”.
I do think the “temporarily” and “currently” parts of those statements are quite important: part of why the “MIRI has given up” narrative is silly is that it’s rewriting history to gloss “we don’t know what to do” as “we know what to do, but we don’t want to do it”. We don’t know what to do, but if someone came up with a good idea that we could help with, we’d jump on it!
There are many negative-sounding descriptions of MIRI’s state that I could see an argument for, as stylized narratives (“MIRI doesn’t know what to do”, “MIRI is adrift”, etc.). Somehow, though, people skipped over all those perfectly serviceable pejorative options and went straight for the option that’s definitely just not true?
We don’t know what to do, but if someone came up with a good idea that we could help with, we’d jump on it!
In that case, I think the socially expected behavior is to do some random busywork, to send a clear signal that you are not lazy.
(In a corporate environment, the usual solution is to organize a few meetings. Not sure what the equivalent is for non-profits… perhaps organizing conferences?)
I don’t see how that would help at all, and pure busywork is silly when you have lots of things to do that are positive-EV but probably low-impact.
MIRI “doesn’t know what to do” in the sense that we don’t see a strategy with macroscopic probability of saving the world, and the most-promising ones with microscopic probability are very diverse and tend to violate or side-step our current models in various ways, such that it’s hard to pick actions that help much with those scenarios as a class.
That’s different from MIRI “not knowing what to do” in the sense of having no ideas for local actions that are worth trying on EV grounds. (Though a lot of these look like encouraging non-MIRI people to try lots of things and build skill and models in ways that might change the strategic situation down the road.)
(Also, I’m mainly trying to describe Nate and Eliezer’s views here. Other MIRI researchers are more optimistic about some of the technical work we’re doing, AFAIK.)
Ah got it, thanks for the reply!