(Also, couldn’t we equally point to unsolved subproblems in alignment, or alignment for easier cases, and demand that they be solved before we dare tackle the hard problem?)
I think this is missing the point somewhat.
When Eliezer and co. talk about tackling “the hard part of the problem”,* I believe they are referring to trying to solve the simplest, easiest problems that capture some part of the core difficulty of alignment.
See this fictionalized segment from the rocket alignment problem:
We’re working on the tiling positions problem because we think that being able to fire a cannonball at a certain instantaneous velocity such that it enters a stable orbit… is the sort of problem that somebody who could really actually launch a rocket through space and have it move in a particular curve that really actually ended with softly landing on the Moon would be able to solve easily. So the fact that we can’t solve it is alarming. If we can figure out how to solve this much simpler, much more crisply stated “tiling positions problem” with imaginary cannonballs on a perfectly spherical earth with no atmosphere, which is a lot easier to analyze than a Moon launch, we might thereby take one more incremental step towards eventually becoming the sort of people who could plot out a Moon launch.
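(As an aside: the idealized version really is crisply analyzable. On a perfectly spherical, airless Earth, the speed a cannonball needs for a circular orbit falls straight out of setting gravitational acceleration equal to centripetal acceleration. A minimal sketch of that calculation, assuming the textbook two-body simplification and standard constants, none of which come from the thread itself:)

```python
# Toy version of the "tiling positions problem" from the quote above:
# on a perfectly spherical Earth with no atmosphere, a cannonball fired
# horizontally stays in a circular orbit when gravity supplies exactly
# the centripetal acceleration:  G*M/R**2 == v**2/R.
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24   # mass of Earth, kg (standard textbook value)
R = 6.371e6    # mean radius of Earth, m (standard textbook value)

v = math.sqrt(G * M / R)  # required instantaneous orbital speed
print(f"circular-orbit speed at the surface: {v / 1000:.1f} km/s")  # ≈ 7.9
```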
Hence, doing Agent Foundations work that isn’t directly about working with machine learning systems, but is mostly about abstractions of ideal agents, etc.
Similarly, I think that creating and enforcing a global ban on gain of function research captures much of the hard part of causing the world to coordinate not to build AGI. It is an easier task, but you will encounter many of the same blockers, and need to solve many of the same sub-problems.
Creating a global ban on gain of function research : coordinating the world not to build AGI :: solving the tiling agents problem : solving the whole alignment problem.
* For instance, in this paragraph, from here:
It’s sad that our Earth couldn’t be one of the more dignified planets that makes a real effort, correctly pinpointing the actual real difficult problems and then allocating thousands of the sort of brilliant kids that our Earth steers into wasting their lives on theoretical physics. But better MIRI’s effort than nothing. What were we supposed to do instead, pick easy irrelevant fake problems that we could make an illusion of progress on, and have nobody out of the human species even try to solve the hard scary real problems, until everybody just fell over dead?
Creating a global ban on gain of function research : coordinating the world not to build AGI :: solving the tiling agents problem : solving the whole alignment problem.
I don’t know when MIRI started working on Tiling Agents, but the paper was published in 2013. In retrospect, it seems like we would not have wanted people to wait that long to work on alignment. And it’s especially problematic now that timelines are shorter.
I mean, assume a coordinated effort to ban gain-of-function research succeeds eight years from now; even if we then agree that policy is the way to go, it may be too late.
In retrospect, it seems like we would not have wanted people to wait that long to work on alignment
I don’t buy this characterization. This might sound at odds with my comment above, but working on tiling agents was an attempt at solving alignment, not deferring solving alignment.
The way you solve a thorny, messy, real-world technical problem is to first solve an easier problem with simplified assumptions, and then gradually add in more complexity.
I agree that this analogizes less tightly to the political action case, because solving the problem of putting a ban on gain of function research is not a strictly necessary step for creating a ban on AI, the way solving the tiling agents problem is (or at least seemed at the time to be) a necessary step for solving alignment.
I don’t buy this characterization. This might sound at odds with my comment above, but working on tiling agents was an attempt at solving alignment, not deferring solving alignment.
I totally agree. My point was not that tiling agents isn’t alignment research (it definitely is), it’s that the rest of the community wasn’t waiting for that success to start doing stuff.