Eliezer thinks that in the alternate world where this is true, GANs pretty much worked the first time they were tried.
Note that GANs did in fact pretty much work the first time they were tried, at least according to Ian Goodfellow's telling, in the strong sense that he had them working on the same night that he came up with the idea over drinks. (That wasn't a journalist editorializing; that's the story as he tells it.)
GANs seem to be unstable in just about the ways you'd expect them to be unstable on paper; we don't have to posit any magical things-are-hard regularity.
This doesn't feel very important to my broader position. I'm totally comfortable with needing to do a lot of tinkering to get stuff working, as long as that work (a) doesn't increase linearly with the cost of your AI project and (b) can be done in parallel with AI scaling up rather than needing to be done at the very end.
There seems to be some basic difference in the way you are thinking about these terms. I'm not sure what you mean by Project Chaos and Software Despair in this case; it seems to me like it would be fine if our experience with alignment were similar to our experience with GANs.
A very important aspect of my objection to Paul here is that I don't expect weird complicated ideas about recursion to work on the first try.
They don't have to work on the first try. We get to try a whole bunch of stuff in advance to get them working, to do tons of experiments and build tons of scaled-down systems for which failure is not catastrophic. The thing that I'm aiming for is that the effort of continuing to scale up our alignment techniques as AI improves (a) is small compared to the effort of scaling up our AI, and (b) can be done in parallel with scaling up our AI.
From my perspective, your position is like saying “If you want to build crypto systems that stand up to eavesdroppers with a lot of computational power, then you are going to need to do a lot of extra work.”
My position is like saying “We’ll try to write a library that can do cryptography with arbitrary security parameters. It will take some time to get the library working at all, and then a bunch of extra work the first few times we try to scale it up because we won’t have gotten everything right. But at some point it will actually work. After that, as computers get faster, we’ll just run the same algorithms with bigger and bigger security parameters, and so our communication will remain secure without significant ongoing work.”
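To make the crypto analogy concrete, here's a minimal sketch of what "scaling by turning up a parameter" looks like: the same code, asked for a larger security parameter, gives you a stronger system with no redesign. (This assumes the third-party Python `cryptography` package; the `make_channel` helper and the choice of RSA key size as the security parameter are purely illustrative, not a claim about how the alignment analogue would work.)

```python
# Minimal illustration of "same algorithm, bigger security parameter".
# Assumes the third-party `cryptography` package (pip install cryptography);
# the make_channel helper is hypothetical and only for illustration.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa


def make_channel(security_parameter_bits):
    """Return (encrypt, decrypt) functions backed by an RSA key of the given size."""
    key = rsa.generate_private_key(
        public_exponent=65537, key_size=security_parameter_bits
    )
    oaep = padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    )
    encrypt = lambda plaintext: key.public_key().encrypt(plaintext, oaep)
    decrypt = lambda ciphertext: key.decrypt(ciphertext, oaep)
    return encrypt, decrypt


# "Scaling up" is just passing a larger parameter; the calling code never changes.
for bits in (2048, 3072, 4096):
    encrypt, decrypt = make_channel(bits)
    assert decrypt(encrypt(b"hello")) == b"hello"
```

The point of the sketch is only that, once the library works at all, the ongoing cost of keeping up with more powerful eavesdroppers is turning a knob rather than redoing the engineering.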
It seems clear to me that some kinds of scaleup involve a whole bunch of extra work, and others don’t. Lots of algorithms actually work, and they keep working even if you run them on bigger and bigger inputs. I’ve tried to make arguments for why AI alignment may be more like an algorithmic or conceptually clean task, where we can hope to have a solid solution that scales with AI capabilities. You keep saying that can’t happen and pointing to analogies that don’t seem convincing to me, but it doesn’t feel like you are engaging with the basic argument here.
A bit more quantitatively, I think I’m arguing “>1/3 chance that AI alignment is in the class of tasks that scale well” and you are arguing “>90% chance it isn’t.”
Also note that even though this is a clear disagreement between us, I don't think it's a crux for the biggest-picture disagreements. I also put significant probability on needing lots of ongoing ad hoc work, so I'm very interested both in institutional arrangements that make that feasible and in doing all of the preparatory research we can to make it easier. If you convinced me 100% on this point, I'd still be pretty far from thinking MIRI's public position is the right response. (And conversely, if you could convince me that MIRI's public position is sensible conditioned on this pragmatic pessimistic view, then I have enough probability on the pessimistic view that I'd be basically convinced MIRI's position is sensible.)