I am an engineer. That means, if I see a problem, I run towards the fire and try to figure out how to put it out, and then ensure it doesn’t start again. (Plus, for this particular fire, Mars probably isn’t far enough to run if you were trying to run away from it.) The fire in AI is coming from scaling labs rapidly increasing capabilities, and the possibility that they might continue to do so at a rate faster than alignment can keep up. All of the major labs are headed by people who appear, from their thoughtful public statements, to be aware of and fairly cautious about AI x-risk (admittedly less cautious than Eliezer Yudkowsky, but more so than, say, Yann LeCun, let alone Marc Andreessen) — to the point that they have gone to some lengths to get world governments to regulate their industry (not normal behavior for captains of industry, and the people dismissing this as “regulatory capture” haven’t explained why industry would go out of its way to get regulated in the first place: regulatory capture is a process that gets attempted after a robust regulatory regime already exists).
I think there is a shortage of “what success looks like” stories on LW about solving alignment. We need to a) solve alignment (which is almost certainly easier with experimental access to large amounts of compute and to not-fully-aligned versions of frontier models than without these), b) get the solution adopted by all the frontier labs, and c) ensure that no-one else later builds an equally powerful unaligned model. I’m dubious that a) can be done by a small group of people in Berkeley or London carefully not talking to anyone else, but I’m very certain that b) and c) can’t.