It’s not about building less useful technology; that’s not what Abram or Ryan are talking about (I assume). The field of alignment has always been about strongly superhuman agents. You can have tech that is both useful and safe to use; there’s no direct contradiction here.
Maybe one weak-ish historical analogy is explosives? Some explosives are unstable, and will easily explode by accident. Some are extremely stable, and can only be set off by a detonator. Early in the industrial chemistry tech tree, you only have access to one or two ways to make explosives. If you’re desperate, you use these whether or not they are stable, because the risk-usefulness tradeoff is worth it. A bunch of your soldiers will die, and your weapons caches will be easier to destroy, but that’s a cost you might be willing to pay. As your industrial chemistry tech advances, you invent many different types of explosive, and among these choices you find ones that are both stable and effective, because obviously this is better in every way.
Maybe another is medications? As medicine advanced, and we gained choice and specificity in treatments, we could choose medications that were both effective and had low side effects. Before that, there was often a tradeoff, and the correct choice was often not to use the medicine at all unless you were literally dying.
In both these examples, sometimes the safety-usefulness tradeoff was worth it, sometimes not. Presumably, in both cases, people often made the choice not to use unsafe explosives or unsafe medicine, because the risk wasn’t worth it.
As it is with these technologies, so it is with AGI. There are a bunch of future paradigms of AGI building. The first one we stumble into doesn’t look like one where we can precisely specify what the AGI wants. But if we were able to keep experimenting, understanding, and iterating after the first AGI, and we gradually developed dozens of ways of building AGI, then I’m confident we could find one that is just as intelligent and whose goals could also be precisely specified.
My two examples above don’t quite answer your question, because “humanity” didn’t steer away from using them, just individual people at particular times. For examples where all or large sections of humanity steered away from using extremely useful technologies whose risks purportedly outweighed their benefits: Project Plowshare, nuclear power in some countries, GMO food in some countries, viral bioweapons (as far as I know), eugenics, stem cell research, cloning. Also {CFCs, asbestos, leaded petrol, CO2 to some extent, radium, cocaine, heroin} after the negative externalities were well known.
I guess my point is that safety-usefulness tradeoffs are everywhere, and tech development choices that take risks into account are made all the time. To me, this makes your question utterly confused. Building technology that actually does what you want (which is to be both safe and useful) is just standard practice. This is what everyone does, all the time, because obviously safety is one of the design requirements of whatever you’re building.
The main difference between the above technologies and AGI is that AGI is a trapdoor: the cost of messing up AGI is that you lose any chance to try again. AGI does share an epistemic problem with some of the above technologies: for many of them, it isn’t clear in advance, to most people, how much risk there actually is, and therefore whether the tradeoff is worth it.
After writing this, it occurred to me that maybe by “competitive” you meant “earlier in the tech tree”? I interpreted it in my comment as a synonym of “useful” in a sense that excluded safe-to-use.