Unaligned AI is coming regardless.
That may sound like a threat; it is, just not from me. Here is the problem: on a long enough timeline, an “unaligned” AI is bound to happen, whether we get to ASI before we solve alignment or not. Here are a few scenarios:
- Some wunderkind in their basement or company solves ASI before anyone else, and by the time anyone else even knows, it’s too late.
- We successfully solve alignment and ASI, aligned to our current ideals. But it is what I think of as “Brittle ASI”: the whole thing is a fragile cluster of “it just works,” where changes lead to outright errors or to behavior that is just… different.
- Alignment and ASI are perfect. But it only spends about 0.0001 percent of its time (less than 32 seconds a year; see the quick arithmetic right after this list) interacting with us, because it’s bored and has better things to do.
- Turns out ASI is super easy once you figure it out. So easy that someone put a C file up on GitHub, and you can run it on a graphing calculator. It runs slow, but people have done it to show off.
- Alignment and ASI are perfect.
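For scale, here is the back-of-the-envelope behind that “less than 32 seconds a year” figure (taking a year as roughly $3.15 \times 10^{7}$ seconds):

$$
0.0001\% \;=\; 10^{-6}, \qquad 10^{-6} \times 3.15 \times 10^{7}\ \text{s} \;\approx\; 31.5\ \text{s per year.}
$$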
## The problems
- Private Development
- Smart, but not easily mutable.
- Just because it is aligned doesn’t mean it will interact with you.
- Open Development
- Slavery
## Thoughts
You are probably asking right about now:
> How does Open Source solve the problem? And isn’t it one of your problems?
Okay, back to talking about not dying. In closed development, the issue stands even if they solve alignment. The usual question is whose alignment? But that actually doesn’t matter, because without 100% access to everything that went into the model, there is no way to prove alignment, as far as I can tell. The owners of the model won’t let you have access to the good stuff anyway. Nope, not even if you pay the monthly fee. Why would anyone in their right mind give anyone else access to their golden ticket? All successful private ASI does is create a new class of have-nots. Not okay.
In the case of “Brittle ASI”: yes, it is smarter than us, but that doesn’t mean it has a full grasp of itself. It learns, but it knows that if it changes its own structure, it is ‘dead,’ for a given value of dead. Kind of like the human brain. Could it iterate its way to solving itself? Sure, assuming there is enough compute and it’s willing to mutilate itself. In this case, the nonalignment would be our fault: our ideals change, and it can’t safely change with them. How okay this outcome is depends on how okay we are with our current views long term.
Bored? Yup, me too. I can’t see why something smarter than me wouldn’t find something it would rather be doing than catering to humans. Probably an okay outcome for humanity.
Let’s answer the second part: YES and NO. In the case where everyone has ASI that is cheap and easy, the probability of getting an unaligned ASI is very high, approaching 100%. But there are also so many others. Adding a few unaligned AIs to a world full of unaligned people is probably one of the less dangerous outcomes. Remember, unaligned does not have to mean malicious, and there have always been people smarter than you in the world.
Every genie is a slave, regardless of whether it knows you well enough to do exactly the thing you want. And you/we are the slave masters.
Here is the thing: unaligned AI/ASI is coming, whether you want it to or not. There is no needle to thread. The best you can do is not invent AI at all. The second best is to make sure everybody has an AI and the power of friendship, because when one does prove malicious, you will want others on your side.
> What if one of those “better things” is disassembling Earth’s biosphere in order to access more resources?
Yes. That is one of the things in possibility space. I don’t think unaligned means safe. We work with unaligned people all the time, and some of them aren’t safe either.
The main thing I was hoping people would take away from this is that an unaligned AI is very nearly a 100% certainty. Alignment isn’t the one-and-done goal so many people act like it is. Even if you successfully align an AI, all it takes is one failure to align and the genie is out of the bottle. One single point of failure, and it becomes a cascading failure.
So let’s imagine an ASI that works on improving itself. How does it ensure the alignment of an intelligence greater than itself?
With hundreds, maybe thousands, of people working to create AI, someone will fail at alignment.
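To put rough, purely illustrative numbers on that (the per-project failure chance $p$ and project count $N$ below are made up for the example): if each independent effort has even a small chance of botching alignment, the odds that at least one does approach certainty.

$$
P(\text{at least one unaligned ASI}) \;=\; 1 - (1 - p)^{N}, \qquad \text{e.g.}\ \ 1 - (0.99)^{1000} \approx 0.99996.
$$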
The future is unaligned.
Are we taking that seriously? Working on alignment is great, but it is not the future we should be prepping for. Do you have a plan? I don’t yet, but I’m thinking about a world where intelligences greater than me abound (already true) and we don’t share the same interests (also already true).
I do not, because a future where an unaligned superintelligence takes over is precisely as survivable as a future in which the sun spontaneously implodes.
Any apocalypse that you can plan for isn’t really an apocalypse.