Yeah, they work well enough at this (~human) level. But no current alignment techniques are scalable to superhuman AI. I’m worried that basically all of the doom flows through an asymptote of imperfect alignment. I can’t see how this doesn’t happen, short of some “miracle”.
the tiniest advantage compounds until one party has an overwhelming lead.
This, but x1000 to what you are thinking. I don’t think we have any realistic chance of approximate parity between the first and second movers. The speed at which the first mover will be thinking makes this so. Say GPT-6 is smarter at everything, even by a little bit, compared to everything else on the planet (humans, other AIs). It’s copied itself 1000 times, and each copy is thinking 10,000,000 times faster than a human. We will essentially be like rocks to it, operating on geological time periods. It can work out how to disassemble our environment (including an unfathomable number of contingencies against counter-strike) over subjective decades or centuries of human-equivalent thinking time before your sentinel AI protectors even pick up its activity.
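To make the speed gap concrete, here is a rough back-of-envelope sketch. The copy count and per-copy speedup are the illustrative numbers from the comment above, not measurements: at a 10,000,000x speedup, a single copy racks up subjective decades within a minute or two of wall-clock time and centuries within the hour.

```python
# Back-of-envelope sketch of the speed gap described above.
# COPIES and SPEEDUP are the illustrative numbers from the comment,
# not measured figures.

COPIES = 1_000
SPEEDUP = 10_000_000  # subjective seconds per wall-clock second, per copy

MINUTES_PER_YEAR = 365.25 * 24 * 60


def subjective_years(wall_clock_minutes: float) -> float:
    """Subjective thinking time (in years) a single copy accumulates."""
    return wall_clock_minutes * SPEEDUP / MINUTES_PER_YEAR


for minutes in (1, 5, 60):
    per_copy = subjective_years(minutes)
    print(f"{minutes:>3} real minute(s): ~{per_copy:,.0f} subjective years per copy, "
          f"~{per_copy * COPIES:,.0f} copy-years across {COPIES} copies")
```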
I also think that compared with other AIs, LLMs may have more potential for being raised friendly and collaborative, as we can interact with them the way we do with humans, reusing known recipes. Compared with other forms of extremely large neural nets and machine learning, they are more transparent and accessible. Of all the routes to AGI we could take, I think this might be one of the better ones.
This is an illusion. We are prone to anthropomorphise chatbots. Under the hood they are completely alien: Lovecraftian monsters, only made of tons of inscrutable linear algebra. We are facing a digital alien invasion that will ultimately move at speeds we can’t begin to keep up with.
Ultimately, it doesn’t matter which monkey gets the poison banana. We’re all dead either way. This is much worse than nukes, in that we really can’t risk even one (intelligence) explosion.
We can but hope they will see sense (as will the US government—and it’s worth considering that in hindsight, maybe they were actually the baddies when it came to nuclear escalation). There is an iceberg on the horizon. It’s not the time to be fighting over revenue from deckchair rentals, or who gets to specify their arrangement. There’s geopolitical recklessness, and there’s suicide. Putin and Xi aren’t suicidal.
Look, I agree re “negative of entropy, aging, dictators killing us eventually”, and a chance of positive outcome, but right now I think the balance is approximately like the above payoff matrix over the next 5-10 years, without a global moratorium (i.e. the positive outcome is very unlikely unless we take a decade or two to pause and think/work on alignment). I’d love to live in something akin to Iain M Banks’s Culture, but we need to get through this acute risk period first to stand any chance of that.
Do you think Drexler’s CAIS is straightforwardly controllable? Why? What’s to stop it being amalgamated into more powerful, less controllable systems? “People” don’t need to make them globally agentic. That can happen automatically via Basic AI Drives and Mesaoptimisation once thresholds in optimisation power are reached.
I’m worried that actually, Alignment might well turn out to be impossible. Maybe a moratorium will allow for such impossibility proofs to be established. What then?
Except here it’s as if the risk of igniting the atmosphere with the Trinity test had been judged to be ~10%. It’s not “you slow down, and let us win”, it’s “we all slow down, or we all die”. This is not a Prisoner’s Dilemma.
From the GPT-4 announcement: “We’ve also been using GPT-4 internally, with great impact on functions like support, sales, content moderation, and programming.” (and I’m making the reasonable assumption that they will naturally be working on GPT-5 after GPT-4).
I think we are already too close for comfort to x-risky systems. GPT-4 is being used to speed up development of GPT-5 already. If GPT-5 can make GPT-6, that’s game over. How confident are you that this couldn’t happen?
GPT-4 was rushed out, as was the OpenAI plugin store. Things are moving far too fast for comfort, so I think we can forgive this response for being rushed too. It’s good to have some significant opposition working on the brakes of the runaway existential catastrophe train that we’ve all been put on.
Why do you think it only applies to the US? It applies to the whole world. It says “all AI labs”, and “governments”. I hope the top signatories are reaching out to labs in China and other countries. And the UN for that matter. There’s no reason why they wouldn’t also agree. We need a global moratorium on AGI.
Here’s a (failure?) mode that I and others are already in, but might be too embarrassed to write about: taking weird career/financial risks, in order to obtain the financial security, to work on alignment full-time...
I’d be more glad if I saw non-academic noob-friendly programs that pay people, with little legible evidence of their abilities, to upskill full-time. CEEALAR offers this (free accommodation and food, and a moderate stipend), and was set up to avoid the failure mode mentioned (not just for alignment, but for EA in general).
This is very cool! For archiving and rebuilding after a global catastrophe, how easy would this be to port to Kiwix for reading on a phone? My thinking is that if a few hundred LWers/EAs have this offline on their phones, that could go quite a long way. Burying phones with it on them could also be good low-hanging fruit (ideally a way of reading the data should be stored with the data). Happy to fund this if anyone wants to do it.
No, I mean links to him personally, to talk to him (or for that matter, even an email address or any way of contacting him...).
Oh wow, didn’t realise how recent the Huawei recruitment of Fields Medalists was! This from today. Maybe we need to convince Huawei to care about AGI Alignment :)
Should also say—good that you are thinking about it P., and thanks for a couple of the links which I hadn’t seen before.
Maybe reaching Demis Hassabis first is the way to go though, given that he’s already thinking about it, and has already mentioned it to Tao (according to the podcast). Does anyone have links to Demis? Would be good to know more about his “Avengers assemble” plan! The main thing is that the assembly needs to happen asap, at least for an initial meeting and “priming of the pump” as it were.
Yes, I think the email needs to come from someone with a lot of clout (e.g. a top academic, or a charismatic billionaire; or even a high-ranking government official) if we actually want him to read it and take it seriously.
Here’s a list, mostly from just the last few months, that is pretty scary: DeepMind’s Gato, Chinchilla, Flamingo and AlphaCode; Google’s Pathways, PaLM, SayCan, Socratic Models and TPUs; OpenAI’s DALL-E 2; EfficientZero; Cerebras.
Selection pressure will cause models to become agentic as they increase in power: those doing agentic things (following universal instrumental goals like accumulating more resources and self-improvement) will outperform those that don’t. Mesaoptimisation (explainer video) is kind of like cheating: models that create inner optimisers targeting something easier to get than what we meant will be selected (by getting higher rewards) over models that don’t (because we won’t be aware of the inner misalignment). Evolution is a case in point: we are products of it, yet misaligned to its goals (we want sex, and high-calorie foods, and money, rather than caring explicitly about inclusive genetic fitness). Without alignment being 100% watertight, powerful AIs will have completely alien goals.
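As a purely illustrative toy (my own sketch, not anything from the mesa-optimisation literature): a tiny simulation where candidates are scored on a proxy that can be gamed, so selection drifts the population toward proxy-gamers even though that hurts the intended goal, simply because the selection signal can’t see the difference.

```python
import random

# Toy illustration: candidates are scored on a PROXY metric that correlates
# imperfectly with the INTENDED goal. Selection keeps the top proxy-scorers,
# so over generations the population drifts toward proxy-gamers.

random.seed(0)


def proxy_score(gaming: float) -> float:
    """Measured reward: gaming the proxy reliably boosts it."""
    return 1.0 + 2.0 * gaming + random.gauss(0, 0.1)


def intended_score(gaming: float) -> float:
    """What we actually wanted: gaming the proxy doesn't help (it hurts)."""
    return 1.0 - gaming + random.gauss(0, 0.1)


# Start with a population of candidates with a low "proxy-gaming" tendency.
population = [random.uniform(0.0, 0.2) for _ in range(100)]

for generation in range(20):
    # Select the half that scores best on the *measured* proxy...
    survivors = sorted(population, key=proxy_score, reverse=True)[:50]
    # ...and refill the population with slightly mutated copies.
    population = [max(0.0, min(1.0, g + random.gauss(0, 0.05)))
                  for g in survivors for _ in (0, 1)]

avg_gaming = sum(population) / len(population)
avg_intended = sum(intended_score(g) for g in population) / len(population)
print(f"Average proxy-gaming tendency after selection: {avg_gaming:.2f}")
print(f"Average intended-goal score: {avg_intended:.2f}")
```

Run it and the average gaming tendency climbs toward 1 while the intended-goal score falls: the “selected for targeting something easier than what we meant” dynamic in miniature.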