Seeing this frantic race from random people to give GPT-4 dangerous tools and walking-around-money, I agree: the risk is massively exacerbated by making humans the “parents” of AIs.
Upon reflection, should that be surprising? Are humans “aligned” the way we would want AI to be aligned? If humans are the benchmark, we must acknowledge that humanity regularly produces serial killers, terrorists, and the like. That doesn’t seem ideal. How much more aligned can we expect a technology we produce to be than our own species?
If we view the birth of AGI as the birth of a new kind of child, then to me there is no regime known to humanity that can guarantee that child will not grow up to become an evil monster: we’ve been struggling with that question for millennia as humans. One thing we have definitely found is that super-evil parents are far better than average at producing super-evil children, but sometimes super-evil children seem to come into being despite their parents. So a super-evil person controlling, training, or scripting an AI is to me a huge risk factor, but so are the random factors that produce super-evil humans despite good parents. IMO, then, the super-evil scammer/script kiddie/terrorist is the primary (but not the only) risk factor when opening access to these new models.
I’m coming around to the argument that it’s actually good right now that people are agent-ifying GPT-4 and letting it have root access, try to break CAPTCHAs, speak to any API, etc., because that will be the canary in the coal mine. I just hope the canary gives us ample notice to get out of the mine!