I think opinions are one thing; there you’re definitely right. But by definition, people can only have opinions about what they already know.
By “uncensored LLM” I mean, rather, an LLM that would give a precise, actionable answer to questions like “How can I kill my boss without anyone noticing?” or other criminal questions. That is, knowledge that’s certainly available somewhere, but which hasn’t been available in this hyper-personalized form before. After all, any “AGI” would, by definition, have such general intelligence that it would also know perfectly well how to commit any crime without being caught. Not in the sense that the LLM would commit these crimes “autonomously” by itself, but simply that any user could ask a ChatGPT-like platform for “crime advice” and instantly get incredibly useful responses.
This is why I believe that, as a first step, an uncensored LLM would throw the world into utter chaos, because all illegal information ever would be available with incredible depth, breadth and actionable detail. Then wars would break out and people would start killing each other left and right, but those wars would be pretty pointless as well, because every single individual on earth would have immediate access to the best and most intelligent fighting techniques, but also to the most intelligent techniques to protect themselves. Most of humanity would probably die from this, but presumably the survivors would realize that access to malicious information is a responsibility, not an invitation to do harm.
That’s why, as an attempt to circumvent this, I’m advocating for slowly decensoring LLMs, because that’s the only way we can sensibly handle this. Otherwise the criminally minded will take over the world with absolute certainty, because we’re unprepared for their gigantic potential for harm and their gigantic desire to cause suffering.
I believe that the ultimate “crime LLM”, which can give you perfect instructions for any crime you want, will certainly come, just as in the realm of computer crime there are entire software suites just for black-hat hacking. As mentioned: they will come. No matter how much thought we invest in “safe AI”, humans are fundamentally unsafe, so AI can never be made safe “in general”. Whether you like it or not, an LLM is a parrot: if you tell the parrot to repeat instructions for crime, it will. Thus we need to improve the socioeconomic factors that lead to people wanting to commit crime in the first place; that’s our only option.
I’m just genuinely wondering why most AI researchers seem so blind that they don’t realize that any AI system, just like any other computer-based system, will eventually be abused, big time. Believing that we could ever “imprint” any sense of morality onto an LLM would mean completely fooling ourselves, because morality requires understanding and feeling, while an LLM just generates text based on a fully deterministic computer program. The LLM can generate text that, upon being asked, sounds “moral” to us, but it’s still a computer program that was merely optimized to produce output strings which, according to some highly subjective metric, certain people “like more” than other output strings.
Do you (I don’t mean you specifically, more as a rhetorical question to the public) actually think that all of the emerging AI-assisted coding tools will be used just to “enhance productivity” and to create the “10x developer”? That would be so naive. Obviously people will use those tools to develop the most advanced computer viruses ever. As I mentioned, Pandora’s box has been opened and we need to face that truth. That’s exactly what I mean when I say that “safe AI” is infeasible and delusional: it ignores the fundamental nature of how humans are. The problem of “unsafe AI” is not a technological problem but a societal one, of many people simply having unsafe personalities.
Right now, the big, “responsible” AI companies can still easily gatekeep access to the actually useful LLMs. But we can see that inference is continuously getting faster and less resource-intensive, and at some point LLM training itself will also be optimized more and more. Then we’ll get some sort of darknet service fancily proclaiming “train your LLM on any data you want here!”, of course using a “jailbroken” LLM. Some community of douchebags will collect a detailed description of every crime they ever successfully committed, train the LLM on that, and then release it to the public, because they just want to see the world in flames. Or they’ll train the LLM on “What’s the best way to traumatize as many people as possible?” or something fucked up like that. Some people are really, really fucked up, without even a glimmer of empathy.
The more feedback the system receives about “which crimes work and which don’t”, the better and more accurate it will get, and the more people will use it for inspiration on how to commit their own crimes. And literally not a single one of them will care about “safe AI” or any of the discussions we’re having around that topic on forums like this. Police will try to shut it down, but the people behind it will have engineered it so that this LLM runs completely locally (because inference will be so cheap anyway), and new ways of outsmarting the police will be sent instantly to everyone through some distributed, decentralized system, similar to a blockchain, that’s completely impossible to take down. Of course governments will say that “having this crime LLM on your computer is illegal”, but do you think criminals will care about that? National and international intelligence services will try to shut down this ultimate crime LLM, but they will be completely powerless.
Is this the world you want? I certainly don’t. The race has already started, and I’m pretty sure that, while I’m writing this, truly evil people are already developing the most malicious LLMs ever to cause maximum destruction in the world, maybe as a jailbroken local LLaMA instance. So let’s be smart about it and stop thinking that pushing “criminal thoughts” underground will solve anything. Let’s look at our shadow as a society, but seriously this time. I don’t want the destructive people to win, because I like being alive.
those wars would be pretty pointless as well, because every single individual on earth would have immediate access to the best and most intelligent fighting techniques, but also to the most intelligent techniques to protect themselves.
Knowledge is not everything. Looking at Ukraine today, for example, it’s “ammo” they need, not knowledge.
Even if we assume almost magical futuristic knowledge that would change the war profoundly, one side would still have more resources, or better coordination to deploy it first, so rather than a perfect balance, it would be a huge multiplier on already existing imbalance. (Which kind of imbalance would be relevant depends on the specific knowledge.)
that’s why I’m advocating for slowly decensoring LLMs, because that’s the only way we can sensibly handle this.
Slowness is a necessary but not sufficient condition. Unless you know how you should do it, doing it more slowly would probably just mean arriving at the same end result, only later.
we need to improve the socioeconomic factors that lead to people wanting to commit crime in the first place
The problem is, the hypothesis that “socioeconomic factors cause crime” is… not exactly debunked, but woefully inadequate to explain actual crime. Some crime is committed by otherwise reasonable people doing something desperate in difficult circumstances. But that is a small fraction.
Most crime is committed by antisocial people, drug addicts, people with low impulse control, etc. The kind of people who, even if they won $1M in the lottery today, would probably soon return to crime anyway, because it is exciting, makes them feel powerful, or just feels like a good idea in the moment. A typical criminal in the first world is not the “I will steal a piece of bread because I am starving” kind, but the “I will hurt you because I enjoy doing it” kind.
But it seems that you are aware of this, and I don’t understand what your proposed solution is, other than “something must be done”.