As I think more about this, the LLM as a collaborator alone might have a major impact. Just off the top of my head, a kind of Rube Goldberg attack might be <redacted for info hazard>. Thinking about it in isolation, someone might never consider carrying something like that out. Again, I am trying to model the type of person who carries out a real attack, and I don’t estimate that person as having above-average levels of self-confidence. I suspect the default is to doubt themselves enough to avoid acting, in the same way most people do about their entrepreneurial ideas.
However, if they either presented it to an LLM for refinement, or if the LLM suggested it, there could be just enough of a psychological boost of validation to push them over the edge into trying it. And after a few news stories about “dumb” or “bizarre” or “innovative” attacks succeeding because of “AI telling these people how to do it,” the effect might get even stronger.
To my knowledge, one could have bought an AR-15 since the mid-to-late 1970s. My cousin has a Colt from 1981 he bought when he was 19. Yet people weren’t mass shooting each other, even during times when the overall crime/murder rate was higher than it is now. Some confluence of factors has driven the surge, one of them probably being a strong meme: “Oh, this actually tends to work.” Basically, a type of social proofing of efficacy.
And I am willing to bet $100 that the media will report big on the first few cases of “Weird Attacks Designed by AI.”
It seems obvious to me that the biggest problems in alignment are going to be the humans, both long before the robots, and probably long after.