I disagree that people who do ML daily would be in a good position to judge the risks here. The key issue is not the capabilities of AI, but rather the level of vulnerability of the brain. Since they don’t study that, they can’t judge it.
It is like how scientists proved to be terrible at unmasking charlatans like Uri Geller. Nature doesn’t actively try to fool us, charlatans do. The people with actual relevant expertise were people who studied how people can be fooled. Which meant magicians like James Randi. Similarly, to judge this risk, I think you should look at how dictators, cult leaders, and MLM companies operate.
A century ago, Benito Mussolini figured out how to use mass media to control the minds of a mass audience. He used this to generate a mass following and become dictator of Italy. The same vulnerabilities, exploited the same way, have been a staple for demagogues and would-be dictators ever since. But human brains haven't been updated. And so Donald Trump has managed to use the same basic rootkit to amass about 70 million devoted followers. As we near the end of 2023, he still has a chance of successfully overthrowing our democracy if he can avoid jail.
Your thinking about zero days is a demonstration of how thinking in terms of computers can mislead you. What matters for an attack is the availability of vulnerable potential victims. In computers there is a correlation between novelty and availability. Before anyone knows about a vulnerability, everyone is available for your attack. Then it is discovered, a patch is created, and availability goes down as people update. But humans don't simply upgrade to brain 2.1.8 to fix the vulnerabilities found in brain 2.1.7. People can be brainwashed today by the same techniques that the CIA was studying when they funded the Reverend Sun Myung Moon back in the 1960s.
You do make an excellent point about the difficulty of building something that can work at scale in the real world. Which is why I focused my scenario on techniques that have worked, repeatedly, at scale. We know that they can work, because they have worked. We see it in operation whenever we study the propaganda techniques used by dictators like Putin.
Given these examples, the question stops being an abstract "Can AI find vulnerabilities by which we can be exploited?" It switches to "Is AI capable of executing effective variants on the strategies that dictators, cult leaders, and MLM founders have already shown to work at scale against human minds?"
I think that the answer is a pretty clear yes. Properly directed, ChatGPT should be more than capable of doing this. We then have the hallmark of a promising technology: we know that nothing fundamentally new is required. It is just a question of execution.
My thinking about this (and that of other people like Tristan Harris, who can't think about superintelligence) is that the big difference is that persuasion, as a science, is being amplified by orders of magnitude beyond what the 20th century saw.
As a result, the AI safety community is at risk of getting blindsided by manipulation strategies that we’re vulnerable to because we don’t recognize them.
I don't imagine CFAR's founders as being particularly vulnerable to clown attacks, for example, but they would also fail to notice clown attacks being repeatedly tested against them; so it stands to reason that today's AI would be able to locate something that would both work on them AND prevent them from noticing, if it had enough social media scrolling data to find novel strategies based on results.
I'm less interested in the mass psychology stuff from the 2020s, because a lot of that was meant to target elites who influenced more people downstream, and elites are now harder to fool than in the 20th century; and also, if democracy dies, then it dies, and it's up to us not to die with it. One of the big issues with AI targeting people based on Bayes-predicted genes is that it can find one-shot strategies, including selecting the 20th-century tactics with the best odds of success.
This is why I think that psychology is critical, especially for interpreting causal data. But we also shouldn't expect things to be too similar to the 20th century, because it's a new dimension, and the 21st century is out-of-distribution (OOD) anyway; OOD is similar to the butterfly effect, where changes cause a cascade of other changes.
With all due respect, I see no evidence that elites are harder to fool now than they were in the past. For concrete examples, look at the ones who flipped to Trump over several years. The Corruption of Lindsey Graham gives an especially clear portrayal of how one elite went from condemning Trump to becoming a die-hard supporter.
I dislike a lot about Mr. Graham. But there is no question that he was smart and well aware of how authoritarians gain power. He saw the risk posed by Trump very clearly. However, he knew himself to be smart and thought he could ride the tiger. Instead, his mind got eaten.
Moving on, I believe that you are underestimating the mass psychology stuff. Remember, I'm suggesting it as a floor for what could already be done. New capabilities and discoveries allow us to do more. But what should already be possible is scary enough.
However, that is a big topic. I went into it in AI as Super-Demagogue, which you will hopefully find interesting.
I think that it’s generally really hard to get a good sense of what’s going on when it comes to politicians, because so much of what they do is intended to make a persona believable and disguise the fact that most of the policymaking happens elsewhere.
That’s right, the whole point of the situation is that everything I’ve suggested is largely just a floor for what’s possible, and in order to know the current state of the limitations, you need to have the actual data sets + watch the innovation as it happens. Hence why the minimum precautions are so important.
I’ll read this tomorrow or the day after, this research area has tons of low-hanging fruit and few people looking into it.