The problem outlined in this post sits at the intersection of two major concerns on LessWrong: risks from advanced AI systems and irrationality due to parasitic memes.
It presents the problem of persuasion tools as continuous with the problems humanity has had with virulent ideologies and sticky memes, exacerbated by the increasing capability of narrowly intelligent machine learning systems to exploit biases in human thought. It provides (but doesn't explore) two examples from history to support its hypothesis: the printing press as a partial cause of the Thirty Years' War, and the radio as a partial cause of 20th-century totalitarianism.
Those two concerns especially reminded me of Is Clickbait Destroying Our General Intelligence? (Eliezer Yudkowsky, 2018), which could be situated in this series of events:
"I suspect some culturally transmitted parts of the general intelligence software got damaged by radio, television, and the Internet, with a key causal step being an increased hypercompetition of ideas compared to earlier years."
Kokotajlo also briefly considers the hypothesis that epistemic conditions might have become better through the internet, but rejects it (for reasons that are not spelled out; the answers to Have epistemic conditions always been this bad? (Wei Dai, 2021) might be illuminating). Survivorship bias probably plays a large role here: epistemically unsound information is less likely to survive long-term trials for truth, especially in an environment where memes on the less truth-oriented side of the spectrum face harsher competition than memes on the more truth-oriented side.
This post was written a year ago, and didn’t make any concrete predictions (for a vignette of the future by the author, see What 2026 looks like (Daniel’s Median Future) (Daniel Kokotajlo, 2021)). My personal implied predictions under this worldview are something like this:
A large number (>100 million) of people in the Western world (USA & EU) will interact with chatbots on a regular basis (e.g. more than once a week).
I think this isn’t yet the case: I’ve encountered chatbots mainly in the context of customer service, and don’t know anyone personally who has used a chatbot for entertainment for more than an afternoon (Western Europe). If we count automated personal assistants such as Alexa or Siri, this might be true.
It is revealed that a government spent a substantial amount of money (>$1B) on automating propaganda creation.
As far as I know, there hasn't been any revelation of such a large-scale automated propaganda campaign (the Wikipedia pages on Propaganda in China and in the US mention no such operations).
Online ideological conflicts spill over into the real world more often.
As I haven’t been following the news closely, I don’t have many examples here, but the 2020–21 United States election protests come to mind.
The internet becomes more fractured, into discrete walled gardens (e.g. into a Chinese Internet, US Blue Internet, and US Red Internet).
This seems to be becoming more and more true, with sites such as Gab and the Fediverse gaining in popularity. However, it doesn't seem like the US Red Internet has the technological capability to automate propaganda or build a complete walled garden to the extent that the US Blue Internet or the Chinese internet do.
I found the text quite relevant both to thinking about possible alternative stories of how AI could go wrong, and to my personal life.
In the domain of AI safety, I became more convinced of the importance of aligning recommender systems to human values (also mentioned in the post), if they pose larger risks than commonly assumed, and of their usefulness as a ground for experimentation with alignment techniques. Whether aligning recommender systems is more important than aligning large language models seems like an important crux here: Are the short-term/long-term risks from recommender systems (i.e. reinforcement learners) larger than the risks from large language models? Which route appears more fruitful when trying to align more generally capable systems? As far as I can see, the alignment community is more interested in attempts to align large language models than recommender systems, probably due to recent progress in that area and because it's easier to test alignment in language models (?).
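To make the "recommender systems (i.e. reinforcement learners)" framing concrete, here is a minimal toy sketch of my own (not from the post, and not based on any real recommender codebase): an epsilon-greedy bandit whose only reward signal is engagement, so the most clickable item comes to dominate recommendations regardless of its epistemic quality. The item names and click probabilities are made up for illustration.

```python
import random

# Toy illustration (my own, not from the post): a recommender framed as a
# reinforcement learner. It picks items with an epsilon-greedy policy and
# updates its value estimates from observed "engagement" (clicks).

ITEMS = ["calm_explainer", "outrage_bait", "conspiracy_thread"]

# Hypothetical click probabilities standing in for real user behaviour.
TRUE_CLICK_RATE = {"calm_explainer": 0.05, "outrage_bait": 0.30, "conspiracy_thread": 0.20}

def run_recommender(steps=10_000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = {item: 0.0 for item in ITEMS}  # estimated click rate per item
    counts = {item: 0 for item in ITEMS}       # how often each item was shown
    for _ in range(steps):
        # Explore occasionally; otherwise exploit the current best estimate.
        if rng.random() < epsilon:
            item = rng.choice(ITEMS)
        else:
            item = max(ITEMS, key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < TRUE_CLICK_RATE[item] else 0.0  # engagement signal
        counts[item] += 1
        estimates[item] += (reward - estimates[item]) / counts[item]   # running mean
    return counts, estimates

if __name__ == "__main__":
    counts, estimates = run_recommender()
    print(counts)     # the most clickable item dominates recommendations
    print(estimates)  # learned click-rate estimates
```

The point of the toy is only that nothing in the learning loop refers to truth or user welfare; "alignment" here would mean changing the reward signal, which is part of why such systems look like a plausible testbed.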
The scenarios in which AI-powered memetic warfare significantly harms humanity can also be tied into research on the malicious use of AI, e.g. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation (Brundage et al., 2018). Policy tools from diplomacy with regard to biological, chemical, and nuclear warfare could be applied to memetic and psychological warfare.
The text explicitly positions the dangers of persuasion tools as a risk factor, but more speculatively, they might also pose an existential risk in themselves, in two different scenarios:
If humans are very easy to manipulate by AI systems that are narrowly superhuman in the domain of human psychology, a scenario similar to Evolving to Extinction (Eliezer Yudkowsky, 2007) might occur: nearly everybody goes effectively insane at approximately the same time, resulting in the collapse of civilization.
Humans might become insane enough that further progress along relevant axes is halted, but not insane enough that civilization collapses. We get stuck oscillating around some technological level, until another existential catastrophe like nuclear war and resulting nuclear winter finishes us off.
On the personal side, after being fooled by people using GPT-3 to generate tweets, seeing at least one instance of someone asking a commenter for the MD5 hash of a string to verify that the commenter was human (and the commenter failing that challenge), and observing the increasingly negative effects of internet usage on my attention span, I decided to separate my place for sleeping & eating from the place where I use the internet, with a ~10 minute commute between the two. I also decided to pay less attention to news stories/reddit/twitter, especially from sources affiliated with large governments, and downloaded my favourite websites.
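For readers unfamiliar with the MD5 challenge mentioned above: the test works because computing a cryptographic hash requires exact, bit-level computation that a text-predicting model of GPT-3's kind cannot reliably perform, while any human with a terminal can do it in seconds. A minimal sketch of how such a check might look (function and example string are my own, not from the actual exchange):

```python
import hashlib

def md5_challenge_passed(challenge_string: str, claimed_digest: str) -> bool:
    """Return True if the claimed hex digest matches the MD5 of the challenge string."""
    expected = hashlib.md5(challenge_string.encode("utf-8")).hexdigest()
    return claimed_digest.strip().lower() == expected

# A human can compute the answer locally, e.g. `echo -n "some challenge string" | md5sum`,
# and paste it back; a language model guessing 32 hex characters from text statistics
# will almost certainly fail the check.
print(md5_challenge_passed("some challenge string",
                           hashlib.md5(b"some challenge string").hexdigest()))  # True
```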
This post was relevant to my thoughts about alternative AI risk scenarios as well as drastic personal decisions, and I expect to give it a 1 or (more likely) a 4 in the final vote.
Thanks!
I really really didn’t mean to make or imply those predictions! That’s all much too soon, it’s only been a year! For a better sense of what my predictions were/are, see this vignette which was written half a year after Persuasion Tools and describes my “median” future.
Other than that, I agree with everything you say in this review. (And the predictions you imagine it making are reasonable; I just wouldn't have said they'd happen within two years! I'm thinking more like five years. I'm also unsure whether it'll ever rise to $1B in spending from a single government, or whether chatbots will ever become a big deal (the best kinds of persuasion tools now, and possibly always, are non-chatbots).)
Oh dear, I didn’t want to imply that those were your predictions! It was merely an exercise for myself to make the ideas in the post more concrete. I’ll revise my comment to make that clearer. Also apologies for not linking your median future vignette, that was an oversight on my part :-)
I'll also update my review by making the predictions less conjunctive; perhaps you won't endorse them anymore afterwards.
oh ok, thanks! Sorry if I misinterpreted you.