> “Brainwashing”, as popularly understood, does not exist or is of almost zero effectiveness. The belief stems from American panic over Communism post-Korean War combined with fear of new religions and sensationalized incidents; in practice, “cults” have retention rates in the single percentage point range and ceased to be an issue decades ago. Typically, a conversion sticks because an organization provides value to its members.
Some old SIAI work of mine. Researching this was very difficult because the relevant religious studies field, while apparently completely repudiating most public beliefs about the subject (e.g. the effectiveness of brainwashing, how damaging cults are, how large they are, whether that’s even a meaningful category which can be distinguished from mainstream religions rather than a hidden inference—a claim, I will note, which is much more plausible when you consider how abusive Scientology is to its members as compared to how abusive the Catholic Church has been, etc.), prefers to publish its research in book form, which makes it very hard to review any of it.
Some of the key citations were papers—but the cult panic was so long ago that most of them are not online and have never been digitized! I recently added some cites and realized I had not touched the draft in a year; so while this collection of notes is not really up to my preferred standards, I’m simply posting it for what it’s worth. (One lesson to take away from this is that controlling uploaded human brains will not be nearly as simple & easy as applying classic ‘brainwashing’ strategies—because those don’t actually work.)
My impression is that what trevor refers to as “brainwashing” and “mind control” is not actually “brainwashing as popularly understood”, i.e. a precision-targeted influence that quickly and unrecognizably warps the mind of an individual. Rather, what they have in mind is a more diffuse, incremental effect, primarily noticeable at population-wide scales, with the individual effects being smaller and spread across longer time periods — but those effects are nevertheless pivotal when it comes to the fate of movements. And this is in fact a thing we more or less know exists, inasmuch as propaganda and optimizing for engagement are real things.
Then there’s a separate claim building on that: a speculation that AI and big data may allow these effects to be supercharged into something that starts to look like brainwashing-as-popularly-understood. But I think the context provided by the first claim makes this more sensible.
It’s a general tendency I’ve noticed with trevor’s posts — they seem to have good content, but the framing/language employed has a hint of “mad conspiracy-theory rambling” that puts people off. @trevor, maybe watch out for that? E.g., I’d dial down on terms like “mind control” and replace them with more custom/respectable-looking ones. (Though I get that you may be deliberately using extreme terms to signal the perceived direness of the issue, and I can’t really say they’re inaccurate. But maybe look for a way to have your cake and eat it too?)
Yeah, I spent several years staying quiet about this because I assumed that bad things happened to people who didn’t. When I realized that that was a vague reason, and that everyone I talked to also seemed to have vague reasons for not thinking about this, I panicked and wrote a post as fast as possible by typing up handwritten notes and stitching the paragraphs together. That was a pretty terrible mistake. By the time I realized that it was longer than EY’s List of Lethalities, it was already too late, and I figured that everyone would ignore it if I didn’t focus really hard on the hook.
I absolutely agree that this is a good way to look at things. For example, the 3-minutes-per-person-per-day Moloch I referenced was a hypothetical bad future that a lot of people worried about, but as it turned out, the capability to use gradient descent to steer human behavior in measurable directions may have resulted in a good outcome, where the superior precision allows them to reduce quit rates while balancing that optimization against optimizing to prevent overuse. This featured heavily in the Facebook Files; whenever Facebook encountered some awful problem, the proposed solution was allegedly “we need better AI so we can optimize for things like that not happening”.
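To make the “balancing” concrete, here is a purely illustrative sketch of a multi-objective item score that trades a predicted-engagement term off against an overuse penalty. Every function name, feature, and weight below is an assumption invented for this example; it does not describe Facebook’s (or anyone else’s) actual system.

```python
# Hypothetical illustration only: a toy multi-objective score for a recommender,
# trading predicted engagement against an "overuse" penalty. All names, features,
# and weights are invented for this sketch.

def item_score(p_engage: float, expected_extra_minutes: float,
               user_minutes_today: float,
               healthy_daily_budget: float = 45.0,
               overuse_weight: float = 0.5) -> float:
    """Score one candidate item for one user.

    p_engage: predicted probability the user engages with the item (0..1).
    expected_extra_minutes: predicted additional time-on-site if the item is shown.
    user_minutes_today: minutes the user has already spent today.
    """
    engagement_term = p_engage
    # Penalize items expected to push the user past a (hypothetical) healthy daily budget.
    projected_minutes = user_minutes_today + expected_extra_minutes
    overuse_term = max(0.0, projected_minutes - healthy_daily_budget) / healthy_daily_budget
    return engagement_term - overuse_weight * overuse_term


if __name__ == "__main__":
    # The same highly engaging item scores lower once the user is already far past the budget.
    print(item_score(p_engage=0.9, expected_extra_minutes=10, user_minutes_today=20))
    print(item_score(p_engage=0.9, expected_extra_minutes=10, user_minutes_today=90))
```

The only point of the sketch is that the same precision that lets a system maximize engagement can equally be pointed at a “don’t overuse” objective; which term dominates is a choice of weights, not of capability.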
I don’t want to dismiss the potentially-high probability that things will just go fine; in fact, I actually covered that somewhat:
Facebook and the other 4 large tech companies (of which Twitter/X is not yet a member due to vastly weaker data security) might be testing out their own pro-democracy anti-influence technologies and paradigms, akin to Twitter/X’s open-sourcing of its algorithm, but behind closed doors due to the harsher infosec requirements that the big 5 tech companies face. Perhaps there are ideological splits among executives, e.g. with some executives trying to find a solution to the influence problem because they’re worried about their children and grandchildren ending up as floor rags in a world ruined by mind control technology, and other executives nihilistically marching towards increasingly effective influence technologies so that they and their children personally have better odds of ending up on top instead of someone else.
I’m just advocating for being prepared both for the good outcome and the bad one. I think that the 2020s will be a flashpoint for this, especially if it’s determined that an AI pause really is the minimum ask for humanity to survive (which is a reasonable proposition).
This very badly mismatches my firsthand experiences and observations of social movements, in a way that makes me suspect a mismatch between what we even mean by the word “brainwashing” or something. Perhaps it is because I did not need the update this intends to convey, which is that brainwashing does not allow forcibly replacing your opinions, but rather works by creating social pressure to agree habitually with a group?
I think that Habryka was referring to the tendency for people to worry about mind control and manipulation, not comparing human manipulation via gradient descent to human manipulation via brainwashing.
Personally, I think that advances in human manipulation are always something to be concerned about: the human brain is a kludge of spaghetti code, so surely someone would find something eventually (I argued here that social media dramatically facilitated the process of people finding things), and it naturally follows that big discoveries there could be transformative, even if there were false alarms in the past (with 20th-century technology). But the fact that the false alarms happened at all is also worth consideration.
Not a direct response (and I wouldn’t want it to be read that way), but there is some old Gwern research on this topic: https://www.lesswrong.com/posts/TiG8cLkBRW4QgsfrR/notes-on-brainwashing-and-cults