I’d like to complain that this project sounds epistemically absolutely awful. It’s offering money for arguments explicitly optimized to be convincing (rather than true), it offers money only for prizes making one particular side of the case (i.e. no money for arguments that AI risk is no big deal), and to top it off it’s explicitly asking for one-liners.
I understand that it is plausibly worth doing regardless, but man, it feels so wrong having this on LessWrong.
If the world is literally ending, and political persuasion seems on the critical path to preventing that, and rationality-based political persuasion has thus far failed while the empirical track record of persuasion for its own sake is far superior, and most of the people most familiar with articulating AI risk arguments are on LW/AF, is it not the rational thing to do to post this here?
I understand wanting to uphold community norms, but this strikes me as in a separate category from “posts on the details of AI risk”. I don’t see why this can’t also be permitted.
TBC, I’m not saying the contest shouldn’t be posted here. When something with downsides is nonetheless worthwhile, complaining about it but then going ahead with it is often the right response—we want there to be enough mild stigma against this sort of thing that people don’t do it lightly, but we still want people to do it if it’s really clearly worthwhile. Thus my kvetching.
(In this case, I’m not sure it is worthwhile, compared to some not-too-much-harder alternative. Specifically, it’s plausible to me that the framing of this contest could be changed to not have such terrible epistemics while still preserving the core value—i.e. make it about fast, memorable communication rather than persuasion. But I’m definitely not close to 100% sure that would capture most of the value.
Fortunately, the general policy of imposing a complaint-tax on really bad epistemics does not require me to accurately judge the overall value of the proposal.)
I’m all for improving the details. Which part of the framing seems focused on persuasion vs. “fast, effective communication”? How would you formalize “fast, effective communication” in a gradeable sense? (Persuasion seems gradeable via “we used this argument on X people; how seriously they took AI risk increased from A to B on a 5-point scale”.)
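The grading scheme in the parenthetical could be sketched concretely. This is a toy illustration only; the function name and the example ratings are invented here, not part of any actual contest rubric:

```python
# Toy sketch of the parenthetical grading scheme: each participant rates
# how seriously they take AI risk on a 1-5 scale before and after hearing
# an argument; the argument's score is the mean shift across participants.

def persuasion_score(pre_ratings, post_ratings):
    """Mean change on a 5-point scale across paired participants."""
    if len(pre_ratings) != len(post_ratings) or not pre_ratings:
        raise ValueError("need paired, non-empty rating lists")
    deltas = [post - pre for pre, post in zip(pre_ratings, post_ratings)]
    return sum(deltas) / len(deltas)

# Example: three participants move from ratings (2, 3, 4) to (4, 3, 5).
print(persuasion_score([2, 3, 4], [4, 3, 5]))  # → 1.0
```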
Maybe you could measure how effectively people pass e.g. a multiple choice version of an Intellectual Turing Test (on how well they can emulate the viewpoint of people concerned by AI safety) after hearing the proposed explanations.
[Edit: To be explicit, this would help further John’s goals (as I understand them) because it ideally tests whether the AI safety viewpoint is being communicated in such a way that people can understand and operate the underlying mental models. This is better than testing how persuasive the arguments are because it’s a) more in line with general principles of epistemic virtue and b) is more likely to persuade people iff the specific mental models underlying AI safety concern are correct.
One potential issue would be people bouncing off the arguments early and never getting around to building their own mental models, so maybe you could test for succinct/high-level arguments that successfully persuade target audiences to take a deeper dive into the specifics? That seems like a much less concerning persuasion target to optimize, since the worst case is people being wrongly persuaded to “waste” time thinking about the same stuff the LW community has been spending a ton of time thinking about for the last ~20 years]
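The multiple-choice Intellectual Turing Test proposed above could be scored very simply. The questions, answer key, and function name below are hypothetical, purely to make the mechanism concrete:

```python
# Hedged sketch of the multiple-choice ITT idea: after reading an
# explanation, participants answer questions of the form "which option
# would someone concerned about AI safety pick?", and the explanation is
# scored by how well its readers emulate that viewpoint.

def itt_score(answer_key, responses):
    """Fraction of questions where the participant matches the answer key."""
    if not answer_key:
        raise ValueError("empty answer key")
    correct = sum(1 for q, ans in answer_key.items() if responses.get(q) == ans)
    return correct / len(answer_key)

answer_key = {"q1": "b", "q2": "a", "q3": "d"}  # invented for illustration
reader = {"q1": "b", "q2": "c", "q3": "d"}
print(round(itt_score(answer_key, reader), 2))  # → 0.67
```

Comparing scores for readers of different explanations (or before vs. after reading) would then measure how well each explanation transmits the underlying mental models, rather than how hard it pushes.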
This comment thread did convince me to put it on personal blog (previously we’ve frontpaged writing-contests, and went ahead and unreflectively did it for this post).
I don’t understand the logic here? Do you see it as bad for the contest to get more attention and submissions?
No, it’s just the standard frontpage policy:
Frontpage posts must meet the criteria of being broadly relevant to LessWrong’s main interests; timeless, i.e. not about recent events; and are attempts to explain not persuade.
Technically the contest is asking for attempts to persuade not explain, rather than itself attempting to persuade not explain, but the principle obviously applies.
As with my own comment, I don’t think keeping the post off the frontpage is meant to be a judgement that the contest is net-negative in value; it may still be very net positive. It makes sense to have standard rules which create downsides for bad epistemics, and if some bad epistemics are worthwhile anyway, then people can pay the price of those downsides and move forward.
Raemon and I discussed whether it should be frontpage this morning. Prizes are kind of an edge case in my mind. They don’t properly fulfill the frontpage criteria but also it feels like they deserve visibility in a way that posts on niche topics don’t, so we’ve more than once made an exception for them.
I didn’t think too hard about the epistemics of the post when I made the decision to frontpage, but after John pointed out the suss epistemics, I’m inclined to agree, and concurred with Raemon moving it back to Personal.
----
I think the prize could be improved simply by rewarding the best arguments both for and against AI risk. This might actually be more convincing to skeptics – we paid people to argue against this position, and now you can see the best they came up with.
Ah, instrumental and epistemic rationality clash again
We’re out of time. This is what serious political activism involves.
I don’t see any lc comments, and I really wish I could see some here because I feel like they’d be good.
Let’s go! Let’s go! Crack open an old book and let the ideas flow! The deadline is, like, basically tomorrow.
Ok :)
Most movements (and yes, this is a movement) have multiple groups of people, perhaps with degrees in subjects like communication, working full time coming up with slogans, making judgments about which terms to use for best persuasiveness, and selling the cause to the public. It is unusual for it to be done out in the open, yes. But this is what movements do when they have already decided what they believe and now have policy goals they know they want to achieve. It’s only natural.
You didn’t refute his argument at all, you just said that other movements do the same thing. Isn’t the entire point of rationality that we’re meant to be truth-focused, and winning-focused, in ways that don’t manipulate others? Are we not meant to hold ourselves to the standard of “Aim to explain, not persuade”? Just because others in the reference class of “movements” do something doesn’t mean it’s immediately something we should replicate! Is that not the obvious, immediate response? Your comment proves too much; it could be used to argue for literally any popular behavior of movements, including canceling/exiling dissidents.
Do I think that this specific contest is non-trivially harmful at the margin? Probably not. I am, however, worried about the general attitude behind some of this type of recruitment, and the justifications used to defend it. I become really fucking worried when someone raises an entirely valid objection, and is met with “It’s only natural; most other movements do this”.
To the extent that rationality has a purpose, I would argue that it is to do what it takes to achieve our goals, if that includes creating “propaganda”, so be it. And the rules explicitly ask for submissions not to be deceiving, so if we use them to convince people it will be a pure epistemic gain.
Edit: If you are going to downvote this, at least argue why. I think that if this works like they expect, it truly is a net positive.
If you are going to downvote this, at least argue why.
Fair. Should’ve started with that.
To the extent that rationality has a purpose, I would argue that it is to do what it takes to achieve our goals,
I think there’s a difference between “rationality is systematized winning” and “rationality is doing whatever it takes to achieve our goals”. That difference requires more time to explain than I have right now.
if that includes creating “propaganda”, so be it.
I think that if this works like they expect, it truly is a net positive.
I think that the whole AI alignment thing requires extraordinary measures, and I’m not sure what specifically that would take; I’m not saying we shouldn’t do the contest. I doubt you and I have a substantial disagreement as to the severity of the problem or the effectiveness of the contest. My above comment was more “argument from ‘everyone does this’ doesn’t work”, not “this contest is bad and you are bad”.
Also, I wouldn’t call this contest propaganda. At the same time, if this contest were “convince EAs and LW users to have shorter timelines and higher chances of doom”, it would be reacted to differently. There is a difference: convincing someone to have a shorter timeline isn’t the same as trying to explain the whole AI alignment thing in the first place. But I worry that we could take that too far. I think that (most of) the responses John’s comment got were good, and they reassure me that the OPs are actually aware of, and worried about, John’s concerns. I see no reason why this particular contest will be harmful, but I can imagine a future where we pivot mainly to strategies like this, with harmful second-order effects (which would need their own post to explain).
Hey John, thank you for your feedback. As per the post, we’re not accepting misleading arguments. We’re looking for the subset of sound arguments that are also effective.
We’re happy to consider concrete suggestions which would help this competition reduce x-risk.
Thanks for being open to suggestions :) Here’s one: you could award half the prize pool to compelling arguments against AI safety. That addresses one of John’s points.
For example, stuff like “We need to focus on problems AI is already causing right now, like algorithmic fairness” would not win a prize, but “There’s some chance we’ll be able to think about these issues much better in the future once we have more capable models that can aid our thinking, making effort right now less valuable” might.
That idea seems reasonable at first glance, but upon reflection, I think it’s a really bad idea. It’s one thing to run a red-teaming competition, it’s another to spend money building rhetorically optimised tools for the other side. If we do that, then maybe there was no point running the competition in the first place as it might all cancel out.
This makes sense if you assume things are symmetric. Hopefully there’s enough interest in truth and valid reasoning that if the “AI is dangerous” conclusion is correct, it’ll have better arguments on its side.
Thanks for the idea, Jacob. Not speaking on behalf of the group here—but my first thought is that enforcing symmetry on discussion probably isn’t a condition for good epistemics, especially since the distribution of this community’s opinions is skewed. I think I’d be more worried if particular arguments that were misleading went unchallenged, but we’ll be vetting submissions as they come in, and I’d also encourage anyone who has concerns with a given submission to talk with the author and/or us. My second thought is that we’re planning a number of practical outreach projects that will make use of the arguments generated here—we’re not trying to host an intra-community debate about the legitimacy of AI risk—so we’d ideally have the prize structure reflect the outreach value for which arguments are responsible.
I’m potentially up to opening the contest to arguments for or against AI risk, and allowing the distribution of responses to reflect the distribution of the opinions of the community. Will discuss with the rest of the group.
It seems better to award some fraction of the prize pool to refutations of the posted arguments. IMO the point isn’t to be “fair to both sides”, it’s to produce truth.
Wait, the goal here, at least, isn’t to produce truth, it is to disseminate it. Counter-arguments are great, but this isn’t about debating the question, it’s about communicating a conclusion well.
This is PR, not internal epistemics, if I’m understanding the situation correctly.
Think of it as a “practicing a dark art of rationality” post, and I’d think it would seem less off-putting.
I think it would be less “off-putting” if we had common knowledge of it being such a post. I think the authors don’t think of it as that from reading Sidney’s comment.