This is good to know about! I simply never knew that was a chat button, and I guess Owen and our mod intermediary didn’t know about it since it didn’t come up? I bet we could have saved a lot of trouble if we’d first talked through this a few months ago.
The mod intermediary (Ben Pace) definitely knew, and Owen has also messaged us quite a few times on Intercom in the past, so not sure what exactly went wrong here. Looking at the chat logs, I think it was mostly a failure of process by the LW team, caused by FB messenger being such an informal chat-tool that’s usually used non-professionally. So I think Ben treated the requests more as a random informal chat among friends instead of a thing to pay a lot of attention to (and he was also particularly busy with two non-AIAF projects both times he was messaged by Owen).
For example, does my track record satisfy the criteria here? I mean this sincerely and I don’t mind whatever the answer is. I just want to try to make this clearer to people who are interested in the AF, since people can see my AF post, website, google scholar, this LW post, etc.
We did indeed think about whether you would meet the bar, and decided you don’t currently. Though the post with Owen was great. I think we would likely try to find some references and ask some other people in the field if they got value out of your work to make a final call here, but I think at the moment we wouldn’t give you membership without some additional highly-relevant papers or really great references.
I do think we might have a particular blind spot around transparency work, which is something I’ve been wanting to fix. I haven’t looked much into the state of the literature on transparency, and since it’s much easier to publish transparency work in existing CS journals than some of the more agent-foundation-like stuff, and there is more of an existing field to engage with, a lot of things have been happening somewhat out of my view (and, I think, out of the other mods’ views as well). I recently updated upwards on the importance of transparency work for AI Alignment, and I think it’s pretty plausible our policy on this dimension might change, and that after reorienting towards that, I will feel really silly for thinking you didn’t meet the bar.
I think we’re using the word “closed” in different ways. I used “closed” to mean that an arbitrary person on the internet had no means of becoming an AF member (on the grounds that “applications” were not being processed and a mod told us that no one was currently responsible for promotions).
But clearly the relevant comparison isn’t “has no means of becoming an AF member”. The bar should be “has no means of submitting a paper/post/comment”, and the bar for that is pretty low for the AIAF. I recognize that there is some awkwardness in commenting or posting first to LessWrong, and there are some issues with perceived reliability, but the bar is overall substantially lower than for most journals, and much easier to clear. As I mentioned above, AF membership itself doesn’t have a great equivalent in traditional journals or conferences, but is overall a bit more like “being on the conference committee” or “being a reviewer for the conference” or “being an editor for the journal”, all of which are much more opaque and harder to get access to, at least in my experience. Just submitting to the AIAF is definitely a bit confusing, but I don’t think overall harder than submitting to a journal.
I am not actually talking about visibility to the broader public, but rather the access of any individual to the discourse, which feels more important to me.
I think I am most confused about what you mean by “access to the discourse”. Again, if you comment on LessWrong, people will respond to you. Journals don’t usually have any kind of commenting system. AI in particular has a lot of conferences, which helps, though I think most people would still think that getting to one of those conferences and making connections in-person and participating in discussion there is overall a much higher bar than what we have with the LW/AIAF integration. Of course, many people in CS will have already cleared a lot of those bars since they are used to going to conferences, but as a non-academic the bar seems much higher to me.
But is it worse than never having a trial to begin with? Right now people are shut out by default from AF, except by going through LW (and see below).
We discussed this a lot when setting up the forum. My current honest guess is that no trial is indeed better than having a trial, mostly because people do really hate having things taken away from them, and almost everyone we talked to expected that people whose trial would end without membership would quite reliably feel very disappointed or get angry at us, even when we suggested framings that made it very clear that membership should not be expected after a trial.
This also matches my own experience. When I did internships where it was made clear to me that I was very unlikely to get a job offer by the end, I still felt a really strong sense of disappointment and frustration with the organization when I didn’t end up getting an offer. I had simply invested so much by that point, and getting angry felt like a tempting strategy to push the odds in my favor. (I never ended up taking much action based on that frustration, mostly realizing that it felt kind of adversarial and ungrounded, but I expect others would not behave the same way, and having people on the internet scream at you or try to attack you publicly is quite stressful and can also make others much less interested in being part of the forum.)
I think my crux here is how passive this feels. It’s mainly a waiting-and-hoping game from the LW side.
I do think the right thing we should tell people here is to post to LW, and if after a day it hasn’t been promoted, to just ping us on Intercom; then we can give you a straightforward answer on whether it will be promoted within 24 hours. I respond to something like 1-2 requests like this a month, and I could easily handle 10x more, so making that path easier feels like it could make the whole process feel more active.
So it seems like the greatest current advantage of the existing setup, in terms of mod workload, is that you’re crowdsourcing the AF moderation/content promotion via LW.
I don’t think this is exactly true. It’s more that LW is trying to provide an “OK” space for people to submit their AIAF content to that comes with immediate visibility and engagement, and it’s trying to set expectations about the likelihood of content getting promoted. But yeah, the crowdsourced moderation and content promotion is also nice.
[I] can explicitly request that posts/comments are promoted, and I can get feedback on why posts/comments are not promoted.
I think this expectation of feedback is really the primary reason we can’t offer this. Providing feedback for every single piece of content is a massive amount of moderation work. We are talking about 2-3 full-time people just on this job alone. The only option we would have is to have a system that just accepts and rejects comments and posts, but would do so without any justification for the vast majority of them. I expect this would reliably make people feel angry and even more powerless, and then we would have even more people feeling like they have no idea how to interact with us, or banging on our door to demand that we explain why we didn’t promote a piece of content.
There is a good reason why there basically exist no other platforms like the AI Alignment Forum on the internet. Content moderation and quality control is a really hard job that reliably has people get angry at you or demand things from you, and if we don’t put in clever systems to somehow reduce that workload or make it less painful, we will either end up drastically lowering our standards, or just burn out and close the forum off completely, the same way the vast majority of similar forums have in the past.
I mean, of course, there are probably clever solutions to this problem, but I do think they don’t look like “just have a submission queue that you then accept/reject and give feedback on”. I think that specific format, while common, also has reliable failure modes that make me very hesitant to use it.
But the biggest obstacle is probably just operational capacity.
I see. I know the team has its limits and has already put a lot of work into propping up AF/LW, which is generally appreciated!
I think I am most confused about what you mean by “access to the discourse”.
I mean the ability to freely participate in discussion, by means of directly posting and commenting on threads where the discussion is occurring. Sorry for not making this clearer. I should have more clearly distinguished this from the ability to read the discussion, and the ability to participate in the discussion after external approval.
But clearly the relevant comparison isn’t “has no means of becoming an AF member”. The bar should be “has no means of submitting a paper/post/comment”
Yeah, let me try to switch from making this about the definition of “closed” to just an issue about people’s preferences. Some people will be satisfied with the level of access to the AF afforded to them by the current system. Others will not be satisfied with that, and would prefer that they had direct/unrestricted access to the AF. So this is an interesting problem: should the AF set a bar for direct/unrestricted access to the AF, which everyone either meets or does not meet; or should the AF give members direct access, and then give non-members access to the AF via LW for specific posts/comments, according to crowdsourced approval or an AF member’s approval? (Of course there are other variants of these.) I don’t know what the best answer is, how many people’s preferences are satisfied by either plan, whose preferences matter most, etc.
My current honest guess is that no trial is indeed better than having a trial
I can see why, for the reasons you outline, it would be psychologically worse for everyone to have trials than not have trials. But I think this is a particularly interesting point, because I have a gut-level reaction to communities that aren’t willing to have trials. It triggers some suspicion in me that the community isn’t healthy enough to grow or isn’t interested in growing. Neither of these concerns is necessarily accurate, but I think this is why I predict a negative reaction from other researchers to this news (similar to my original point (2)). Typically people want their ideas to spread and want their ideology to be bolstered by additional voices, and any degree of exclusivity in an academic venue raises alarm bells in my mind about their true motives / the ideological underpinnings of their work. Anyway, these are just some negative reactions, and I think, for me, these are pretty well outweighed by all the other positive inside-view aspects of how I think of the AI safety community.
I do think the right thing we should tell people here is to post to LW, and if after a day it hasn’t been promoted, to just ping us on Intercom; then we can give you a straightforward answer on whether it will be promoted within 24 hours.
Great!
The only option we would have is to have a system that just accepts and rejects comments and posts, but would do so without any justification for the vast majority of them.
Sorry, isn’t this the current system? Or do you mean something automated? See my next comment, where I left automation out. Right now the promotion system is a black box from the user’s end, since they don’t know when AF members are looking at posts or how they decide to promote them, in the same way that an automatic system would be a black box to a user who didn’t know how it worked.
There is a good reason why there basically exist no other platforms like the AI Alignment Forum on the internet. Content moderation and quality control is a really hard job that reliably has people get angry at you or demand things from you, and if we don’t put in clever systems to somehow reduce that workload or make it less painful, we will either end up drastically lowering our standards, or just burn out and close the forum off completely, the same way the vast majority of similar forums have in the past.
Yeah, and this is a problem every social media company struggles with, so I don’t want to shame the mod team for struggling with it.
But I do want to emphasize that having no recourse system at all is not a great state to be in. Every forum mod team should provide recourse/feedback in reasonable proportion to its available resources. It seems like you’re predicting that users would feel angry/powerless based on a kind of system with limited recourse/feedback, and hence everyone would be worse off with this system. I think the dynamic runs the other way: without any recourse, the level of anger and powerlessness is high, and as more recourse is added, those feelings should decline, as long as user expectations are calibrated to what the recourse system can provide. If the forum moves from “no reason for non-promotion, upon request” to “one-sentence reason for non-promotion (and no more!), upon request”, people might complain about the standard, but they shouldn’t then feel angry about only getting one sentence, since their expectations are not being violated. And if users are angry about getting a one-sentence-reason policy, wouldn’t they be angrier about a no-reason policy? As long as expectations are set clearly, I can’t imagine a world where increasing the amount of recourse available is bad for the forum.
Maybe this would be a good point to recap, from the mod team’s perspective, some ways the AF+LW could more clearly set user expectations about how things work. I think it would also be valuable to specify what happens when things don’t go how users want them to go, and to assess whether any reasonable steps should be taken to increase the transparency of AF content moderation. No need to re-do the whole discussion in the post+comments (i.e. no need to justify any decisions); I just want to make sure this discussion turns into whatever action items the mods think are appropriate.
Sorry, isn’t this the current system? Or do you mean something automated? See my next comment, where I left automation out.
Sorry, I was suggesting a system in which, instead of first posting to LW via the LW interface, you just directly submit to the AIAF, without ever having to think about or go to LW. Then there is a submission queue, visible only to some moderators of the AIAF, that decides whether your content shows up on both LW and the AIAF, or on neither. This would make it more similar to classic moderated comment systems. I think a system like this would be clearer to users, since it’s relatively common on the internet, but it would also have the problems I described.
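For concreteness, a rough sketch of the visibility rule such a classic moderated queue implies (TypeScript; the types and function are hypothetical, not how LW/AIAF is actually implemented):

```typescript
// Hypothetical sketch of a classic moderated queue: content is
// invisible everywhere until a moderator accepts it.
type ReviewStatus = "pending" | "accepted" | "rejected";

interface Submission {
  authorId: string;
  body: string;
  status: ReviewStatus;
}

// Shows on both LW and the AIAF after acceptance; on neither before.
function visibleOn(sub: Submission): { lessWrong: boolean; alignmentForum: boolean } {
  const accepted = sub.status === "accepted";
  return { lessWrong: accepted, alignmentForum: accepted };
}
```

The key property (and the problem) is that the author gets zero visibility and zero feedback until the moderation decision lands.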
It seems like you’re predicting that users would feel angry/powerless based on a kind of system with limited recourse/feedback, and hence everyone would be worse off with this system.
One specific problem with having a submission + admin-review system is that the user has to invest a lot of resources into writing a post, and only after they have invested all of those resources do they learn whether they get any benefit from what they produced, and whether their content (which they might have spent dozens of hours writing) is accepted. This is, I think, one of the primary things that creates a lot of resentment; when I talk to people considering publishing in various journals, this is often one of the primary reasons they cite for not doing so.
When designing systems like this, I try to think of ways in which we can give the user feedback at the earliest level of investment, and make incremental benefit available as early as possible. The current system is designed so that even if your post doesn’t get promoted to the AIAF, you will likely still get some feedback and benefit from having it on LW. It also tries to set the expectation that getting a post onto the AIAF is more like a bonus, and that the immediate level of reward the average user should expect is what you get from posting on LW. In my experience from user interviews, this causes people to publish earlier and faster and get more feedback before getting really invested, in a way that I think results in less resentment overall if a post doesn’t get promoted.
I do think some people see very little reward in posting to LW instead of the AIAF, and for those this system is much worse than for the others. Those users still feel like they have to invest all of this upfront labor to get something onto the AIAF, and then have even less certainty than a normal submission system would provide on whether their content gets promoted, and then have even less recourse than a usual academic submission system would provide. I think it is pretty important for us to think more through the experience of those users, of which I think you are a good representative example.
Maybe this would be a good point to recap, from the mod team’s perspective, some ways the AF+LW could more clearly set user expectations about how things work. I think it would also be valuable to specify what happens when things don’t go how users want them to go, and to assess whether any reasonable steps should be taken to increase the transparency of AF content moderation. No need to re-do the whole discussion in the post+comments (i.e. no need to justify any decisions); I just want to make sure this discussion turns into whatever action items the mods think are appropriate.
I am still thinking through what the right changes to the system are, but here is a guess at a system that feels good to me:
We do a trial where non-AF members get a button for “submit a comment to the AIAF” and “submit a post to the AIAF” when they log into the alignmentforum.org website
When they click that button a tiny box shows up that explains the setup of posting to the AIAF to them. It says something like the following:
“When you submit a comment or post to the AI Alignment Forum two things happen:
The post/comment is immediately public and commentable on our sister platform LessWrong.com, where researchers can immediately provide feedback and thoughts on your submission. You can immediately link to your submission and invite others to comment on it.
The post/comment enters a review queue; within three business days an admin will decide whether to accept your submission to the AI Alignment Forum. If it is not accepted, the admin will provide you with a short one-sentence explanation of why they made that decision. We use the discussion and reactions on LessWrong to help judge whether the content is a good fit for the AI Alignment Forum.
The AI Alignment Forum admins monitor all activity on the site, and after you have participated in the discussion on the AI Alignment Forum and LessWrong this way, an admin might promote you to a full member of the AI Alignment Forum, who can post to the forum without the need for review, and who can promote other people’s comments and posts from LessWrong.com to the AI Alignment Forum. If you have questions about full membership, or any part of this process, please don’t hesitate to reach out to us (the AIAF admins) via the Intercom in the bottom right corner of the forum.”
When you finish submitting your comment or post, you are automatically redirected to the LW version of the corresponding page, where you can see your comment/post live; it will show (just to you) a small badge saying “awaiting AI Alignment Forum review”.
I think we probably have the capacity to actually handle this submission queue and provide feedback, though this assumption might just turn out to be wrong, in which case I would revert those changes.
Alternatively, we could provide an option for “either show this content on the AIAF, or show it nowhere”, but I think that would end up being kind of messy and complicated, and the setup above strikes me as better. It does, however, point people quite directly to LessWrong.com, strengthening the association between the two sites in a way that might be costly.
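To make this flow concrete, here is a minimal sketch of the states a submission would move through under the setup above (TypeScript; all names and shapes are illustrative, not our actual codebase):

```typescript
// Sketch of the proposed dual-publish flow: live on LW immediately,
// AIAF visibility gated on an admin review within three business days.
type AfReview =
  | { state: "awaitingReview"; submittedAt: Date }
  | { state: "accepted"; reviewedAt: Date }
  | { state: "rejected"; reviewedAt: Date; reason: string }; // the one-sentence explanation

interface AfSubmission {
  authorId: string;
  body: string;
  review: AfReview;
}

// On submit, the content is immediately live on LW and queued for AIAF review.
function submitToAiaf(authorId: string, body: string): AfSubmission {
  return {
    authorId,
    body,
    review: { state: "awaitingReview", submittedAt: new Date() },
  };
}

// LW visibility is unconditional; AIAF visibility requires acceptance.
function visibleOn(sub: AfSubmission): { lessWrong: boolean; alignmentForum: boolean } {
  return { lessWrong: true, alignmentForum: sub.review.state === "accepted" };
}

// The "awaiting AI Alignment Forum review" badge is shown only to the author.
function showsPendingBadge(sub: AfSubmission, viewerId: string): boolean {
  return sub.review.state === "awaitingReview" && viewerId === sub.authorId;
}
```

The main design point is that LW visibility never depends on the review outcome, which is what makes early feedback possible before the AIAF decision lands.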
The post/comment enters a review queue; within three business days an admin will decide whether to accept your submission to the AI Alignment Forum
If you believe in Alignment Forum participants making review judgements, how about using a review queue that works more like StackOverflow, rather than relying on admin labor?
I would expect a system that allows Alignment Forum participants to work through the queue to lead to faster reviews and to scale more easily.
My general philosophy for things like this is “do it in-house for a while so you understand what kinds of problems come up and make sure you have a good experience. After you really have it down maybe consider outsourcing it, or requesting volunteer labor for it.”
So I think eventually asking for a more crowdsourced solution seems reasonable, though I think that would come a few months later.
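If we did eventually hand the queue to the community, the mechanism might look roughly like this (a sketch only; the data shapes and threshold values are invented for illustration):

```typescript
// Sketch of a StackOverflow-style crowdsourced review queue: AF members
// vote on queued submissions, and a net-vote threshold makes the call.
interface QueuedItem {
  submissionId: string;
  votes: Map<string, 1 | -1>; // AF member id -> accept (+1) or reject (-1)
}

const ACCEPT_THRESHOLD = 3;  // net +3 promotes the submission to the AIAF
const REJECT_THRESHOLD = -3; // net -3 resolves it as not promoted

// Each member gets one vote; voting again replaces the earlier vote.
function castVote(item: QueuedItem, memberId: string, vote: 1 | -1): void {
  item.votes.set(memberId, vote);
}

function decide(item: QueuedItem): "accepted" | "rejected" | "pending" {
  let net = 0;
  for (const vote of item.votes.values()) net += vote;
  if (net >= ACCEPT_THRESHOLD) return "accepted";
  if (net <= REJECT_THRESHOLD) return "rejected";
  return "pending";
}
```

A real version would also need things like vote weighting and an audit trail, but the threshold rule is the core of it.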
Just chiming in here to say that I completely forgot about Intercom during this entire series of events, and I wish I had remembered/used it earlier.
(I disabled the button a long time ago, and it has been literal years since I used it last.)
Checks out. I remember getting some messages from you on there, but upon checking, that was indeed like 3 years ago.