Manfred comments on LW 2.0 Strategic Overview

Manfred Sep 15, 2017, 5:27 PM
14 points
I think votes have served several useful purposes.

Downvotes have been a very good way of enforcing the low-politics norm.

When there’s lots of something, you often want to sort by votes, or some ranking that mixes votes and age. Right now there aren’t many comments per thread, but if there were 100 top-level comments, I’d want votes. Similarly, as a new reader, it was very helpful to me to look for old posts that people had rated highly.
- IlyaShpitser Sep 15, 2017, 6:57 PM
  0 points
  Parent
  How are you going to prevent gaming the system and collusion?
  
  Goodhart’s law: you can game metrics, you can’t game targets. Quality speaks for itself.
  - Kaj_Sotala Sep 16, 2017, 3:34 PM
    16 points
    Parent
    Curious as to why you think that LW2.0 will have a problem with gaming karma when LW1.0 hasn’t had such a problem (unless you count Eugine, and even if you do, we’ve been promised the tools for dealing with Eugines now).
    - Habryka Sep 17, 2017, 12:05 AM
      9 points
      Parent
      I think this roughly summarizes my perspective on this. Karma seems to work well for a very large range of online forums and applications. We didn’t really have any problems with collusion on LW outside of Eugine, and that was a result of a lack of moderator tools, not a problem with the karma system itself.
      
      I agree that you should never fully delegate your decision making process to a simple algorithm, that’s what the value-loading problem is all about, but that’s what we have moderators and admins for. If we see suspicious behavior in the voting patterns we investigate and if we find someone is gaming the system we punish them. This is how practically all social rules and systems get enforced.
    - IlyaShpitser Sep 17, 2017, 3:46 PM
      0 points
      Parent
      LW1.0′s problem with karma is that karma isn’t measuring anything useful (certainly not quality). How can a distributed voting system decide on quality? Quality is not decided by majority vote.
      
      The biggest problem with karma systems is in people’s heads—people think karma does something other than what it does in reality.
      - Kaj_Sotala Sep 17, 2017, 4:02 PM
        7 points
        Parent
        
        LW1.0′s problem with karma is that karma isn’t measuring anything useful (certainly not quality).
        
        That’s the exact opposite of my experience. Higher-voted comments are consistently more insightful and interesting than low-voted ones.
        
        Quality is not decided by majority vote.
        
        Obviously not decided by it, but aggregating lots of individual estimates of quality sure can help discover the quality.
        Vladimir_Nesov Sep 17, 2017, 4:15 PM
        2 points
        Parent
        
        Higher-voted comments are consistently more insightful and interesting than low-voted ones.
        
        This was also my experience (on LW) several years ago, but not recently. On Reddit, I don’t see much difference between highly- and moderately-upvoted comments, only poorly-upvoted comments (in a popular thread) are consistently bad.
        IlyaShpitser Sep 17, 2017, 4:34 PM
        0 points
        Parent
        
        aggregating lots of individual estimates of quality sure can help discover the quality.
        
        I guess we fundamentally disagree. Lots of people with no clue about something aren’t going to magically transform into a method for discerning clue regardless of aggregation method—garbage in garbage out. For example: aggregating learners in machine learning can work, but requires strong conditions.
        John_Maxwell Sep 18, 2017, 5:10 AM
        3 points
        Parent
        Do you disagree with Kaj that higher-voted comments are consistently more insightful and interesting than low-voted ones?
        
        It sounds like you are making a different point: that no voting system is a substitute for having a smart, well-informed userbase. While that is true, that is also not really the problem that a voting system is trying to solve.
        IlyaShpitser Sep 18, 2017, 1:55 PM
        0 points
        Parent
        Sure do. On stuff I know a little about, what gets upvoted is “LW folk wisdom” or perhaps “EY’s weird opinions” rather than anything particularly good. That isn’t surprising. Karma, being a numerical aggregate of the crowd, is just spitting back a view of the crowd on a topic. That is what karma does—nothing to do with quality.
        DragonGod Sep 19, 2017, 4:03 PM
        0 points
        Parent
        What if the view of the crowd is correlated with quality.
        IlyaShpitser Sep 19, 2017, 4:33 PM
        2 points
        Parent
        Every crowd thinks so.
        DragonGod Sep 19, 2017, 4:46 PM
        0 points
        Parent
        I think Lesswrong might be (or at the very least was once) such a place where this is actually true.
        Expand this thread
        IlyaShpitser Sep 19, 2017, 5:37 PM
        0 points
        Parent
        Every crowd thinks they are such a place where it’s actually true. Outside view: they are wrong.
        DragonGod Sep 19, 2017, 5:47 PM
        0 points
        Parent
        
        Every crowd thinks they are such a place where it’s actually true.
        
        Some of the extreme sceptics do not believe they are much closer to the truth than anyone else.
        
        Outside view: they are wrong.
        
        There does not exist a group such that consensus of the group is highly correlated with truth? That’s quite an extraordinary claim you’re making; do you have the appropriate evidence?
        gjm Sep 19, 2017, 10:37 PM
        0 points
        Parent
        I think Ilya is not claiming that no such group exists but that it is well nigh impossible to know that your group is one such. At least where the claim is being made very broadly, as it seems to be upthread. I don’t think it’s unreasonable for experimental physicists to think that their consensus on questions of experimental physics is strongly correlated with truth, for instance, and I bet Ilya doesn’t either.
        
        More specifically, I think the following claim is quite plausible: When a group of people coalesces around some set of controversial ideas (be they political, religious, technological, or whatever), the correlation between group consensus and truth in the area of those controversial ideas may be positive or negative or zero, and members of the group are typically ill-equipped to tell which of these cases they’re in.
        DragonGod Sep 20, 2017, 7:51 AM
        0 points
        Parent
        LW has the best epistemic hygiene of all the communities I’ve encountered and/or participated in.
        
        In so far as epistemic hygiene is positively correlated with truth, I expect LW consensus to be more positively correlated with truth than most (not all) other internet communities.
        Lumifer Sep 20, 2017, 12:24 AM
        0 points
        Parent
        
        members of the group are typically ill-equipped to tell which of these cases they’re in
        
        Doesn’t LW loudly claim to be special in this respect?
        
        And if it actually is not, doesn’t this represent a massive failure of the entire project?
        IlyaShpitser Sep 19, 2017, 9:58 PM
        −1 points
        Parent
        Talking about LW, specifically. Presumably, groups exist that truth-track, for example experts on their area of expertise. LW isn’t an expert group.
        
        The prior on LW is the same as on any other place on the internet, it’s just a place for folks to gab. If LW were extraordinary, truth-wise, they would be sitting on an enormous pile of utility.
        DragonGod Sep 20, 2017, 7:42 AM
        1 point
        Parent
        
        The prior on LW is the same as on any other place on the internet.
        
        I disagree. Epistemic hygiene is genuinely better on LW, and insofar as Epistemic hygiene is positively correlated with truth, I expect LW consensus to be more positively correlated with truth than most (not all) other internet communities.
        Lumifer Sep 20, 2017, 12:32 AM
        0 points
        Parent
        
        Presumably, groups exist that truth-track, for example experts on their area of expertise.
        
        A group of experts will not necessarily truth-track—there are a lot of counterexamples from gender studies to nutrition.
        
        I would probably say that a group which implements its ideas in practice and is exposed to the consequences is likely to truth-track. That’s not LW, but that’s not most of the academia either.
        DragonGod Sep 20, 2017, 7:45 AM
        0 points
        Parent
        I don’t think LW is perfect; I think LW has the best epistemic hygiene of all communities I’ve encountered and/or participated in.
        
        I think epistemic hygiene is positively correlated with truth.
        Kaj_Sotala Sep 18, 2017, 10:40 AM
        0 points
        Parent
        
        Lots of people with no clue about something aren’t going to magically transform into a method for discerning clue regardless of aggregation method—garbage in garbage out.
        
        I think that’s the core of the disagreement: I assume that if the forum is worth reading in the first place, then the average forum user’s opinion of a comment’s quality tends to correlate with my own. In which case something have lots of upvotes is evidence in favor of me also thinking that it will be a good comment.
        
        This assumption does break down if you assume that the other people have “no clue”, but if that’s your opinion of a forum’s users, then why are you reading that forum in the first place?
        IlyaShpitser Sep 18, 2017, 1:57 PM
        4 points
        Parent
        “Clue” is not a total ordering of people from best to worst, it varies from topic to topic.
        
        The other issue to consider is what you view the purpose of a forum is.
        
        Consider a subreddit like TheDonald. Presumably they may use karma to get consensus on what a good comment is, also. But TheDonald is an echo chamber. If your opinions are very correlated with opinions of others in a forum, then naturally you get a number that tells you what everyone agrees is good.
        
        That can be useful, sometimes. But this isn’t quality, it’s just community consensus, and that can be arbitrarily far off. “Less wrong,” as-is-written-on-the-tin is supposedly about something more objective than just coming to a community consensus. You need true signal for that, and karma, being a mirror a community holds to itself, cannot give it to you.
        
        edit: the form of your question is: “if you don’t like TheDonald, why are you reading TheDonald?” Is that what you want to be saying?
      - tristanm Sep 17, 2017, 6:04 PM
        1 point
        Parent
        Hopefully this question is not too much of a digression—but has anyone considered using something like Arxiv-Sanity but instead of for papers it could include content (blog posts, articles, etc.) produced by the wider rationality community? Because at least with that you are measuring similarity to things you have already read and liked, things other people have read and liked, or things people are linking to and commenting on, and you can search things pretty well based on content and authorship. Ranking things by (what people have stored in their library and are planning on taking time to study) might contain more information than karma.
      - DragonGod Sep 17, 2017, 8:14 PM
        0 points
        Parent
        Karma serves as an indicator of the reception that certain content got. High karma means several people liked it. Negative karma means it was very disliked, etc.
  - John_Maxwell Sep 16, 2017, 7:43 AM
    8 points
    Parent
    
    How are you going to prevent gaming the system and collusion?
    
    Keep tweaking the rules until you’ve got a system where the easiest way to get karma is to make quality contributions?
    
    There probably exist karma systems which are provably non-gameable in relevant ways. For example, if upvotes are a conserved quantity (i.e. by upvoting you, I give you 1 upvote and lose 1 of my own upvotes), then you can’t manufacture them from thin air using sockpuppets.
    
    However, it also seems like for a small community, you’re probably better off just moderating by hand. The point of a karma system is to automatically scale moderation up to a much larger number of people, at which point it makes more sense to hash out details. In other worse, maybe I should go try to get a job on reddit’s moderator tools team.
    - IlyaShpitser Sep 16, 2017, 11:34 AM
      2 points
      Parent
      
      Keep tweaking the rules until you’ve got a system where the easiest way to get karma is to make quality contributions?
      
      This will never ever work. Predicting this in advance.
      
      There probably exist karma systems which are provably non-gameable in relevant ways.
      
      You should tell Google and academia, they will be most interested in your ideas. Don’t you think people already thought very hard about this? This is such a typical LW attitude.
      - John_Maxwell Sep 17, 2017, 2:05 AM
        5 points
        Parent
        
        Don’t you think people already thought very hard about this?
        
        Can you show me 3 peer-reviewed papers which discuss discussion site karma systems that differ meaningfully from reddit’s, and 3 discussion sites that implement karma systems that differ from reddit’s in interesting ways? If not, it seems like a neglected topic to me.
        
        Maybe I’m just not very good at doing literature searches. I did a search on Google Scholar for “reddit karma” and found only one paper which focuses on reddit karma. It’s got brilliant insights such as
        
        The aforementioned conflict between idealistically and quantitatively motivated contributions has however led to a discrepancy between value assessments of content.
        
        ...
        
        This is such a typical LW attitude.
        
        I believe Robin Hanson when he says academics neglect topics if they are too weird-seeming. Do you disagree?
        
        It’s certainly plausible that there is academic research relevant to the design of karma systems, but I don’t see why the existence of such research is a compelling reason to not spend 5 minutes thinking about the question from first principles on my own. Relevant quote.
        
        Coincidentally, just a couple days ago I was having a conversation with a math professor here at UC Berkeley about the feasibility of doing research outside of academia. The professor’s opinion was that this is very difficult to do in math, because math is a very “vertical” field where you have to climb to the top before making a contribution, and as long as you are going to spend half a decade or more climbing to the top, you might as well do so within the structure of academia. However, the professor did not think this was true of computer science (see: stuff like Bitcoin which did not come out of academia).
        IlyaShpitser Sep 17, 2017, 3:33 PM
        5 points
        Parent
        
        Maybe I’m just not very good at doing literature searches. I did a search on Google Scholar for “reddit karma” and found only one paper which focuses on reddit karma.
        
        You can’t do lit searches with google. Here’s one paper with a bunch of references on attacks on reputation systems, and reputation systems more generally:
        
        https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36757.pdf
        
        You are right that lots of folks outside of academia do research on this, in particular game companies (due to toxic players in multiplayer games). This is far from a solved problem—Valve, Riot and Blizzard spend an enormous amount of effort on reputation systems.
        
        I don’t see why the existence of such research is a compelling reason to not spend 5 minutes thinking about the question from first principles on my own.
        
        I don’t think there is a way to write this in a way that doesn’t sound mean: because you are an amateur. Imo, the best way for amateurs to proceed is to (a) trust experts, (b) read expert stuff, and (c) mostly not talk. Changes are, your 5 minute thoughts on the matter are only adding noise to the discussion. In principle, taking expert consensus as the prior is a part of rationality. In practice, people ignore this part because it is not a practice that is fun to follow. It’s much more fun to talk than to read papers.
        
        LW’s love affair with amateurism is one of the things I hate most about its culture.
        
        My favorite episode in the history of science is how science “forgot” what the cure of scurvy was. In order for human civilization not to forget things, we need to be better about (a), (b), (c) above.
        John_Maxwell Sep 18, 2017, 6:05 AM
        9 points
        Parent
        I appreciate the literature pointer.
        
        taking expert consensus as the prior
        
        What expert consensus are you referring to? I see an unsolved engineering problem, not an expert consensus.
        
        My view of amateurism has been formed, in a large part, from reading experts on the topic:
        
        The clash of domains is a particularly fruitful source of ideas. If you know a lot about programming and you start learning about some other field, you’ll probably see problems that software could solve. In fact, you’re doubly likely to find good problems in another domain: (a) the inhabitants of that domain are not as likely as software people to have already solved their problems with software, and (b) since you come into the new domain totally ignorant, you don’t even know what the status quo is to take it for granted.
        
        Paul Graham
        
        Introspection, and an examination of history and of reports of those who have done great work, all seem to show typically the pattern of creativity is as follows. There is first the recognition of the problem in some dim sense. This is followed by a longer or shorter period of refinement of the problem. Do not be too hasty at this stage, as you are likely to put the problem in the conventional form and find only the conventional solution.
        
        Richard Hamming
        
        Synthesize new ideas constantly. Never read passively. Annotate, model, think, and synthesize while you read, even when you’re reading what you conceive to be introductory stuff.
        
        Edward Boyden
        
        This past summer I was working at a startup that does predictive maintenance for internet-connected devices. The CEO has a PhD from Oxford and did his postdoc at Stanford, so probably not an amateur. But working over the summer, I was able to provide a different perspective on the problems that the company had been thinking about for over a year, and a big part of the company’s proposed software stack ended up getting re-envisioned and written from scratch, largely due to my input. So I don’t think it’s ridiculous for me to wonder whether I’d be able to make a similar contribution at Valve/Riot/Blizzard.
        
        The main reason I was able to contribute as much as I did was because I had the gumption to consider the possibility that the company’s existing plans weren’t very good. Basically by going in the exact opposite direction of your “amateurs should stay humble” advice.
        
        Here are some more things I believe:
        
        If you’re solving a problem that is similar to a problem that has already been solved, but is not an exact match, sometimes it takes as much effort to re-work an existing solution as to create a new solution from scratch.
        
        Noise is a matter of place. A comment that is brilliant by the standards of Yahoo Answers might justifiably be downvoted on Less Wrong. It doesn’t make sense to ask that people writing comments on LW try to reach the standard of published academic work.
        
        In computer science, industry is often “ahead” of academia in the sense that important algorithms get discovered in industry first, then academics discover them later and publish their results.
        
        Interested to learn more about your perspective.
        What links here?
        John_Maxwell's comment on Personal thoughts on careers in AI policy and strategy by carrickflynn (EA Forum; Sep 28, 2017, 10:04 AM; 27 points)
        IlyaShpitser Sep 18, 2017, 1:49 PM
        4 points
        Parent
        (a) They also laughed at Bozo the Clown. (I think this is Carl Sagan’s quote).
        
        (b) Outside view: how often do outsiders solve a problem in a novel way, vs just adding noise and cluelessness to the discussion? Base rates! Again, nothing that I am saying is controversial, having good priors is a part of “rationality folklore” already. Going with expert consensus as a prior is a part of “rationality folklore” already. It’s just that people selectively follow rationality practices only when they are fun to follow.
        
        (c) “In computer science, industry is often “ahead” of academia in the sense that important algorithms get discovered in industry first”
        
        Yes, this sometimes happens. But again, base rates. Google/Facebook is full of academia-trained PhDs and ex-professors, so the line here is unclear. It’s not amateurs coming up with these algorithms. John Tukey came up with the Fast Fourier Transform while at Bell Labs, but he was John Tukey, and had a math PhD from Princeton.
        DragonGod Sep 17, 2017, 8:22 PM
        4 points
        Parent
        (Upvoted).
        
        Changes are, your 5 minute thoughts on the matter are only adding noise to the discussion.
        
        This is where we differ; I think the potential for substantial contribution vastly outweighs any “noise” that may be be caused by amateurs taking stabs at he problem. I do not think all the low hanging fruit are gone (and if they were, how would we know so?), I think that amateurs are capable of substantial contributions in several fields. I think that optimism towards open problems is a more productive attitude.
        
        I support “LW’s love affair with amateurism”, and it’s a part of the culture I wouldn’t want to see disappear.
      - DragonGod Sep 16, 2017, 5:03 PM
        4 points
        Parent
        
        You should tell Google and academia, they will be most interested in your ideas. Don’t you think people already thought very hard about this? This is such a typical LW attitude.
        
        This reply contributes nothing to the discussion of the problem at hand, and is quite uncharitable, I hope such replies were discouraged, and if downvoting was enabled, I would have downvoted it.
        
        If thinking that they can solve the problem at hand (and making attempts at it) is a “typical LW attitude”, then it is an attitude I want to see more of and believe should be encouraged (thus, I’ll be upvoting /u/John_Maxwell_IV ’s post). A priori assuming that one cannot solve a problem (that hasn’t been proven/isnt known to be unsolvable) and thus refraining from even attempting the problem, isn’t an attitude that I want to see become the norm in Lesswrong. It’s not an attitude that I think is useful, productive, optimal or efficient.
        
        It is my opinion, that we want to encourage people to attempt problems of interest to the community (the potential benefits are vast (e.g the problem is solved, and/or significant improvements are made on the problem, and future endeavours would have a better starting point), and the potential demerits are of lesser impact (time (ours and whoever attempts it) is wasted on an unpromising solution).
        
        Coming back to the topic that was being discussed, I think methods of costly signalling are promising (for example, when you upvote a post you transfer X karma to the user, and you lose k*X (k < 1)).
        What links here?
        Vladimir_Nesov's comment on LW 2.0 Strategic Overview by habryka (Sep 16, 2017, 6:09 PM; 0 points)
        IlyaShpitser Sep 16, 2017, 11:16 PM
        2 points
        Parent
        I have been here for a few years, I think my model of “the LW mindset” is fairly good.
        
        I suppose the general thing I am trying to say is: “speak less, read more.” But at the end of the day, this sort of advice is hopelessly entangled with status considerations. So it’s hard to give to a stranger, and have it be received well. Only really works in the context of an existing apprenticeship relationship.
        DragonGod Sep 17, 2017, 12:58 AM
        0 points
        Parent
        Status games outside, the sentiment expressed in my reply are my real views on the matter.
        Vladimir_Nesov Sep 16, 2017, 5:17 PM
        0 points
        Parent
        
        A priori assuming that one cannot solve a problem
        
        (“A priori” suggests lack of knowledge to temper an initial impression, which doesn’t apply here.)
        
        There are problems one can’t by default solve, and a statement, standing on its own, that it’s feasible to solve them is known to be wrong. A “useful attitude” of believing something wrong is a popular stance, but is it good? How does its usefulness work, specifically, if it does, and can we get the benefits without the ugliness?
        DragonGod Sep 16, 2017, 5:54 PM
        0 points
        Parent
        
        that hasn’t been proven/isnt known to be unsolvable)
        
        An optimistic attitude towards problems that are potentially solvable is instrumentally useful—and dare I argue—instrumentally rational. The drawbacks of encouraging an optimistic attitude towards open problems are far outweighed by the potential benefits.
        Vladimir_Nesov Sep 16, 2017, 6:09 PM
        0 points
        Parent
        (The quote markup in your comment designates a quote from your earlier comment, not my comment.)
        
        You are not engaging the distinction I’ve drawn. Saying “It’s useful” isn’t the final analysis, there are potential improvements that avoid the horror of intentionally holding and professing false beliefs (to the point of disapproving of other people pointing out their falsehood; this happened in your reply to Ilya).
        
        The problem of improving over the stance of an “optimistic attitude” might be solvable.
        DragonGod Sep 16, 2017, 8:32 PM
        0 points
        Parent
        
        (The quote markup in your comment designates a quote from your earlier comment, not my comment.)
        
        I know: I was quoting myself.
        
        Saying “It’s useful” isn’t the final analysis
        
        I guess for me it is.
        
        there are potential improvements that avoid the horror of intentionally holding and professing false beliefs (to the point of disapproving of other people pointing out their falsehood; this happened in your reply to Ilya)
        
        The beliefs aren’t known to be false. It is not clear to me, that someone believing they can solve a problem (that isn’t known/proven or even strongly suspected to be unsolvable) is a false belief.
        
        What do you propose to replace the optimism I suggest?
  - Manfred Sep 15, 2017, 7:58 PM
    2 points
    Parent
    Moderation is basically the only way, I think. You could try to use fancy pagerank-anchored-by-trusted-users ratings, or make votes costly to the user in some way, but I think moderation is the necessary fallback.
    
    Goodhart’s law is real, but people still try to use metrics. Quality may speak for itself, but it can be too costly to listen to the quality of every single thing anyone says.
    - IlyaShpitser Sep 15, 2017, 7:59 PM
      1 point
      Parent
      People use name recognition in practice, works pretty well.
      - Kaj_Sotala Sep 17, 2017, 10:53 AM
        7 points
        Parent
        I can use name recognition to scroll through a comment thread to find all the comments by the people that I consider in high regard, but this is much more effort than just having a karma system which automatically shows the top-voted comments first. (The karma system also doesn’t discriminate against new writers as badly as relying on name recognition does.)
      - tristanm Sep 16, 2017, 9:55 PM
        3 points
        Parent
        Going to reply to this because I don’t think it should be overlooked. It’s a valid point—people tend to want to filter out information that’s not from the sources they trust. I think these kind of incentive pressures are what led to the “LessWrong Diaspora” being concentrated around specific blogs belonging to people with very positive reputation such as Scott Alexander. And when people want to look at different sources of information they will follow the advice of said people usually. This is how I operate when I’m doing my own reading / research—I start somewhere I consider to be the “safest” and move out from there according to the references given at that spot and perhaps a few more steps outward.
        
        When we use a karma / voting system, we are basically trying to calculate P(this contains useful information | this post has a high number of votes) but no voting system ever offers as much evidence as a specific reference from someone we recognize as trustworthy. The only way to increase the evidence gained from a voting system is to add further complexity to the system by increasing the amount of information contained in a vote, either by weighing the votes or by identifying the person behind the vote. And then from there you can add more to a vote, like a specific comment or a more nuanced judgement. I think the end of that track is basically what we have now, blogs by a specific person linking to other blogs, or social media like Facebook where no user is anonymous and everyone has their information filtered in some way.
        
        Essentially I’m saying we should not ignore the role that optimization pressure has played in producing the systems we already have.
    - Vladimir_Nesov Sep 15, 2017, 10:17 PM
      0 points
      Parent
      
      Quality may speak for itself, but it can be too costly to listen to the quality of every single thing anyone says.
      
      Which is why there should be a way to vote on users, not content, the quantity of unevaluated content shouldn’t divide the signal. This would matter if the primary mission succeeds and there is actual conversation worth protecting.