Habryka’s Shortform Feed

habryka 27 Apr 2019 19:25 UTC

LW: 56 AF: 12

In an attempt to get myself to write more, here is my own shortform feed. Ideally I would write something daily, but we will see how it goes.



habryka 2 Jul 2024 7:59 UTC
404 points
67
I am confident, on the basis of private information I can’t share, that Anthropic has asked at least some employees to sign similar non-disparagement agreements that are covered by non-disclosure agreements as OpenAI did.
Or to put things into more plain terms:
I am confident that Anthropic has offered at least one employee significant financial incentive to promise to never say anything bad about Anthropic, or anything that might negatively affect its business, and to never tell anyone about their commitment to do so.
I am not aware of Anthropic doing anything like withholding vested equity the way OpenAI did, though I think the effect on discourse is similarly bad.
I of course think this is quite sad and a bad thing for a leading AI capability company to do, especially one that bills itself on being held accountable by its employees and that claims to prioritize safety in its plans.
- Sam McCandlish 4 Jul 2024 4:26 UTC
  171 points
  29
  Parent
  Hey all, Anthropic cofounder here. I wanted to clarify Anthropic’s position on non-disparagement agreements:
  1. We have never tied non-disparagement agreements to vested equity: this would be highly unusual. Employees or former employees never risked losing their vested equity for criticizing the company.
  2. We historically included standard non-disparagement terms by default in severance agreements, and in some non-US employment contracts. We’ve since recognized that this routine use of non-disparagement agreements, even in these narrow cases, conflicts with our mission. Since June 1st we’ve been going through our standard agreements and removing these terms.
  3. Anyone who has signed a non-disparagement agreement with Anthropic is free to state that fact (and we regret that some previous agreements were unclear on this point). If someone signed a non-disparagement agreement in the past and wants to raise concerns about safety at Anthropic, we welcome that feedback and will not enforce the non-disparagement agreement.
  In other words— we’re not here to play games with AI safety using legal contracts. Anthropic’s whole reason for existing is to increase the chance that AI goes well, and spur a race to the top on AI safety.
  Some other examples of things we’ve needed to adjust from the standard corporate boilerplate to ensure compatibility with our mission: (1) replacing standard shareholder governance with the Long Term Benefit Trust and (2) supplementing standard risk management with the Responsible Scaling Policy. And internally, we have an anonymous RSP non-compliance reporting line so that any employee can raise concerns about issues like this without any fear of retaliation.
  Please keep up the pressure on us and other AI developers: standard corporate best practices won’t cut it when the stakes are this high. Our goal is to set a new standard for governance in AI development. This includes fostering open dialogue, prioritizing long-term safety, making our safety practices transparent, and continuously refining our practices to align with our mission.
  - Zach Stein-Perlman 4 Jul 2024 19:30 UTC
    129 points
    56
    Parent
    Please keep up the pressure on us
    OK:
    You should publicly confirm that your old policy, “don’t meaningfully advance the frontier with a public launch,” has been replaced by your RSP, if that’s true, and otherwise clarify your policy.
    You take credit for the LTBT (e.g. here) but you haven’t published enough to show that it’s effective. You should publish the Trust Agreement, clarify these ambiguities, and make accountability-y commitments like “if major changes happen to the LTBT we’ll quickly tell the public.”
    (Reminder that a year ago you committed to establish a bug bounty program (for model issues) or similar but haven’t. But I don’t think bug bounties are super important.)
    [Edit: bug bounties are also mentioned in your RSP—in association with ASL-2—but not explicitly committed to.]
    (Good job in many areas.)
    - Bird Concept 4 Jul 2024 20:13 UTC
      38 points
      71
      Parent
      (Sidenote: it seems Sam was kind of explicitly asking to be pressured, so your comment seems legit :)
      But I also think that, had Sam not done so, I would still really appreciate him showing up and responding to Oli’s top-level post, and I think it should be fine for folks from companies to show up and engage with the topic at hand (NDAs), without also having to do a general AMA about all kinds of other aspects of their strategy and policies. If Zach’s questions do get very upvoted, though, it might suggest there’s demand for some kind of Anthropic AMA event.)
  - habryka 5 Jul 2024 16:16 UTC
    105 points
    25
    Parent
    Anyone who has signed a non-disparagement agreement with Anthropic is free to state that fact (and we regret that some previous agreements were unclear on this point) [emphasis added]
    This seems as far as I can tell a straightforward lie?
    I am very confident that the non-disparagement agreements you asked at least one employee to sign were not ambiguous, and very clearly said that the non-disparagement clauses could not be mentioned.
    To reiterate what I know to be true: Employees of Anthropic were asked to sign non-disparagement agreements with a commitment to never tell anyone about the presence of those non-disparagement agreements. There was no ambiguity in the agreements that I have seen.
    @Sam McCandlish: Please clarify what you meant to communicate by the above, which I interpreted as claiming that there was merely ambiguity in previous agreements about whether the non-disparagement agreements could be disclosed, which seems to me demonstrably false.
    - Neel Nanda 12 Jul 2024 7:22 UTC
      39 points
      2
      Parent
      I can confirm that my concealed non-disparagement was very explicit that I could not discuss the existence or terms of the agreement, I don’t see any way I could be misinterpreting this. (but I have now kindly been released from it!)
      
      EDIT: It wouldn’t massively surprise me if Sam just wasn’t aware of its existence though
    - Sam McCandlish 9 Jul 2024 1:43 UTC
      0 points
      0
      Parent
      We’re not claiming that Anthropic never offered a confidential non-disparagement agreement. What we are saying is: everyone is now free to talk about having signed a non-disparagement agreement with us, regardless of whether there was a non-disclosure previously preventing it. (We will of course continue to honor all of Anthropic’s non-disparagement and non-disclosure obligations, e.g. from mutual agreements.)
      If you’ve signed one of these agreements and have concerns about it, please email hr@anthropic.com.
      - habryka 9 Jul 2024 2:15 UTC
        65 points
        22
        Parent
        Hmm, I feel like you didn’t answer my question. Can you confirm that Anthropic has asked at least some employees to sign confidential non-disparagement agreements?
        I think your previous comment pretty strongly implied that you think you did not do so (i.e. saying any previous agreements were merely “unclear” I think pretty clearly implies that none of them did include a non-ambiguous confidential non-disparagement agreement). I want it to be confirmed and on the record that you did, so I am asking you to say so clearly.
    - lemonhope 9 Jul 2024 11:01 UTC
      −5 points
      −5
      Parent
      “Unclear on this point” means what you think it means and is not a L I E for a spokesperson to say in my book. You got the W here already
      - habryka 9 Jul 2024 17:01 UTC
        21 points
        1
        Parent
        I really think the above was meant to imply that the non disparagement agreements were merely unclear on whether they were covered by a non disclosure clause (and I would be happy to take bets on how a randomly selected reader would interpret it).
        
        My best guess is Sam was genuinely confused on this and that there are non disparagement agreements with Anthropic that clearly are not covered by such clauses.
  - Neel Nanda 4 Jul 2024 21:31 UTC
    95 points
    31
    Parent
    EDIT: Anthropic have kindly released me personally from my entire concealed non-disparagement, not just made a specific safety exception. Their position on other employees remains unclear, but I take this as a good sign
    
    If someone signed a non-disparagement agreement in the past and wants to raise concerns about safety at Anthropic, we welcome that feedback and will not enforce the non-disparagement agreement.
    
    Thanks for this update! To clarify, are you saying that you WILL enforce existing non disparagements for everything apart from safety, but you are specifically making an exception for safety?
    
    this routine use of non-disparagement agreements, even in these narrow cases, conflicts with our mission
    
    Given this part, I find this surprising. Surely if you think it’s bad to ask future employees to sign non disparagements you should also want to free past employees from them too?
  - aysja 6 Jul 2024 0:00 UTC
    50 points
    37
    Parent
    This comment appears to respond to habryka, but doesn’t actually address what I took to be his two main points—that Anthropic was using NDAs to cover non-disparagement agreements, and that they were applying significant financial incentive to pressure employees into signing them.
    We historically included standard non-disparagement agreements by default in severance agreements
    Were these agreements subject to NDA? And were all departing employees asked to sign them, or just some? If the latter, what determined who was asked to sign?
  - mesaoptimizer 4 Jul 2024 18:41 UTC
    37 points
    29
    Parent
    
    Anyone who has signed a non-disparagement agreement with Anthropic is free to state that fact (and we regret that some previous agreements were unclear on this point).
    
    I’m curious as to why it took you (and therefore Anthropic) so long to make it common knowledge (or even public knowledge) that Anthropic used non-disparagement contracts as a standard and was also planning to change its standard agreements.
    
    The right time to reveal this was when the OpenAI non-disparagement news broke, not after Habryka connects the dots and builds social momentum for scrutiny of Anthropic.
    - habryka 4 Jul 2024 19:13 UTC
      37 points
      29
      Parent
      that Anthropic used non-disparagement contracts as a standard and was also planning to change its standard agreements.
      I do want to be clear that a major issue is that Anthropic used non-disparagement agreements that were covered by non-disclosure agreements. I think that’s an additionally much more insidious thing to do, that contributed substantially to the harm caused by the OpenAI agreements, and I think is an important fact to include here (and also makes the two situations even more analogous).
  - Neel Nanda 5 Jul 2024 9:10 UTC
    21 points
    1
    Parent
    Note, since this is a new and unverified account, that Jack Clark (Anthropic co-founder) confirmed on Twitter that the parent comment is the official Anthropic position https://x.com/jackclarkSF/status/1808975582832832973
  - habryka 4 Jul 2024 17:55 UTC
    17 points
    7
    Parent
    Thank you for responding! (I have more comments and questions but figured I would shoot off one quick question which is easy to ask)
    We’ve since recognized that this routine use of non-disparagement agreements, even in these narrow cases, conflicts with our mission
    Can you clarify what you mean by “even in these narrow cases”? If I am understanding you correctly, you are saying that you were including a non-disparagement clause by default in all of your severance agreements, which sounds like the opposite of narrow (edit: though as Robert points out it depends on what fraction of employees get offered any kind of severance, which might be most, or might be very few).
    I agree that it would have technically been possible for you to also include such an agreement on start of employment, but that would have been very weird, and not even OpenAI did that.
    ~~I think using the sentence “even in these narrow cases” seems inappropriate given that (if I am understanding you correctly) all past employees were affected by these agreements~~. I think it would be good to clarify what fraction of past employees were actually offered these agreements.
    - RobertM 4 Jul 2024 19:18 UTC
      37 points
      9
      Parent
      Severance agreements typically aren’t offered to all departing employees, but usually only those that are fired or laid off. We know that not all past employees were affected by these agreements, because Ivan claims to not have been offered such an agreement, and he left^[1] in mid-2023, which was well before June 1st.
      ^[1] Presumably of his own volition, hence no offered severance agreement with non-disparagement clauses.
      - habryka 4 Jul 2024 19:25 UTC
        3 points
        0
        Parent
        Ah, fair, that would definitely make the statement substantially more accurate.
        @Sam McCandlish: Could you clarify whether severance agreements were also offered to voluntarily departing employees, and if so, under which conditions?
  - kave 4 Jul 2024 19:55 UTC
    15 points
    12
    Parent
    To expand on my “that’s a crux”: if the non-disparagement+NDA clauses are very standard, such that they were included in a first draft by an attorney without prompting and no employee ever pushed back, then I would think this was somewhat less bad.
    It would still be somewhat bad, because Anthropic should be proactive about not making those kinds of mistakes. I am confused about what level of perfection to demand from Anthropic, considering the stakes.
    And if non-disparagement is often used, but Anthropic leadership either specified its presence or its form, that would seem quite bad to me, because mistakes of commission here are more evidence of poor decisionmaking than mistakes of omission. If Anthropic leadership decided to keep the clause when a departing employee wanted to remove the clause, that would similarly seem quite bad to me.
    - nwinter 5 Jul 2024 19:43 UTC
      37 points
      7
      Parent
      I think that both these clauses are very standard in such agreements. Both severance letter templates I was given for my startup, one from a top-tier SV investor’s HR function and another from a top-tier SV law firm, had both clauses. When I asked Claude, it estimated 70-80% of startups would have a similar non-disparagement clause and 80-90% would have a similar confidentiality-of-this-agreement’s-terms clause. The three top Google hits for “severance agreement template” all included those clauses.
      These generally aren’t malicious. Terminations get messy and departing employees often have a warped or incomplete picture of why they were terminated–it’s not a good idea to tell them all those details, because that adds liability, and some of those details are themselves confidential about other employees. Companies view the limitation of liability from release of various wrongful termination claims as part of the value they’re “purchasing” by offering severance–not because those claims would succeed, but because it’s expensive to explain in court why they’re justified. But the expenses disgruntled ex-employees can cause are not just legal, they’re also reputational. You usually don’t know which ex-employee will get salty and start telling their side of the story publicly, where you can’t easily respond with your side without opening up liability. Non-disparagement helps cover that side of it. And if you want to disparage the company, in a standard severance letter that doesn’t claw back vested equity, hey, you’re free to just not sign it–it’s likely only a bonus few weeks/months’ salary that you didn’t yet earn on the line, not the value of all the equity you had already vested. We shouldn’t conflate the OpenAI situation with Anthropic’s given the huge difference in stakes.
      Confidentiality clauses are standard because they prevent other employees from learning the severance terms and potentially demanding similar treatment in potentially dissimilar situations, thus helping the company control costs and negotiations in future separations. They typically cover the entire agreement and are mostly about the financial severance terms. I imagine that departing employees who cared could’ve asked the company for a carve-out on the confidentiality for the non-disparagement clause as a very minor point of negotiation.
      It’s great that Anthropic is taking steps to make these docs more departing-employee-friendly. I wouldn’t read too much into the fact that the docs were like this in the first place (as this wasn’t on cultural radars until very recently) or that they weren’t immediately changed (legal stuff takes time and this was much smaller in scope than in the OpenAI case).
      Example clauses in default severance letter from my law firm:
      7. Non-Disparagement. You agree that you will not make any false, disparaging or derogatory statements to any media outlet, industry group, financial institution or current or former employees, consultants, clients or customers of the Company, regarding the Company, including with respect to the Company, its directors, officers, employees, agents or representatives or about the Company’s business affairs and financial condition.
      11. Confidentiality. To the extent permitted by law, you understand and agree that as a condition for payment to you of the severance benefits herein described, the terms and contents of this letter agreement, and the contents of the negotiations and discussions resulting in this letter agreement, shall be maintained as confidential by you and your agents and representatives and shall not be disclosed except to the extent required by federal or state law or as otherwise agreed to in writing by the Company.
  - elifland 4 Jul 2024 19:33 UTC
    11 points
    3
    Parent
    And internally, we have an anonymous RSP non-compliance reporting line so that any employee can raise concerns about issues like this without any fear of retaliation.
    
    Are you able to elaborate on how this works? Are there any other details about this available publicly? I couldn’t find more detail via a quick search.
    Some specific qs I’m curious about: (a) who handles the anonymous complaints, (b) what is the scope of behavior explicitly (and implicitly re: cultural norms) covered here, (c) handling situations where a report would deanonymize the reporter (or limit them to a small number of people)?
    - Zach Stein-Perlman 4 Jul 2024 19:38 UTC
      5 points
      2
      Parent
      Anthropic has not published details. See discussion here. (I weakly wish they would; it’s not among my high-priority asks for them.)
      - Zac Hatfield-Dodds 4 Jul 2024 20:16 UTC
        70 points
        −2
        Parent
        OK, let’s imagine I had a concern about RSP noncompliance, and felt that I needed to use this mechanism.
        
        (in reality I’d just post in whichever slack channel seemed most appropriate; this happens occasionally for “just wanted to check...” style concerns and I’m very confident we’d welcome graver reports too. Usually that’d be a public channel; for some compartmentalized stuff it might be a private channel and I’d DM the team lead if I didn’t have access. I think we have good norms and culture around explicitly raising safety concerns and taking them seriously.)
        
        As I understand it, I’d:
        
        1. Remember that we have such a mechanism and bet that there’s a shortcut link. Fail to remember the shortlink name (reports? violations?) and search the list of “rsp-” links; ah, it’s rsp-noncompliance. (just did this, and added a few aliases)
        2. That lands me on the policy PDF, which explains in two pages the intended scope of the policy, who’s covered, the procedure, etc. and contains a link to the third-party anonymous reporting platform. That link is publicly accessible, so I could e.g. make a report from a non-work device or even after leaving the company.
        3. I write a report on that platform describing my concerns^[1], optionally uploading documents etc. and get a random password so I can log in later to give updates, send and receive messages, etc.
        4. The report by default goes to our Responsible Scaling Officer, currently Sam McCandlish. If I’m concerned about the RSO or don’t trust them to handle it, I can instead escalate to the Board of Directors (current DRI Daniela Amodei).
        5. Investigation and resolution obviously depend on the details of the noncompliance concern.
        
        There are other (pretty standard) escalation pathways for concerns about things that aren’t RSP noncompliance. There’s not much we can do about the “only one person could have made this report” problem beyond the included strong commitments to non-retaliation, but if anyone has suggestions I’d love to hear them.
        
        ^[1] I clicked through just now to the point of cursor-in-textbox, but not submitting a nuisance report.
        
        William_S 5 Jul 2024 17:53 UTC
        11 points
        19
        Parent
        Good that it’s clear who it goes to, though if I was an Anthropic employee I’d want an option to escalate to a board member who isn’t Dario or Daniela, in case I had concerns related to the CEO.
        Zac Hatfield-Dodds 5 Jul 2024 20:04 UTC
        11 points
        2
        Parent
        Makes sense—if I felt I had to use an anonymous mechanism, I can see how contacting Daniela about Dario might be uncomfortable. (Although to be clear I actually think that’d be fine, and I’d also have to think that Sam McCandlish as responsible scaling officer wouldn’t handle it)
        
        If I was doing this today I guess I’d email another board member; and I’ll suggest that we add that as an escalation option.
        Raemon 5 Jul 2024 20:31 UTC
        16 points
        11
        Parent
        Are there currently board members who are meaningfully separated in terms of incentive-alignment with Daniela or Dario? (I don’t know that it’s possible for you to answer in a way that’d really resolve my concerns, given what sort of information is possible to share. But, “is there an actual way to criticize Dario and/or Daniela in a way that will realistically be given a fair hearing by someone who, if appropriate, could take some kind of action” is a crux of mine)
        William_S 5 Jul 2024 23:56 UTC
        5 points
        5
        Parent
        Absent evidence to the contrary, for any organization one should assume board members were basically selected by the CEO. So it’s hard to get assurance about true independence, but it seems good to at least talk to someone who isn’t a family member/close friend.
        Zach Stein-Perlman 6 Jul 2024 0:03 UTC
        7 points
        −2
        Parent
        (Jay Kreps was formally selected by the LTBT. I think Yasmin Razavi was selected by the Series C investors. It’s not clear how involved the leadership/Amodeis were in those selections. The three remaining members of the LTBT appear independent, at least on cursory inspection.)
        Zac Hatfield-Dodds 6 Jul 2024 0:32 UTC
        2 points
        −7
        Parent
        I think that personal incentives are an unhelpful way to try to think about or predict board behavior (for Anthropic and in general), but you can find the current members of our board listed here.
        
        Is there an actual way to criticize Dario and/or Daniela in a way that will realistically be given a fair hearing by someone who, if appropriate, could take some kind of action?
        
        For whom to criticize him/her/them about what? What kind of action are you imagining? For anything I can imagine actually coming up, I’d be personally comfortable raising it directly with either or both of them in person or in writing, and believe they’d give it a fair hearing as well as appropriate follow-up. There are also standard company mechanisms that many people might be more comfortable using (talk to your manager or someone responsible for that area; ask a maybe-anonymous question in various fora; etc). Ultimately executives are accountable to the board, which will be majority appointed by the long-term benefit trust from late this year.
  - Zach Stein-Perlman 4 Jul 2024 19:22 UTC
    2 points
    3
    Parent
    Re 3 (and 1): yay.
    If I was in charge of Anthropic I just wouldn’t use non-disparagement.
- Sam Marks 30 Jun 2024 19:12 UTC
  35 points
  11
  Parent
  Anthropic has asked employees
  [...]
  Anthropic has offered at least one employee
  As a point of clarification: is it correct that the first quoted statement above should be read as “at least one employee” in line with the second quoted statement? (When I first read it, I parsed it as “all employees” which was very confusing since I carefully read my contract both before signing and a few days ago (before posting this comment) and I’m pretty sure there wasn’t anything like this in there.)
  - Vladimir_Nesov 30 Jun 2024 21:22 UTC
    17 points
    17
    Parent
    
    (I’m a full-time employee at Anthropic.)
    I carefully read my contract both before signing and a few days ago [...] there wasn’t anything like this in there.
    
    Current employees of OpenAI also wouldn’t yet have signed or even known about the non-disparagement agreement that is part of “general release” paperwork on leaving the company. So this is only evidence about some ways this could work at Anthropic, not others.
  - habryka 30 Jun 2024 21:18 UTC
    6 points
    0
    Parent
    Yep, both should be read as “at least one employee”, sorry for the ambiguity in the language.
    - DanielFilan 30 Jun 2024 21:24 UTC
      13 points
      8
      Parent
      FWIW I recommend editing OP to clarify this.
      - Neel Nanda 30 Jun 2024 22:34 UTC
        2 points
        0
        Parent
        Agreed, I think it’s quite confusing as is
        habryka 30 Jun 2024 22:57 UTC
        4 points
        1
        Parent
        Added an “at least some”, which I hope clarifies.
- Zach Stein-Perlman 30 Jun 2024 22:15 UTC
  34 points
  30
  Parent
  I am disappointed. Using nondisparagement agreements seems bad to me, especially if they’re covered by non-disclosure agreements, especially if you don’t announce that you might use this.
  My ask-for-Anthropic now is to explain the contexts in which they have asked or might ask people to incur nondisparagement obligations, and if those are bad, release people and change policy accordingly. And even if nondisparagement obligations can be reasonable, I fail to imagine how non-disclosure obligations covering them could be reasonable, so I think Anthropic should at least do away with the no-disclosure-of-nondisparagement obligations.
- Bird Concept 1 Jul 2024 1:11 UTC
  33 points
  1
  Parent
  Does anyone from Anthropic want to explicitly deny that they are under an agreement like this?
  (I know the post talks about some and not necessarily all employees, but am still interested).
  - Ivan Vendrov 1 Jul 2024 5:33 UTC
    140 points
    12
    Parent
    I left Anthropic in June 2023 and am not under any such agreement.
    EDIT: nor was any such agreement or incentive offered to me.
    - Vladimir_Nesov 1 Jul 2024 16:02 UTC
      15 points
      3
      Parent
      
      I left [...] and am not under any such agreement.
      
      Neither is Daniel Kokotajlo. Context and wording strongly suggest that what you mean is that you weren’t ever offered paperwork with such an agreement and incentives to sign it, but there remains a slight ambiguity on this crucial detail.
      - Ivan Vendrov 1 Jul 2024 16:31 UTC
        27 points
        4
        Parent
        Correct, I was not offered such paperwork nor any incentives to sign it. Edited my post to include this.
  - Zac Hatfield-Dodds 1 Jul 2024 5:55 UTC
    76 points
    16
    Parent
    I am a current Anthropic employee, and I am not under any such agreement, nor has any such agreement ever been offered to me.
    
    If asked to sign a self-concealing NDA or non-disparagement agreement, I would refuse.
  - RobertM 1 Jul 2024 1:36 UTC
    7 points
    0
    Parent
    Did you see Sam’s comment?
  - aysja 1 Jul 2024 1:38 UTC
    6 points
    10
    Parent
    Agreed. I’d be especially interested to hear this from people who have left Anthropic.
- Neel Nanda 12 Jul 2024 7:24 UTC
  31 points
  0
  Parent
  This is true. I signed a concealed non-disparagement when I left Anthropic in mid 2022. I don’t have clear evidence this happened to anyone else (but that’s not strong evidence of absence). More details here
  
  EDIT: I should also clarify that I personally don’t think Anthropic acted that badly, and recommend reading about what actually happened before forming judgements. I do not think I am the person referred to in Habryka’s comment.
- William_S 1 Jul 2024 18:59 UTC
  31 points
  17
  Parent
  I agree that this kind of legal contract is bad, and Anthropic should do better. I think there are a number of aggravating factors which made the OpenAI situation extraordinarily bad, and I’m not sure how much these might obtain regarding Anthropic (at least one comment from another departing employee about not being offered this kind of contract suggests the practice is less widespread).
  
  -amount of money at stake
  -taking money, equity or other things the employee believed they already owned if the employee doesn’t sign the contract, vs. offering them something new (IANAL but in some cases, this could be a felony “grand theft wages” under California law if a threat to withhold wages for not signing a contract is actually carried out, what kinds of equity count as wages would be a complex legal question)
  -is this offered to everyone, or only under circumstances where there’s a reasonable justification?
  -is this only offered when someone is fired or also when someone resigns?
  -to what degree are the policies of offering contracts concealed from employees?
  -if someone asks to obtain legal advice and/or negotiate before signing, does the company allow this?
  -if this becomes public, does the company try to deflect/minimize/only address issues that are made public, or do they fix the whole situation?
  -is this close to “standard practice” (which doesn’t make it right, but makes it at least seem less deliberately malicious), or is it worse than standard practice?
  -are there carveouts that reduce the scope of the non-disparagement clause (explicitly allow some kinds of speech, overriding the non-disparagement)?
  -are there substantive concerns that the employee has at the time of signing the contract, that the agreement would prevent discussing?
  -are there other ways the company could retaliate against an employee/departing employee who challenges the legality of contract?
  
  I think with termination agreements on being fired there’s often 1. some amount of severance offered 2. a clause that says “the terms and monetary amounts of this agreement are confidential” or similar. I don’t know how often this also includes non-disparagement. I expect that most non-disparagement agreements don’t have a term or limits on what is covered.
  
  I think a steelman of this kind of contract is: Suppose you fire someone, believe you have good reasons to fire them, and you think that them loudly talking about how it was unfair that you fired them would unfairly harm your company’s reputation. Then it seems somewhat reasonable to offer someone money in exchange for “don’t complain about being fired”. The person who was fired can then decide whether talking about it is worth more than the money being offered.
  
  However, you could accomplish this with a much more limited contract, ideally one that lets you disclose “I signed a legal agreement in exchange for money to not complain about being fired”, and doesn’t cover cases where “years later, you decide the company is doing the wrong thing based on public information and want to talk about that publicly” or similar.
  
  I think it is not in the nature of most corporate lawyers to think about “is this agreement giving me too much power?” and most employees facing such an agreement just sign it without considering negotiating or challenging the terms.
  
  For any future employer, I will ask about their policies for termination contracts before I join (as this is when you have the most leverage, if they give you an offer they want to convince you to join).
- ChristianKl 30 Jun 2024 9:59 UTC
  17 points
  3
  Parent
  In the case of OpenAI most of the debate was about ex-employees. Are we talking about current employees or ex-employees here?
  - habryka 30 Jun 2024 17:27 UTC
    25 points
    0
    Parent
    I am including both in this reference class (i.e. when I say employee above, it refers to both present employees and employees who left at some point). I am intentionally being broad here to preserve more anonymity of my sources.
- Bird Concept 1 Jul 2024 1:13 UTC
  14 points
  3
  Parent
  Not sure how to interpret the “agree” votes on this comment. If someone is able to share that they agree with the core claim because of object-level evidence, I am interested. (Rather than agreeing with the claim that this state of affairs is “quite sad”.)
- Dagon 30 Jun 2024 21:12 UTC
  7 points
  −7
  Parent
  A LOT depends on the details of WHEN the employees make the agreement, and the specifics of duration and remedy, and the (much harder to know) apparent willingness to enforce on edge cases.
  “significant financial incentive to promise” is hugely different from “significant financial loss for choosing not to promise”. MANY companies have such things in their contracts, and they’re a condition of employment. And they’re pretty rarely enforced. That’s a pretty significant incentive, but it’s prior to investment, so it’s nowhere near as bad.
- Jacob Pfau 30 Jun 2024 18:45 UTC
  5 points
  1
  Parent
  A pre-existing market on this question https://manifold.markets/causal_agency/does-anthropic-routinely-require-ex?r=SmFjb2JQZmF1
- Zach Stein-Perlman 30 Jun 2024 21:50 UTC
  4 points
  0
  Parent
  What’s your median-guess for the number of times Anthropic has done this?
  - habryka 30 Jun 2024 22:53 UTC
    18 points
    5
    Parent
    (Not answering this question since I think it would leak too many bits on confidential stuff. In general I will be a bit hesitant to answer detailed questions on this, or I might take a long while to think about what to say before I answer, which I recognize is annoying, but I think is the right tradeoff in this situation)
- Zane 1 Jul 2024 6:55 UTC
  3 points
  −12
  Parent
  I’m kind of concerned about the ethics of someone signing a contract and then breaking it to anonymously report what’s going on (if that’s what your private source did). I think there’s value from people being able to trust each others’ promises about keeping secrets, and as much as I’m opposed to Anthropic’s activities, I’d nevertheless like to preserve a norm of not breaking promises.
  Can you confirm or deny whether your private information comes from someone who was under a contract not to give you that private information? (I completely understand if the answer is no.)
  - habryka 1 Jul 2024 7:12 UTC
    14 points
    0
    Parent
    (Not going to answer this question for confidentiality/glomarization reasons)
  - Ben Pace 10 Jul 2024 20:03 UTC
    3 points
    0
    Parent
    I think this is a reasonable question to ask. I will note that in this case, if your guess is right about what happened, the breaking of the agreement is something that it turned out the counterparty endorsed, or at least, after the counterparty became aware of the agreement, they immediately lifted it.
    I still think there’s something to maintaining all agreements regardless of context, but I do genuinely think it matters here if you (accurately) expect the entity you’ve made the secret agreement with would likely retract it if they found out about it.
    (Disclaimer that I have no private info about this specific situation.)
habryka 14 Jan 2025 6:37 UTC
189 points
4
It’s the last 6 hours of the fundraiser and we have met our $2M goal! This was roughly the “we will continue existing and not go bankrupt” threshold, which was the most important one to hit.
Thank you so much to everyone who made it happen. I really did not expect that we would end up being able to raise this much funding without large donations from major philanthropists, and I am extremely grateful to have so much support from such a large community.
Let’s make the last few hours in the fundraiser count, and then me and the Lightcone team will buckle down and make sure all of these donations were worth it.
- davekasten 14 Jan 2025 15:23 UTC
  5 points
  1
  Parent
  Ok, but you should leave the donation box up—link now seems to not work? I bet there would be at least several $K USD of donations from folks who didn’t remember to do it in time.
  - habryka 14 Jan 2025 15:51 UTC
    5 points
    0
    Parent
    Oops, you’re right, fixed. That was just an accident.
    - davekasten 14 Jan 2025 19:44 UTC
      3 points
      0
      Parent
      Note for posterity that there has been at least $15K of donations since this got turned back on—You Can Just Report Bugs
      - habryka 14 Jan 2025 21:09 UTC
        4 points
        0
        Parent
        Those were mostly already in-flight, so not counterfactual (and also the fundraising post still has the donation link at the top), but I do expect at least some effect!
        davekasten 15 Jan 2025 0:22 UTC
        4 points
        2
        Parent
        Oh, fair enough then, I trust your visibility into this. Nonetheless one Should Can Just Report Bugs
- ZY 16 Jan 2025 5:06 UTC
  1 point
  0
  Parent
  Out of curiosity: what was the time span for this raise that achieved this goal/when did it first start again? Was it 2 months ago?
  - habryka 16 Jan 2025 5:55 UTC
    3 points
    0
    Parent
    Yep, when the fundraising post went live, i.e. November 29th.
    - kave 16 Jan 2025 16:40 UTC
      4 points
      0
      Parent
      I believe it includes some older donations:
      Our Manifund application’s donations, including donations going back to mid-May, totalling about $50k
      A couple of older individual donations, in October/early Nov, totalling almost 200k
habryka 7 Dec 2024 22:45 UTC
168 points
39
Reputation is lazily evaluated
When evaluating the reputation of your organization, community, or project, many people flock to surveys in which you ask randomly selected people what they think of your thing, or what their attitudes towards your organization, community or project are.
If you do this, you will very reliably get back data that looks like people are indifferent to you and your projects, and your results will probably be dominated by extremely shallow things like “do the words in your name evoke positive or negative associations”.
People largely only form opinions of you or your projects when they have some reason to do that, like trying to figure out whether to buy your product, or join your social movement, or vote for you in an election. You basically never care about what people think about you while engaging in activities completely unrelated to you, you care about what people will do when they have to take any action that is related to your goals. But the former is exactly what you are measuring in attitude surveys.
As an example of this (used here for illustrative purposes, and what caused me to form strong opinions on this, but not intended as the central point of this post): Many leaders in the Effective Altruism community ran various surveys after the collapse of FTX trying to understand what the reputation of “Effective Altruism” is. The results were basically always the same: People mostly didn’t know what EA was, and had vaguely positive associations with the term when asked. The people who had recently become familiar with it (which weren’t that many) did lower their opinions of EA, but the vast majority of people did not (because they mostly didn’t know what it was).
As far as I can tell, these surveys left most EA leaders thinking that the reputational effects of FTX were limited. After all, most people never heard about EA in the context of FTX, and seemed to mostly have positive associations with the term, and the average like or dislike in surveys barely budged. In reflections at the time, conclusions looked like this:
1. The fact that most people don’t really care much about EA is both a blessing and a curse. But either way, it’s a fact of life; and even as we internally try to learn what lessons we can from FTX, we should keep in mind that people outside EA mostly can’t be bothered to pay attention.
2. An incident rate in the single digit percents means that most community builders will have at least one example of someone raising FTX-related concerns—but our guess is that negative brand-related reactions are more likely to come from things like EA’s perceived affiliation with tech or earning to give than FTX.
3. We have some uncertainty about how well these results generalize outside the sample populations. E.g. we have heard claims that people who work in policy were unusually spooked by FTX. That seems plausible to us, though Ben would guess that policy EAs similarly overestimate the extent to which people outside EA care about EA drama.
Or this:
Yes, my best understanding is still that people mostly don’t know what EA is, the small fraction that do mostly have a mildly positive opinion, and that neither of these points were affected much by FTX.^[1]
This, I think, was an extremely costly mistake to make. Since then, practically all metrics of the EA community’s health and growth have sharply declined, and the extremely large and negative reputational effects have become clear.
Most programmers are familiar with the idea of a “lazily evaluated variable”—a value that isn’t computed until the exact moment you try to use it. Instead of calculating the value upfront, the system maintains just enough information to be able to calculate it when needed. If you never end up using that value, you never pay the computational cost of calculating it. Similarly, most people don’t form meaningful opinions about organizations or projects until the moment they need to make a decision that involves that organization. Just as a lazy variable suddenly gets evaluated when you first try to read its value, people’s real opinions about projects don’t materialize until they’re in a position where that opinion matters—like when deciding whether to donate, join, or support the project’s initiatives.
Reputation is lazily evaluated. People conserve their mental energy, time, and social capital by not forming detailed opinions about things until those opinions become relevant to their decisions. When surveys try to force early evaluation of these “lazy” opinions, they get something more like a placeholder value than the actual opinion that would form in a real decision-making context.
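To make the programming analogy concrete, here is a minimal Python sketch of a lazily evaluated value. It is only an illustration of the metaphor: the `Reputation` class, its `evidence` numbers, and the `0.0` placeholder are hypothetical names and values chosen for this example, not anything measured. The costly step runs only when a decision forces it, and a survey asked before that point returns the placeholder.

```python
from functools import cached_property


class Reputation:
    """Toy model (hypothetical): an opinion computed only when it is needed."""

    def __init__(self, evidence):
        self.evidence = evidence  # what a person could find if they looked
        self.evaluations = 0      # how many times the costly step actually ran

    @cached_property
    def considered_opinion(self):
        # The expensive step: ask friends, search online, weigh the evidence.
        # cached_property runs this once and stores the result on the instance.
        self.evaluations += 1
        return sum(self.evidence) / len(self.evidence)

    def survey_answer(self):
        # A survey does not trigger the costly computation. If no opinion has
        # been formed yet, the respondent reports a shallow placeholder.
        if "considered_opinion" in self.__dict__:
            return self.considered_opinion
        return 0.0  # placeholder: apparent indifference


rep = Reputation(evidence=[-2.0, -1.0, -3.0])
before = rep.survey_answer()       # surveyed before any decision
decision = rep.considered_opinion  # a real decision forces evaluation
after = rep.survey_answer()        # surveyed after the opinion exists
print(before, decision, after, rep.evaluations)
```

In this sketch the survey taken before the decision reports indifference even though the underlying evidence is negative, which is the gap between the placeholder and the evaluated opinion described above.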
This computation is not purely cognitive. As people encounter a product, organization or community that they are considering doing something with, they will ask their friends whether they have any opinions, perform online searches, and generally seek out information to help them with whatever decision they are facing. This is part of the reason for why this metaphorical computation is costly and put off until it’s necessary.
So when you are trying to understand what people think of you, or how people’s opinions of you are changing, pay much more attention to the attitudes of people who have recently put in the effort to learn about you, or were facing some decision related to you, and so are more representative of where people tend to end up at when they are in a similar position. These will be much better indicators of your actual latent reputation than what happens when you ask people on a survey.
For the EA surveys, these indicators looked very bleak:
“Results demonstrated that FTX had decreased satisfaction by 0.5-1 points on a 10-point scale within the EA community”
“Among those aware of EA, attitudes remain positive and actually maybe increased post-FTX —though they were lower (d = −1.5, with large uncertainty) among those who were additionally aware of FTX.”
“Most respondents reported continuing to trust EA organizations, though over 30% said they had substantially lost trust in EA public figures or leadership.”
If various people in EA had paid attention to these, instead of to the approximately meaningless placeholder variables that you get when you ask people what they think of you without actually getting them to perform the costly computation associated with forming an opinion of you, I think they would have made substantially better predictions.
What links here?
- Monthly Roundup #25: December 2024 by Zvi (23 Dec 2024 14:20 UTC; 18 points)
- Buck 8 Dec 2024 17:14 UTC
  21 points
  9
  Parent
  I don’t like the fact that this essay is a mix of an insightful generic argument and a contentious specific empirical claim that I don’t think you support strongly; it feels like the rhetorical strength of the former lends credence to the latter in a way that isn’t very truth-tracking.
  I’m not claiming you did anything wrong here, I just don’t like something about this dynamic.
  - habryka 8 Dec 2024 19:03 UTC
    14 points
    0
    Parent
    I do think the EA example is quite good on an illustrative level. It really strikes me as a rare case where we have an enormous pile of public empirical evidence (which is linked in the post) and it also seems by now really quite clear from a common-sense perspective.
    I don’t think it makes sense to call this point “contentious”. I think it’s about as clear as these cases go. At least off the top of my head I can’t think of an example that would have been clearer (maybe if you had some social movement that more fully collapsed and where you could do a retrospective root cause analysis, but it’s extremely rare to have as clear of a natural experiment as the FTX one). I do think it’s political in our local social environment, and so is harder to talk about, so I agree on that dimension a different example would be better.
    I do think it would be good/nice to add an additional datapoint, but I also think this would risk being misleading. The point about reputation being lazily evaluated is mostly true from common-sense observations and logical reasoning, and the EA point is mostly trying to provide evidence for “yes, this is a real mistake that real people make”. Even if you dispute EA’s reputation having gotten worse, I think the quotes from people above are still invalid and would mislead people (and I had this model before we observed the empirical evidence, and am writing it up because people told me they found it helpful for thinking through the FTX stuff as it was happening).
    I think if I had a lot more time, the best thing to do would be to draw on some literature on polling errors or marketing, since the voting situation seems quite analogous. This might even get us some estimates of how strong the correlation between unevaluated and evaluated attitudes is, and how much they diverge for different levels of investment, if there exists any measurable one, and that would be cool.
    - Buck 8 Dec 2024 19:27 UTC
      7 points
      2
      Parent
      I am persuaded by neither the common sense nor the empirical evidence for the point about EA. To be clear (as I’ve said to you privately) I’m not at all trying to imply that I specifically disagree with you, I’m just saying that the evidence you’ve provided doesn’t persuade me of your claims.
      - habryka 8 Dec 2024 19:37 UTC
        4 points
        −1
        Parent
        Yeah, makes sense. I don’t think I am providing a full paper trail of evidence one can easily travel along, but I would take bets you would come to agree with it if you did spend the effort to look into it.
- Guive 8 Dec 2024 0:19 UTC
  14 points
  19
  Parent
  This is good. Please consider making it a top level post.
  - metachirality 8 Dec 2024 18:56 UTC
    1 point
    1
    Parent
    It ought to be a top-level post on the EA forum as well.
    - habryka 8 Dec 2024 19:22 UTC
      2 points
      2
      Parent
      (Someone is welcome to link post, but indeed I am somewhat hoping to avoid posting over there as much, as I find it reliably stressful in mostly unproductive ways)
- Seth Herd 9 Dec 2024 15:25 UTC
  12 points
  1
  Parent
  There’s another important effect here: a laggy time course of public opinion. I saw more popular press articles about EA than I ever have, linking SBF to them, but with a large lag after the events. So the early surveys showing a small effect happened before public conversation really bounced around the idea that SBF’s crimes were motivated by EA utilitarian logic. The first time many people would remember hearing about EA would be from those later articles and discussions.
  
  The effect probably amplified considerably over time as that hypothesis bounced through public discourse.
  
  The original point stands but this is making the effect look much larger in this case.
  - Hauke Hillebrandt 10 Dec 2024 11:05 UTC
    11 points
    0
    Parent
    This lag effect might amplify a lot more when big budget movies about SBF/FTX come out.
- Zach Stein-Perlman 8 Dec 2024 3:23 UTC
  9 points
  0
  Parent
  Edit 2: after checking, I now believe the data strongly suggest FTX had a large negative effect on EA community metrics. (I still agree with Buck: “I don’t like the fact that this essay is a mix of an insightful generic argument and a contentious specific empirical claim that I don’t think you support strongly; it feels like the rhetorical strength of the former lends credence to the latter in a way that isn’t very truth-tracking.” And I disagree with habryka’s claims that the effect of FTX is obvious.)
  practically all metrics of the EA community’s health and growth have sharply declined, and the extremely large and negative reputational effects have become clear.
  I want more evidence on your claim that FTX had a major effect on EA reputation. Or: why do you believe it?
  Edit: relevant thing habryka said that I didn’t quote above:
  For the EA surveys, these indicators looked very bleak:
  “Results demonstrated that FTX had decreased satisfaction by 0.5-1 points on a 10-point scale within the EA community”
  “Among those aware of EA, attitudes remain positive and actually maybe increased post-FTX —though they were lower (d = −1.5, with large uncertainty) among those who were additionally aware of FTX.”
  “Most respondents reported continuing to trust EA organizations, though over 30% said they had substantially lost trust in EA public figures or leadership.”
  - habryka 8 Dec 2024 8:49 UTC
    18 points
    7
    Parent
    Practically all growth metrics are down (and have indeed turned negative on most measures), a substantial fraction of core contributors are distancing themselves from the EA affiliation, surveys among EA community builders report EA-affiliation as a major recurring obstacle^[1], and many of the leaders who previously thought it wasn’t a big deal now concede that it was/is a huge deal.
    Also, informally, recruiting for things like EA Fund managers, or getting funding for EA Funds has become substantially harder. EA leadership positions appear to be filled by less competent people, and in most conversations I have with various people who have been around for a while, people seem to both express much less personal excitement or interest in identifying or championing anything EA-related, and report the same for most other people.
    Related to the concepts in my essay, when measured, the reputational differential also seems to reliably point towards people updating negatively towards EA as they learn more about EA (which shows up in the quotes you mentioned, and which more recently shows up in the latest Pulse survey, though I mostly consider that survey uninformative for roughly the reasons outlined in this post).
    ^
    As reported to me by someone I trust working in the space recently. I don’t have a link at hand.
    - angelinahli 10 Dec 2024 19:40 UTC
      42 points
      2
      Parent
      Hey! Sorry for the silence, I was feeling a bit stressed by this whole thread, and so I wanted to step away and think about this before responding. I’ve decided to revert the dashboard back to its original state & have republished the stale data. I did some quick/light data checks but prioritised getting this out fast. For transparency: I’ve also added stronger context warnings and I took down the form to access our raw data in sheet form but intend to add it back once we’ve fixed the data. It’s still on our stack to Actually Fix this at some point but we’re still figuring out the timing on that.
      On reflection, I think I probably made the wrong call here (although I still feel a bit sad / misunderstood but 🤷🏻‍♀️). It was a unilateral + lightly held call I made in the middle of my work day — like truly I spent 5 min deciding this & maybe another ~15 updating the thing / leaving a comment. I think if I had a better model for what people wanted from the data, I would have made a different call. I’ve updated on “huh, people really care about not deleting data from the internet!” — although I get that the reaction here might be especially strong because it’s about CEA (vs the general case). Sorry, I made a mistake.
      Future facing thoughts: I generally hold myself to a higher standard for accuracy when putting data on the internet, but I also do value not bottlenecking people in investigating questions that feel important to me (e.g. qs about EA growth rates), so to be clear I’m prioritizing the latter goal right now. I still in general stand by, “what even is the point of my job if I don’t stand by the data I communicate to others?” :) I want people to be able to trust that the work they see me put out in the world has been red-teamed & critiqued before publication.
      Although I’m sad this caused an unintended kerfuffle, it’s a positive update for me that “huh wow, people actually care a lot that this project is kept alive!”. This honestly wasn’t obvious to me — this is a low traffic website that I worked on a while ago, and don’t hear about much. Oli says somewhere that he’s seen it linked to “many other times” in the past year, but TBH no one has flagged that to me (I’ve been busy with other projects). I’m still glad that we made this thing in the first place and am glad people find the data interesting / valuable (for general CEA transparency reasons, as an input to these broader questions about EA, etc.). I’ll probably prioritize maintenance on this higher in the future.
      Now that the data is back up I’m going to go back to ignoring this thread!
      What links here?
      angelinahli's comment on Habryka’s Shortform Feed by habryka (10 Dec 2024 19:43 UTC; 6 points)
      - angelinahli 10 Dec 2024 23:13 UTC
        14 points
        3
        Parent
        [musing] Actually another mistake here which I wish I just said in the first comment: I didn’t have a strong enough TAP for, if someone says a negative thing about your org (or something that could be interpreted negatively), you should have a high bar for taking away data (meaning more broadly than numbers) that they were using to form that perception, even if you think the data is wrong for reasons they’re not tracking. You can like, try and clarify the misconception (ideally, given time & energy constraints etc.), and you can try harder to avoid putting wrong things out there, but don’t just take it away; it’s not on the reader to treat you charitably and it kind of doesn’t matter what your motives were.
        
        I think I mostly agree with something like that / I do think people should hold orgs to high standards here. I didn’t pay enough attention to this and regret it. Sorry! (I’m back to ignoring this thread lol but just felt like sharing a reflection 🤷🏻‍♀️)
      - habryka 10 Dec 2024 19:53 UTC
        4 points
        0
        Parent
        Thank you! I appreciate the quick oops here, and agree it was a mistake (but fixing it as quickly as you did I think basically made up for all the costs, and I greatly appreciate it).
        Just to clarify, I don’t want to make a strong statement that it’s worth updating the data and maintaining the dashboard. By my lights it would be good enough to just have a static snapshot of it forever. The thing that seemed so costly to me was breaking old links and getting rid of data that you did think was correct.
        Thanks again!
    - the gears to ascension 8 Dec 2024 18:40 UTC
      7 points
      0
      Parent
      I suspect fixing this would need to involve creating something new which doesn’t have the structural problems in EA which produced this, and would involve talking to people who are non-sensationalist EA detractors but who are involved with similarly motivated projects. I’d start here and skip past the ones that are arguing “EA good” to find the ones that are “EA bad, because [list of reasons ea principles are good, and implication that EA is bad because it fails at its stated principles]”
      
      I suspect, even without seeking that out, the spirit of EA that made it ever partly good has already and will further metastasize into genpop.
    - angelinahli 9 Dec 2024 1:20 UTC
      5 points
      −21
      Parent
      Hi! A quick note: I created the CEA Dashboard which is the 2nd link you reference. The data here hadn’t been updated since August 2024, and so was quite out of date at the time of your comment. I’ve now taken this dashboard down, since I think it’s overall more confusing than helpful for grokking the state of CEA’s work. We still intend to come back and update it within a few months.
      Just to be clear on why / what’s going on:
      I stopped updating the dashboard in August because I started getting busy with some other projects, and my manager & I decided to deprioritize this. (There are some manual steps needed to keep the data live).
      I’ve now seen several people refer to that dashboard as a reference for how CEA is doing in ways I think are pretty misleading.
      We (CEA) still intend to come back and fix this, and this is a good nudge to prioritize it.
      Thanks!
      - habryka 9 Dec 2024 1:27 UTC
        21 points
        0
        Parent
        Oh, huh, that seems very sad. Why would you do that? Please leave up the data that we have. I think it’s generally bad form to break links that people relied on. The data was accurate as far as I can tell until August 2024, and you linked to it yourself a bunch over the years, don’t just break all of those links.
        I am pretty up-to-date with other EA metrics and I don’t really see how this would be misleading. You had a disclaimer at the top that I think gave all the relevant context. Let people make their own inferences, or add more context, but please don’t just take things down.
        Unfortunately, archive.org doesn’t seem to have worked for that URL, so we can’t even rely on that to show the relevant data trends.
        Edit: I’ll be honest, after thinking about it for longer, the only reason I can think of why you would take down the data is because it makes CEA and EA look less on an upwards trajectory. But this seems so crazy. How can I trust data coming out of CEA if you have a policy of retracting data that doesn’t align with the story you want to tell about CEA and EA? The whole point of sharing raw data is to allow other people to come to their own conclusions. This really seems like such a dumb move from a trust perspective.
        Ben Pace 9 Dec 2024 2:51 UTC
        13 points
        −1
        Parent
        I also believe that the data making EA+CEA look bad is the causal reason why it was taken down. However, I want to add some slight nuance.
        I want to contrast a model whereby Angelina Li did this while explicitly trying to stop CEA from looking bad, versus a model whereby she senses that something bad might be happening, she might be held responsible (e.g. within her organization / community), and is executing a move that she’s learned is ‘responsible’ from the culture around her.
        I think many people have learned to believe the reasoning step “If people believe bad things about my team I think are mistaken with the information I’ve given them, then I am responsible for not misinforming people, so I should take the information away, because it is irresponsible to cause people to have false beliefs”. I think many well-intentioned people will say something like this, and that this is probably because of two reasons (borrowing from The Gervais Principle):
        This is a useful argument for powerful sociopaths to use when they are trying to suppress negative information about themselves.
        The clueless people below them in the hierarchy need to rationalize why they are following the orders of the sociopaths to prevent people from accessing information. The idea that they are ‘acting responsibly’ is much more palatable than the idea that they are trying to control people, so they willingly spread it and act in accordance with it.
        A broader model I have is that there are many such inference-steps floating around the culture that well-intentioned people can accept as received wisdom, and they got there because sociopaths needed a cover for their bad behavior and the clueless people wanted reasons to feel good about their behavior; and that each of these adversarially optimized inference-steps need to be fought and destroyed.
        sarahconstantin 9 Dec 2024 20:37 UTC
        29 points
        12
        Parent
        I agree, and I am a bit disturbed that it needs to be said.
        At normal, non-EA organizations—and not only particularly villainous ones, either! -- it is understood that you need to avoid sharing any information that reflects poorly on the organization, unless it’s required by law or contract or something. The purpose of public-facing communications is to burnish the org’s reputation. This is so obvious that they do not actually spell it out to employees.
        Of COURSE any organization that has recently taken down unflattering information is doing it to maintain its reputation.
        I’m sorry, but this is how “our people” get taken for a ride. Be more cynical, including about people you like.
        Kaj_Sotala 10 Dec 2024 12:02 UTC
        11 points
        9
        Parent
        I think many people have learned to believe the reasoning step “If people believe bad things about my team I think are mistaken with the information I’ve given them, then I am responsible for not misinforming people, so I should take the information away, because it is irresponsible to cause people to have false beliefs”. I think many well-intentioned people will say something like this, and that this is probably because of two reasons (borrowing from The Gervais Principle):
        (Comment not specific to the particulars of this issue but noted as a general policy:) I think that as a general rule, if you are hypothesizing reasons for why somebody might say a thing, you should always also include the hypothesis that “people say a thing because they actually believe in it”. This is especially so if you are hypothesizing bad reasons for why people might say it.
        It’s very annoying when someone hypothesizes various psychological reasons for your behavior and beliefs but never even considers as a possibility the idea that maybe you might have good reasons to believe in it. Compare e.g. “rationalists seem to believe that superintelligence is imminent; I think this is probably because that lets them avoid taking responsibility about their current problems if AI will make those irrelevant anyway, or possibly because they come from religious backgrounds and can’t get over their subconscious longing for a god-like figure”.
        Ben Pace 10 Dec 2024 17:44 UTC
        4 points
        0
        Parent
        I feel more responsibility to be the person holding/tracking the earnest hypothesis in a 1-1 context, or if I am the only one speaking; in larger group contexts I tend to mostly ask “Is there a hypothesis here that isn’t or likely won’t be tracked unless I speak up” and then I mostly focus on adding hypotheses to track (or adding evidence that nobody else is adding).
        habryka 10 Dec 2024 17:45 UTC
        2 points
        0
        Parent
        (Did Ben indicate he didn’t consider it? My guess is he considered it, but thinks it’s not that likely and doesn’t have amazingly interesting things to say on it.
        
        I think having a norm of explicitly saying “I considered whether you were saying the truth but I don’t believe it” seems like an OK norm, but not obviously a great one. In this case Ben also responded to a comment of mine which already said this, and so I really don’t see a reason for repeating it.)
        Kaj_Sotala 17 Dec 2024 20:02 UTC
        2 points
        0
        Parent
        (I read
        I think many well-intentioned people will say something like this, and that this is probably because of two reasons
        as implying that the list of reasons is considered to be exhaustive, such that any reasons besides those two have negligible probability.)
        Ben Pace 18 Dec 2024 7:47 UTC
        2 points
        0
        Parent
        I gave my strongest hypothesis for why it looks to me that many many people believe it’s responsible to take down information that makes your org look bad. I don’t think alternative stories have negligible probability, nor does what I wrote imply that, though it is logically consistent with that.
        There are many widespread anti-informative behaviors that people engage in for poor reasons, like saying that their spouse is the best spouse in the world, or telling customers that their business is the best business in the industry, or saying exclusively glowing things about people in reference letters, that are best explained by the incentives on the person to present themselves in the best light; at the same time, it is respectful to a person, while in dialogue with them, to keep track of the version of them who is trying their best to have true beliefs and honestly inform others around them, in order to help them become that person (and notice the delta between their current behavior and what they hopefully aspire to).
        Seeing orgs in the self-identified-EA space take down information that makes them look bad is (to me) not that dissimilar to the other things I listed.
        I think it’s good to discuss norms about how appropriate it is to bring up cynical hypotheses about someone during a discussion in which they’re present. In this case I think raising this hypothesis was worth it for the discussion, and I didn’t cut off any way for the person in question to continue to show themselves to be broadly acting in good faith, so I think it went fine. Li replied to Habryka, and left a thoughtful pair of comments retracting and apologizing, which reflected well on them in my eyes.
        Kaj_Sotala 18 Dec 2024 8:13 UTC
        2 points
        0
        Parent
        
        I don’t think alternative stories have negligible probability
        
        Okay! Good clarification.
        
        I think it’s good to discuss norms about how appropriate it is to bring up cynical hypotheses about someone during a discussion in which they’re present.
        
        To clarify, my comment wasn’t specific to the case where the person is present. There are obvious reasons why the consideration should get extra weight when the person is present, but there’s also a reason to give it extra weight if none of the people discussed are present—namely that they won’t be able to correct any incorrect claims if they’re not around.
        
        so I think it went fine
        
        Agree.
        
        (As I mentioned in the original comment, the point I made was not specific to the details of this case, but noted as a general policy. But yes, in this specific case it went fine.)
        angelinahli 10 Dec 2024 19:43 UTC
        6 points
        0
        Parent
        More thoughts here, but TL;DR I’ve decided to revert the dashboard back to its original state & have republished the stale data. (Just flagging for readers who wanted to dig into the metrics.)
        angelinahli 9 Dec 2024 1:58 UTC
        6 points
        1
        Parent
        Quick thoughts on this:
        “The data was accurate as far as I can tell until August 2024”
        I’ve heard a few reports over the last few weeks that made me unsure whether the pre-Aug data was actually correct. I haven’t had time to dig into this.
        In one case (e.g. with the EA.org data) we have a known problem with the historical data that I haven’t had time to fix, that probably means the reported downward trend in views is misleading. Again I haven’t had time to scope the magnitude of this etc.
        I’m going to check internally to see if we can just get this back up in a week or two (It was already high on our stack, so this just nudges up timelines a bit). I will update this thread once I have a plan to share.
        I’m probably going to drop responding to “was this a bad call” and prioritize “just get the dashboard back up soon”.
        angelinahli 9 Dec 2024 3:01 UTC
        3 points
        −3
        Parent
        Hey! I just saw your edited text and wanted to jot down a response:
        Edit: I’ll be honest, after thinking about it for longer, the only reason I can think of why you would take down the data is because it makes CEA and EA look less on an upwards trajectory. But this seems so crazy. How can I trust data coming out of CEA if you have a policy of retracting data that doesn’t align with the story you want to tell about CEA and EA? The whole point of sharing raw data is to allow other people to come to their own conclusions. This really seems like such a dumb move from a trust perspective.
        I’m sorry this feels bad to you. I care about being truth seeking and care about the empirical question of “what’s happening with EA growth?”. Part of my motivation in getting this dashboard published in the first place was to contribute to the epistemic commons on this question.
        I also disagree that CEA retracts data that doesn’t align with “the right story on growth”. E.g. here’s a post I wrote in mid 2023 where the bottom line conclusion was that growth in meta EA projects was down in 2023 v 2022. It also publishes data on several cases where CEA programs grew slower in 2023 or shrank. TBH I also think of this as CEA contributing to the epistemic commons here — it took us a long time to coordinate and then get permission from people to publish this. And I’m glad we did it!
        On the specific call here, I’m not really sure what else to tell you re: my motivations other than what I’ve already said. I’m going to commit to not responding further to protect my attention, but I thought I’d respond at least once :)
        habryka 9 Dec 2024 3:42 UTC
        6 points
        4
        Parent
        I would currently be quite surprised if you had taken the same action if I was instead making an inference that positively reflects on CEA or EA. I might of course be wrong, but you did do it right after I wrote something critical of EA and CEA, and did not do it the many other times it was linked in the past year. Sadly your institution has a long history of being pretty shady with data and public comms this way, and so my priors are not very positively inclined.
        I continue to think that it would make sense to at least leave the data up that CEA did feel comfortable linking in the last 1.5 years. By my norms invalidating links like this, especially if the underlying page happens to be unscrapeable by the Internet Archive, is really very bad form.
        I did really appreciate your mid 2023 post!
- yanni kyriacos 11 Dec 2024 0:06 UTC
  1 point
  0
  Parent
  I spent 8 years working in strategy departments for ad agencies. If you’re interested in the science behind brand tracking, I recommend you check out the Ehrenberg-Bass Institute’s work on Category Entry Points: https://marketingscience.info/research-services/identifying-and-prioritising-category-entry-points/
habryka 26 Sep 2025 23:45 UTC
166 points
0
In addition to Lighthaven, for which we have a mortgage, Lightcone owns an adjacent, fully unencumbered property worth around $1.2M. Lighthaven has basically been breaking even, but we still have a funding shortfall of about $1M for our annual interest payment for the last year during which Lighthaven was ramping up utilization. It would be really great if we could somehow take out our real estate equity to cover that one-time funding shortfall.
If you want to have some equity in Berkeley real estate, and/or Lightcone’s credit-worthiness, you might want to give Lightcone a loan secured against our $1.2M property. We would pay normal market interest rates on this (~6% at the moment), and if we ever default, you would get the property.
We have some very mediocre offers from banks for a mortgage like this (interest rates of around 11% and only cashing out like $600k on the property). Banks really don’t like lending to nonprofits, who tend to have kind of unstable income streams. I think there is a quite decent chance that it would make more economic sense for someone who has more reason to think that we won’t be a giant pain to collect from to do this instead (given that from the perspective of the bank we are hard to distinguish from other nonprofits, but we are easy to distinguish from the perspective of most readers of this).
To be clear, by my lights most lenders are probably better served making some AI-related investments, which I expect will have higher risk-adjusted returns, but this could be a good bet as part of a portfolio, or for someone who doesn’t want to make AI-related bets for ethical reasons.
If you’re interested, or know anyone who might be, feel free to DM me, or comment here, or send me an email at habryka@lesswrong.com.
habryka 23 Jan 2026 2:27 UTC
145 points
14
I’ve been thinking for a while about what happens in the U.S. if the sitting president does a bunch of crazy stuff that is kind of clearly unconstitutional, or interferes with the legitimate democratic process, and this becomes clear to other parts of government.
At a high level, when one of the three branches of government (the executive, the legislative and the judicial branch) in the U.S. starts going off the rails, the other two both have some tools to stop the crazy branch. For now I think it makes sense to focus on what tools the judicial branch has, whose highest authority is the Supreme Court.
Let’s say the supreme court wants to stop a sitting president from destroying democracy in America. First, they release a judgement saying that something the executive branch is doing is unconstitutional. Hopefully the U.S. president agrees and then just stops doing that. But what happens when the executive branch keeps doing it anyways?
All federal officers (which are all part of the executive branch and approximately all under the direct command of the president) swear an oath to “support and defend the Constitution of the United States against all enemies, foreign and domestic”. This generally means that if an officer gets an order to do something unconstitutional, they are supposed to refuse that order (and this seems at least somewhat culturally real and not just a formality).
Now, how does an officer know whether an order they receive is unconstitutional? Historically matters of interpretation of the U.S. constitution have largely been delegated to the supreme court. However, this is not an ironclad rule or something the constitution itself specifies! The constitution does not say who has ultimate authority over its interpretation. In practice most federal offices have deferred to what the Supreme Court says, but we haven’t really seen what happens when e.g. a sitting president insists on an interpretation of the constitution that disagrees, and the constitution itself provides no clear answer to what is supposed to happen.
So, with this background knowledge, I see roughly 4 big ways the supreme court can try to rein in an out of control executive branch that isn’t listening to a judgement they made:
1. Declare injunctions against specific federal officers
The first thing the Supreme Court is likely to try (likely before declaring an action by the U.S. president illegal/unconstitutional) is to issue mandatory injunctions against specific federal officers, requiring them to stop an action or to provide relief to someone harmed.
This usually puts the officers into a very tricky position. If the president insists on ignoring the injunction, and the officer goes along with the president, the officer faces the risk of the court telling other parties to assist in prosecution of the relevant injunction (like the court asking banks to freeze their bank accounts, who might choose to side with the supreme court over the president, or ordering other parts of the executive to potentially jail or imprison them). Many of those parties are civilian institutions and so might decide to cooperate even under threats from the executive to not do so (and they would have decent legal standing to do so).
Courts can enforce injunctions through criminal contempt (punishment for harming the functioning of the courts) or civil contempt (restitution to the other party that is being harmed). We are assuming an uncooperative president, and due to the president’s pardon power criminal contempt is unlikely to be much of a threat, since it can just be pardoned away. Civil contempt, however, which would include things like private assets being seized, cannot be pardoned away, and so provides a decent incentive for individual officers to at least refuse to execute any illegal orders, if not to go along with the court’s injunction against the president’s orders.
But if the officer refuses, the president is usually just able to fire whoever refuses to obey their orders, and in most circumstances can appoint a replacement (or just give orders directly to lower-level employees). This means in order for this to be effective, there needs to be relatively widespread buy-in for many federal officers to refuse at the same time, such that replacement on realistic timelines becomes infeasible.
2. Send in the Supreme Court Marshal, hope that no one stops them
Turns out, the supreme court has guns! And they are allowed to use them! These guns come in the form of the Marshal of the Supreme Court who is under the direct control of the judicial branch and directs a small (~200 person) police force. He is allowed to make arrests and generally enforce the court’s judgements. This would (as far as I know) include authority to jail the sitting president or other high-level federal officers.^[1]
Unfortunately, from the perspective of the supreme court, these are really not very many guns. This basically means that in order for any order to successfully get enforced, approximately all federal law enforcement (and military officers domestically deployed) would need to refuse the orders they would surely receive from the sitting president to prevent the marshal’s officers from jailing the president or any other high-level officials in the federal government.
This might happen! If enough federal officers do decide to defer to the Supreme Court, and to take their oaths to the constitution seriously, then it is not implausible to imagine that they would let the marshal do their job.
3. Hope that declaring the current president to be violating the constitution causes Congress to impeach
According to the constitution it’s congress’s job to determine whether the sitting president needs to be removed because they are violating the constitution. So one would hope that the supreme court taking a pretty clear stance here would increase the likelihood of congress moving to impeach the sitting president.
Of course, even if they want to do that, a crucial question becomes what tools the president has to prevent congress from impeaching them. I haven’t looked into it enough to have really any idea how this situation plays out.
4. Call for the states to do something about the executive branch
The other big players in the balance of power of the United States are the state governments. My current best understanding is that the states don’t really have any authority to interfere with what the federal government wants to do, but that hasn’t stopped the states in the past. A supreme court judgement might very well catalyze actions by e.g. the states to use state police forces or state-aligned parts of the national guard to prevent federal officers from taking actions judged unconstitutional by the supreme court.
If this kind of thing happens, I think a lot of it ends up coming down to what the U.S. military does. My current model is that due to the Insurrection Act the U.S. president basically can just deploy the military domestically whenever he wants, and this seems unlikely to be disputed, so anything that would approach substantial violent conflict would probably be met with opposition by the full power of the U.S. military, which is quite solidly under the direct command of the president (of course, possibly enough military personnel would refuse orders to make such action not decisive, but at least from a constitutional perspective no one but the president seems authorized to order the military to do anything proactively, e.g. there is no constitutional way for the military to end up supporting the states in conflict against the federal government).
So where does that leave things overall? When I researched this, I made a bunch of updates toward thinking that, from a constitutionalist perspective, the supreme court does not really have many tools to rein in an out of control executive branch, which on the margin seems pretty bad. I was hoping there were clearer guidelines about what to do if there is disagreement between the executive branch and the supreme court on the interpretation of the constitution. I was also hoping there were bigger barriers to the domestic deployment of the U.S. military by the sitting president.
The biggest thing that my curiosity goes towards when understanding the dynamics here is knowing what various high-level military officials would do when faced with the supreme court declaring actions of the executive branch unconstitutional. They are ultimately the people with the guns, and have sworn an oath to the constitution, and understanding how seriously they would take the supreme court making a clear judgement (and e.g. whether they would be open to protecting U.S. marshals while they enforce a supreme court judgement) seems like one of the most crucial questions.
1. ^
  In any realistic scenario, before the Supreme Court would send in the Supreme Court Marshal, they would first try to order the confusingly named U.S. Marshals, who are usually responsible for enforcing court orders and things like that. However, those are under the direct command of the executive and the DOJ, and are more likely than not to refuse orders by the courts without executive buy-in, and in this scenario we are assuming non-cooperation of the executive.
- Garrett Baker 23 Jan 2026 17:56 UTC
  54 points
  4
  Parent
  
  These guns come in the form of the U.S. Marshalls who are under the direct control of the judicial branch.
  
  You got the (admittedly extremely confusing) names wrong here. The US Marshals are under the executive branch and report to the Attorney General; however, the Marshal of the United States Supreme Court is a single person under the direct command of the supreme court who heads the Supreme Court of the United States Police Department, whose officers are actually the people with most of the guns here.
  
  This seems like it’s caused some confusion with some commenters here.
  - habryka 23 Jan 2026 20:54 UTC
    14 points
    0
    Parent
    Hmm, yeah, I think I did get confused here! For people who want to learn more about the details of the authority of the different Marshals, I liked this: https://www.congress.gov/crs-product/LSB11271
    Enforcement of Court Orders Against the Executive Branch
    Months into the second Trump Administration, a number of executive branch policies have been challenged in court, and several federal district courts have enjoined enforcement of some of the challenged policies. As one example, on January 31, 2025, a judge on the U.S. District Court for the District of Rhode Island issued a temporary restraining order (TRO) barring the Trump Administration from enforcing a federal funding freeze with respect to a number of states that had challenged the freeze. On February 10, 2025, after the states alleged that the government was not complying with the TRO, the court granted a motion for enforcement of the TRO requiring the government, among other things, to “immediately end any federal funding pause during the pendency of the TRO.”
    [...]
    
    When a federal court imposes contempt sanctions, the U.S. Marshals Service enforces the order, including by arresting persons ordered imprisoned for contempt. The U.S. Marshals Service is an executive branch agency within the Department of Justice. Some commentators have expressed concerns that, if the executive branch chose to defy a court order, it might also seek to prevent the U.S. Marshals from enforcing contempt sanctions. The U.S. Marshals are required by statute to “execute all lawful writs, process, and orders issued under the authority of the United States.” The 2018 review of contempt against the federal government notes that, historically, Presidents have complied with federal court orders and have not directed the U.S. Marshals not to enforce contempt orders. The President’s pardon power applies to criminal contempt but does not apply to civil contempt sanctions.
    In theory, the whole process from injunction to contempt to sanctions might proceed exclusively in a district court. In practice, however, it is likely that one or more appellate courts would also be involved. A court order fining or imprisoning a person held in civil contempt generally may not be appealed until the court enters a final judgment. However, a district court order granting injunctive relief is usually immediately appealable to the appropriate federal appellate court, and rulings of the appeals courts related to injunctive relief may immediately be challenged via a petition for a writ of certiorari to the Supreme Court (though the Court has discretion whether to consider such matters). A conviction for criminal contempt is immediately appealable.
    I might edit the post to account for my confusions.
  - AlphaAndOmega 23 Jan 2026 23:00 UTC
    13 points
    0
    Parent
    https://www.scuspd.gov/department/
    The Supreme Court of the United States Police have allocated staffing for 198 officers, who currently represent 24 States, and 8 Countries.
    That isn’t a lot of men (or women) with guns.
    - Garrett Baker 24 Jan 2026 5:05 UTC
      6 points
      2
      Parent
      Yeah I meant “most” where the others we’re comparing are the 9 justices and one Marshal at most.
- Alexander Gietelink Oldenziel 23 Jan 2026 14:37 UTC
  18 points
  4
  Parent
  One takeaway for me is that the American Presidency is extremely powerful, especially when you don’t care about passing legislation or popularity.
  The unlimited pardons and vetoes are something that has been used only sporadically in the past, limited mostly by convention. Just reading the constitution text-as-written, the presidency is wildly powerful, especially with a supreme court following a unitary executive interpretation and a lame-duck congress that does not care to insist on its war declaration prerogative.
  I’m amused that the lightcone may have been lost in the 1790s when the US constitutional framework was designed.
- DAL 23 Jan 2026 16:42 UTC
  17 points
  3
  Parent
  At the end of the day, the rule of law is a Tinker Bell situation (it only survives if we believe in it). Long-term constitutional stability under a presidential system of government is also quite exceptional. The standard argument is that the US is the only successful case of long-run constitutional stability under a presidential system (though, depending on how you define long-run, you might throw in Costa Rica today). We’re very lucky that we’ve believed for so long.
  I’d add a couple more factors into your analysis, though.
  One thing you leave out is mass public opinion, and all the various ways that can be effective—demonstrators in the streets, general strike, cessation of quasi-voluntary compliance in all the areas where the government requires it, and so on, perhaps insurgency or terrorism in extremis. Layer onto that the various additional actions available to economic elites. The real hope for the Supreme Court is that the public takes its side in some extreme crisis, and that a clear ruling on its part serves as the focal point to kick all of that off.
  It’s pretty unlikely that the US military would be willing to crack down in that scenario. But even if it were, it doesn’t have the capacity to operate a police state. Most of our military capabilities aren’t geared towards that (something like a B-2 bomber or an aircraft carrier just really isn’t so helpful) -- the infantry forces of the US military aren’t even numerous enough to take over for the existing cadre of state and local police (assuming they walk off in this scenario) much less to do some kind of large-scale repression on top of maintaining ordinary law and order.
  Another factor (in less extreme scenarios) is that the courts, in their ordinary and apolitical capacity, are extremely valuable to the government. A collision that ends up destroying the courts takes a lot of the economy with it because large chunks of the economy are underpinned by the existence of a rule of law system governing economic transactions. And the courts are also necessary to keep the trains running on criminal justice and so on. A surgical attack on the courts that disables them only on the political issues while keeping all of that running is very difficult to mount, especially in the face of their concerted resistance. Even autocrats find it useful to have a functional court system (and our own legal tradition emerged as a tool of the British monarchy).
  On a lot of lower stakes stuff, this is really what matters, especially when the actions the government wants to take flow directly through the courts. There are a lot of issues in terms of criminal procedure where the executive would have pretty wide public support for violating the constitution (e.g., in surveys, a substantial majority of Americans favor rolling back various rights constitutionally granted to criminal defendants). Similarly, a pretty sizable chunk of Americans on either side seem to actively favor imprisoning political opponents on trumped up charges and no one is going to take to the streets if it happens. But, because the criminal process runs straight through the courts, you can’t really get those things done without blowing up the system. And that’s a big step to take.
  Another consideration here is the power the courts have over lawyers. So long as the executive branch is still playing the game with reference to the rules (however fast and loose it’s being with those rules), the lawyers advancing its positions are subject to judicial discipline and, therefore, face personal consequences like disbarment. If the executive has decided to go all-out, that stops mattering. But in a lesser constitutional crisis, those people are still thinking about those interests and that exerts a lot of pressure in the rule of law direction. Likewise for the willingness of the courts to continue extending the government the presumption of regularity.
  Circling back on the military, a couple of points:
  1. A military willing and able to rule through force usually wants to do so on its own behalf. What does it need the unpopular civilian dictator for? So, it’s a pretty hard ask to make even if the military is not committed to any underlying values that preclude it unless there’s a really deep loyalty to the leader.
  2. Getting your military to fire on civilians is really hard, especially in a military culture like ours.
  3. The constitutional/rule of law/democratic norms in the US military are all pretty strong culturally. So is a norm against involvement in partisan or domestic issues (that don’t pertain directly to the military itself).
  4. A particularly crucial constituency within the military in such a hypothetical is the JAG Corps (the military’s own internal lawyers). JAGs are very integrated into decision-making and have managed over the last few decades to achieve very high status within the military. [A somewhat troubling aside is that the Trump administration purged the JAG leadership shortly after taking office.] In general, American lawyers, inclusive of JAGs, are especially committed to rule of law and things like following court orders.
  As a closing thought, the scenarios that worry me the most don’t involve outright defiance and clashes. The smart way of doing things is a little more subtle (and in the current moment also leverages the fact that the Supreme Court is willing to give the administration considerable benefit of the doubt). The Supreme Court’s own precedents have also handicapped it in that it has declared a variety of the legal tools you’d want in a crisis to be beyond its own powers and invented a lot of technicalities for the president to play to his own advantage.
  - habryka 23 Jan 2026 21:42 UTC
    3 points
    1
    Parent
    Appreciate the factors! Agree on most of them being quite important. One quick note:
    One thing you leave out is mass public opinion, and all the various ways that can be effective—demonstrators in the streets, general strike, cessation of quasi-voluntary compliance in all the areas where the government requires it, and so on, perhaps insurgency or terrorism in extremis. Layer onto that the various additional actions available to economic elites. The real hope for the Supreme Court is that the public takes its side in some extreme crisis, and that a clear ruling on its part serves as the focal point to kick all of that off.
    Yeah, my analysis here was focused on what the supreme court and judiciary can do, from a constitutionalist perspective. My sense is the constitution doesn’t really allow insurrection under almost any circumstance, but does also maybe kind of expect it’s an important thing to maintain the threat of (hence the right to bear arms). I would be interested in someone analyzing when the constitution would permit a private citizen to take up arms against a sitting government (if any such circumstance exists).
    - Garrett Baker 24 Jan 2026 6:10 UTC
      2 points
      0
      Parent
      
      I would be interested in someone analyzing when the constitution would permit a private citizen to take up arms against a sitting government (if any such circumstance exists).
      
      To my knowledge, the interpretation which comes closest is insurrectionist theory, which interprets the right to bear arms as including the right of citizens to use them to defend against an oppressive government. There are apparently more explicit statements of this right in the preambles to some first-state constitutions, as well as the Declaration of Independence.
      
      It should not be surprising that nobody has yet won on such a case in court though, and practically speaking you don’t have this right.^[1]
      
      My understanding has been that even if you are arrested unlawfully by a police officer, you can’t use proportional force (as you would if you were assaulted by a non-police-officer), since the perspective of the government is that it is the judiciary’s right to determine whether an arrest is or isn’t lawful, not the citizen’s.
      
      ↩︎
      Except implicitly the founders themselves, who of course supported the right to revolution. Or at least supported that right for themselves. But originalism has never been a popular (or coherent) constitutional philosophy.
  - Arjun Panickssery 26 Jan 2026 3:38 UTC
    2 points
    0
    Parent
    At the end of the day, the rule of law is a Tinker Bell situation (it only survives if we believe in it). Long-term constitutional stability under a presidential system of government is also quite exceptional. The standard argument is that the US is the only successful case of long-run constitutional stability under a presidential system (though, depending on how you define long-run, you might throw in Costa Rica today). We’re very lucky that we’ve believed for so long.
    Can you explain your thinking here more and how it connects to the idea of constitutional risk?
    The U.S. president holds a weaker office than the heads of government in most other countries. The Canadian and British PMs and the French presidents definitely seem stronger; the German Chancellor seems weaker, and maybe the Israeli and Italian and Japanese PMs? (These aren’t strong views). I most often hear from proponents of the parliamentary system that it is less gridlocked and more powerful/effective rather than less.
    - DAL 26 Jan 2026 23:06 UTC
      14 points
      9
      Parent
      The U.S. president holds a weaker office than the heads of government in most other countries. The Canadian and British PMs and the French presidents definitely seem stronger
      It matters exactly what you’re comparing here.
      An American president is typically less effectual than a British PM, but the office is stronger. That is, the PM receives basically no power qua PM whereas the American presidency directly comes with considerable constitutional power.
      If you were randomly dropped in by some process as the US president tomorrow, you’d immediately be a very powerful person and you’d hold those powers for a considerable length of time. If you were randomly dropped in as British PM, you’d be removed in a confidence vote in an instant.
      The PM in a parliamentary system can typically get a great deal more done than the US president but that’s a selection effect really—being the PM means you also commanded a Parliamentary majority in order to get there, so of course you face less gridlock. The legislative branch doesn’t typically want to stop you. But, if the legislature suddenly does want to stop you, you’re gone immediately.
      Can you explain your thinking here more and how it connects to the idea of constitutional risk?
      Suppose the executive wants to seize power. If the legislature supports that, then it’s going to be a relatively easy thing to do in either a presidential or a parliamentary system. Whatever constraint there is has to come from somewhere else.
      The distinction between the two systems really only matters if the legislature opposes the seizure. Under a parliamentary system, they have an easy remedy: trigger a no-confidence vote and get rid of the problematic leader. Under a presidential system? Removing the leader is hard, and if you get into some kind of fight otherwise, the president has all kinds of levers to pull. That turns nasty (and those kinds of moments of conflict also potentially create an opening for the military or someone else to seize power). It’s clearly better to be in a parliamentary system in that situation.
      I was also referencing above the classic essay “The Perils of Presidentialism” by Juan Linz, which lays out a much more sophisticated set of arguments.
      - Arjun Panickssery 27 Jan 2026 7:14 UTC
        2 points
        −4
        Parent
        The distinction between the two systems really only matters if the legislature opposes the seizure.
        With this, you focus too narrowly on this specific minority-rule “seizure of power” scenario rather than the relative power of the offices more generally.
        There are more differences than you mention. The PM is less hindered by the independent judiciary than the president. The PM in a Westminster system also exerts greater control over the individual legislators via his party than in the American system. The PM can serve for an unlimited time, and call elections at strategic moments, while Trump is limited to two terms. All these things increase the power of the PM and the risk of oppressive rule in Westminster-style parliamentary systems.
        DAL 27 Jan 2026 14:52 UTC
        1 point
        0
        Parent
        The PM is less hindered by the independent judiciary than the president. The PM in a Westminster system also exerts greater control over the individual legislators via his party than in the American system. The PM can serve for an unlimited time, and call elections at strategic moments, while Trump is limited to two terms. All these things increase the power of the PM and the risk of oppressive rule in Westminster-style parliamentary systems.
        None of those are inherently features of a parliamentary (or even Westminster-style) government. Those are all separate institutional choices you can make in either setup.
        With this, you focus too narrowly on this specific minority-rule “seizure of power” scenario rather than the relative power of the offices more generally.
        Sorry, I thought we were discussing the possibility of collapse into authoritarianism, in which case some kind of seizure of power is the relevant question? The claim I was making above is relevant to this, and not to other bad things that might happen.
        As to the “power of the offices,” I do want to re-emphasize what I said earlier which is that you have to make a separation between the powers of the office (i.e., those vested in the office itself) and the typical powers of the officeholder (i.e., additional power that is typically held by the person holding the office but not as a consequence of holding the office). Much of the power of the typical prime minister flows from the fact that they are also the leader of a legislative majority. The matched comparison would be some kind of situation where the American president is also the speaker of the house and the Senate has been reduced to a ceremonial role (and if you want to match Britain in particular to the US, you also have to match other unrelated features like federalism and the strength of judicial review).
        Arjun Panickssery 28 Jan 2026 3:49 UTC
        2 points
        0
        Parent
        Maybe one distinction here is that you mention this question: Under which office can a random maniac who somehow ends up in that position cause more chaos or seize power?
        But there is another question: Which office in practice results in more powerful officeholders, holding the population itself constant?
    - dsj 26 Jan 2026 11:34 UTC
      1 point
      0
      Parent
      The U.S. president holds a weaker office than the heads of government in most other countries. The Canadian and British PMs and the French presidents definitely seem stronger; the German Chancellor seems weaker, and maybe the Israeli and Italian and Japanese PMs? (These aren’t strong views). I most often hear from proponents of the parliamentary system that it is less gridlocked and more powerful/effective rather than less.
      It is less gridlocked, but that’s because the PM works for parliament and serves at its pleasure, much as a CEO for a board of directors. The PM normally can be removed by simple majority vote of no confidence at any time. While somewhat infrequent, this occurs often enough — and is a plausible enough threat even when it does not occur — that it cannot really be called exceptional in the way that the successful removal of a president via impeachment would be (which in the US is structurally very burdensome: demanding actual wrongdoing — “high crimes and misdemeanors” — rather than a mere loss of confidence, a majority in the House, an entire trial, and then a two-thirds majority in the Senate, and we have seen how difficult this bar is to meet even for extraordinarily unusual behavior). Furthermore, the PM has no formal say in legislation, which is another reason for less gridlock (though typically, as the head of their party, they do have great influence, but again, only so long as they can maintain a governing coalition within parliament).
      It is precisely because of the gridlock created by a presidential system, with its “checks and balances”, that over time more power tends to be arrogated to the president in order to “get things done” that aren’t getting done otherwise, often without the political will to stand in the way of such arrogation when it occurs.
      In the US specifically, another way in which the president has recently gained tremendous power stems from these “checks and balances”: the Supreme Court has opined that if presidential acts were subject to regular law, then this would give Congress the power to limit Article II presidential power.^[1] This kind of consideration is normally not at issue in a parliamentary system, and thus the PM is normally subject to criminal law.
      ^
      This basic logic seems very defensible to me, although they seem to have extended the notion of “official [presidential] acts” substantially beyond anything explicit in the Constitution, and then gone even further, to preclude not only prosecution for such acts, but even judicial consideration of such acts as evidence in a prosecution for non-official acts, under the theory that allowing such evidence would have a chilling effect on the president’s freedom to act within constitutional limits. However, this is very different from how we treat speech: we don’t say that a tweet is inadmissible in court as evidence for a non-speech crime, even though the tweet itself may be constitutionally protected speech which must not be chilled.
      - Arjun Panickssery 27 Jan 2026 7:18 UTC
        2 points
        0
        Parent
        The PM normally can be removed by simple majority vote of no confidence at any time. While somewhat infrequent, this occurs often enough — and is a plausible enough threat even when it does not occur — that it cannot really be called exceptional in the way that the successful removal of a president via impeachment would be
        This isn’t because the president can’t pass legislation on his own, so without the support of Congress he’s a lame duck even without removal. And you ignore other elements:
        There are more differences than you mention. The PM is less hindered by the independent judiciary than the president. The PM in a Westminster system also exerts greater control over the individual legislators via his party than in the American system. The PM can serve for an unlimited time, and call elections at strategic moments, while Trump is limited to two terms. All these things increase the power of the PM and the risk of oppressive rule in Westminster-style parliamentary systems.
        It is precisely because of the gridlock created by a presidential system, with its “checks and balances”, that over time more power tends to be arrogated to the president in order to “get things done” that aren’t getting done otherwise, often without the political will to stand in the way of such arrogation when it occurs.
        This is a recent historical trend and not a defining feature of the system itself.
        dsj 27 Jan 2026 21:30 UTC
        1 point
        0
        Parent
        This isn’t because the president can’t pass legislation on his own, so without the support of Congress he’s a lame duck even without removal.
        
        I think you mean it is because of that, not that it isn’t? But let me know if I’ve misunderstood you. I agree so far as legislation is concerned, though of course the president has a huge amount of power beyond the ability to legislate.
        There are more differences than you mention. The PM is less hindered by the independent judiciary than the president. The PM in a Westminster system also exerts greater control over the individual legislators via his party than in the American system. The PM can serve for an unlimited time, and call elections at strategic moments, while Trump is limited to two terms. All these things increase the power of the PM and the risk of oppressive rule in Westminster-style parliamentary systems.
        I agree that some of these are differences giving a PM more power, in particular the ability to serve indefinitely and call elections strategically (which seems quite bad). The rest do not seem to me to be inherent in parliamentarianism, and indeed it is not clear to me that they are even tendencies.
        This is a recent historical trend and not a defining feature of the system itself.
        It’s not just a historical trend within the US though, but an observed tendency of other presidential systems, and does follow somewhat from the game-theoretic logic of that system.
        Arjun Panickssery 28 Jan 2026 3:54 UTC
        2 points
        0
        Parent
        I think you mean it is because of that, not that it isn’t?
        Yes, that’s a typo.
        It’s not just a historical trend within the US though, but an observed tendency of other presidential systems
        This is too historically contingent. Presidential systems have dominated the less stable American and African countries while European and Asian countries that have been more stable more often have parliaments. I’m not convinced that there is empirical evidence of this kind.
        I agree that parliaments have a much more intuitive nature. Corporations are run with a sovereign board who appoints a dictatorial CEO, not with independent branches of power in a balance.
        Why do you think it’s better to have term limits?
- Garrett Baker 23 Jan 2026 18:04 UTC
  12 points
  2
  Parent
  Yeah, I remember in high school civics I could not understand in what sense the tripartite system of government we have constituted a “balance of powers”, when the only branch of government with any meaningful amount of guns was the executive, ruled by a singular president.^[1]
  
  Until very recently it felt like a miracle anything worked at all, and my impression is that it worked so well in the past because congress had much much more day-to-day decision making power and was much more plugged into the information sources, then the “seniority system” was instantiated, congress became senile, and FDR got unprecedented control over the war-time economy, and took the opportunity to transfer many decision making roles and bodies from congress to the executive.
  
  When congress is made up of the old and senile, and relies on the president to be their eyes, ears, hands, and brain, it just makes more and more sense to delegate broader and broader powers to the executive, who has the better qualified staff, more information, and a quicker reaction time.
  
  The courts have never been all that powerful, except when they had the implicit backing of the president or congress. When they haven’t clearly had that, my impression is they have made sure not to command the executive to take any meaningful actions.
  1. ↩︎
    My teacher’s response to these questions & my confusion over their responses was to kick me out of the classroom into the hallway. I gained quite a positive reputation among students and teachers after that!
  - Eli Tyre 23 Jan 2026 22:11 UTC
    6 points
    2
    Parent
    then the “seniority system” was instantiated, congress became senile, and FDR got unprecedented control over the war-time economy, and took the opportunity to transfer many decision making roles and bodies from congress to the executive.
    I’ve had an inkling that a lot of things that are broken about the US political system can be traced back to congress being ineffective, which can be traced back to power being held predominantly by the most senior congresspeople. But I don’t really know enough to know if this is right, or even the ways in which the “seniority system” has impacted how congress works.
    
    But I would eagerly read a post describing how this change came about and what downstream factors it impacted.
    - Garrett Baker 24 Jan 2026 5:04 UTC
      10 points
      4
      Parent
      
      But I would eagerly read a post describing how this change came about and what downstream factors it impacted.
      
      I cannot recommend more strongly the first three chapters of Robert Caro’s Master of the Senate on this subject. It gives a full political history of the senate, and essentially its fall from grace, starting as the most powerful single component of the US government, hailed the world over for being the most competent and thoughtful political organization on the planet, to its ineptitude becoming the butt of jokes on TV and barely being considered during the signing of routine treaties.
      - Eli Tyre 24 Jan 2026 19:49 UTC
        2 points
        0
        Parent
        The Master of the Senate covers the 50s and early 60s? I thought the seniority system in congress was younger than that.
        Garrett Baker 24 Jan 2026 19:59 UTC
        5 points
        0
        Parent
        Caro is extremely comprehensive and will write small mini-books on the history of every significant institution or person LBJ ever touched. That means that The Master of the Senate begins in like 1810 and gives a complete history of the Senate up until LBJ is elected into the body.
        Elizabeth 24 Jan 2026 19:59 UTC
        4 points
        2
        Parent
        I’m reading Caro’s Path to Power now, and he says the seniority system was well established in the House by Johnson’s arrival in the Depression.
    - [ ]
      [deleted]
- NunoSempere 25 Jan 2026 19:15 UTC
  9 points
  0
  Parent
  One of the mechanisms that the judiciary has to constrain the military are the Judge Advocate Generals, embedded in the military. These were fired at the beginning of Trump’s term: https://www.jurist.org/features/2025/02/26/explainer-jag-firings-spark-concerns-about-us-military-legal-oversight/
  “When you start firing the military’s top lawyers, that means you are getting ready to order the military to do unlawful things. Trump replaces those JAGs with men who will justify any future unlawful and unethical actions that he wants the military to do,” wrote Democratic political candidate and former US fighter pilot Amy McGrath, also via X.
  Presented with concerns about the implications of these firings, Hegseth said in a Fox News interview, “We want lawyers who give sound constitutional advice and don’t exist to attempt to be roadblocks.”
- Petropolitan 24 Jan 2026 12:09 UTC
  9 points
  2
  Parent
  Dear Kurt, to be honest, please don’t discuss this topic at your citizenship hearing!
  — Einstein to Gödel in 1947, probably
  - Abhinav Gunturi 24 Jan 2026 20:18 UTC
    1 point
    0
    Parent
    I’ve always read the Gödel citizenship anecdote less as “this topic is dangerous” and more as “formal systems have edge cases, please don’t bring them up at awkward social moments.”
    That mostly works, until it doesn’t. Which is true of basically everything humans make up, including citizenship hearings. But wouldn’t life be so boring if we couldn’t gamble?
- leogao 23 Jan 2026 18:09 UTC
  7 points
  1
  Parent
  just like with certain events occurring in November of 2023, it seems like it ultimately comes down to how much pre-existing respect members of the executive branch and military have for the Supreme Court vs the President, and whether the publicly known facts of the dispute seem to obviously favor one side over the other. for example, it seems pretty clear that if trump wanted to serve a third term, and the supreme court says lol no that’s obviously unconstitutional, nobody would listen to Trump even if he could technically fire them.
- MKodama 23 Jan 2026 17:52 UTC
  7 points
  4
  Parent
  My understanding is that the US Marshals are not only accountable to the Court either. They take their day-to-day commands from the Director of the US Marshals Service, a presidential appointee, who in turn reports to the US Attorney General, another presidential appointee.
  This makes me even more doubtful that the US Marshals would side with SCOTUS and, eg, arrest the President in a worst-case constitutional crisis. Both the Marshals’ boss and their boss’s boss would likely side with the President, having been chosen by him for their loyalty.
- Joey KL 23 Jan 2026 7:41 UTC
  7 points
  0
  Parent
  The US Marshals are charged with carrying out court orders but are actually part of the executive branch, which makes it even less plausible they would materially stand up to the president.
  - Joey KL 24 Jan 2026 19:55 UTC
    1 point
    0
    Parent
    Oh, I didn’t see your footnote. I didn’t know about the US Marshals vs. Marshal of the US Supreme Court distinction. That’s interesting and confusing!
    - habryka 24 Jan 2026 20:43 UTC
      4 points
      0
      Parent
      I edited it after your comment! The original quick take was indeed wrong!
- Josh You 23 Jan 2026 4:09 UTC
  6 points
  0
  Parent
  In practice most federal offices have deferred to what the Supreme Court says, but we haven’t really seen what happens when e.g. a sitting president insists on an interpretation of the constitution that disagrees, and the constitution itself provides no clear answer.
  This is a somewhat confusing statement. To be clear, it’s extremely common for the president to disagree with courts on the law or Constitution: this happens dozens of times per presidential term. And when they lose in court the president may declare that they still think they are right and the Court ruled incorrectly. But this wouldn’t cause a constitutional crisis or anything by default: the president almost always follows court orders or court opinions. It’s a very ingrained norm in the US that court orders, especially from the Supreme Court, are binding.
  (relevant thread from a lawyer early last year on the powers and tools that courts have to force a president or other federal officials to follow their court orders, such as freezing assets).
  I think there’s a lot of reasoning here that effectively goes “if the president has absolute power such that the military and federal officers will always listen to his orders, then the US legal system will have trouble reining him in.” Which is kind of just begging the question. But somewhere in the chain of events you suggest, the president would break a lot of clear red lines and probably lose nearly all of his political support from the general population and the powerful elements of society, unless he has already broadly persuaded people that his power-grabbing actions are actually a good idea.
  - habryka 23 Jan 2026 7:04 UTC
    9 points
    0
    Parent
    the president almost always follows court orders or court opinions. It’s a very ingrained norm in the US that court orders, especially from the Supreme Court, are binding.
    Sorry if this wasn’t clear. The whole point of this exploration is to figure out what happens when the president does not follow court orders. I will adjust the intro to clarify that.
    I agree this would be approximately unprecedented! But it seems very much a scenario worth exploring. I made these edits to make that clearer:
    So, let’s say the supreme court wants to stop a sitting president from destroying democracy in America. First, they release a judgement saying that something the executive branch is doing is unconstitutional. Hopefully the U.S. president agrees and then just stops doing that. But what happens when the executive branch keeps doing it anyways?
    [...]
    So, with this background knowledge, I see roughly 4 big ways the supreme court can try to rein in an out-of-control executive branch that isn’t listening to a judgement they made:
    Hope that makes it clearer to future readers!
  - Arjun Panickssery 23 Jan 2026 20:50 UTC
    4 points
    0
    Parent
    But this wouldn’t cause a constitutional crisis or anything by default: the president almost always follows court orders or court opinions.
    There’s some complexity because historically the Supreme Court also tends to show restraint by not making rulings that it doesn’t expect people to follow.
    I think in general when discussing these “Constitutional crisis” topics it helps to:
    1. Not think too much in terms of formalities but only in terms of norms
    2. Look back in history for what is precedented or unprecedented, partly because this will also decouple the discussion from debates about current political questions/actors
  - Elizabeth 23 Jan 2026 6:10 UTC
    4 points
    0
    Parent
    (relevant thread from a lawyer early last year on the powers and tools that courts have to force a president or other federal officials to follow their court orders, such as freezing assets).
    “Force” seems strong compared to what the thread says. He starts with “no chance of the White House successfully refusing to comply” but for every mechanism except freezing assets, he caveats with “this might work”
- Josh You 23 Jan 2026 16:17 UTC
  5 points
  2
  Parent
  Another point here is that elections are an additional check after the courts, Congress, etc. US presidential elections are not administered by the federal government, they are administered by the states. So to interfere with elections, the president can’t just fill election boards with cronies or give orders to anyone in his chain of command to rig the election. He’d have to forcibly manipulate or interfere with state officials and state governments, risking direct conflict with states. And if he doesn’t interfere with the election and the states announce results showing he lost in a landslide, his political power almost certainly evaporates. Of course, if all the president’s crazy actions are in fact popular, then he is much more likely to succeed and stay in power for many reasons.
  Again, if you assume the military always slavishly follows the president, then this ends up in a civil war with a plausible military victory for the president. But each escalation into “this is obviously illegitimate” means the president increasingly offends his generals’ sense of duty, decreases the probability of success and increases the legal and political risk for the officers following his orders, increases the size and motivation of the inevitable popular resistance, etc.
  - habryka 24 Jan 2026 19:37 UTC
    2 points
    0
    Parent
    Isn’t the obvious thing to do here to just imprison/jail/deport/exile your political opponents? The supreme court will of course object, but that’s the whole scenario we are playing out here. My sense is this is a relatively common thing to do if a president wants to stay in power.
    But each escalation into “this is obviously illegitimate” means the president increasingly offends his generals’ sense of duty, decreases the probability of success and increases the legal and political risk for the officers following his orders, increases the size and motivation of the inevitable popular resistance, etc.
    I agree that there is some broad sense in which this must be true, but I do think this hasn’t so far been particularly true in this administration? Maybe not super worth going into a ton of local political details, but I think history more broadly also shows that in many cases you can make up for doing things that are obviously illegitimate by looking like a bold, strong and decisive leader, and by threatening force to anyone who opposes you. So I don’t really buy there is the nice linear correlation that you say there is here.
- Eli Tyre 23 Jan 2026 22:00 UTC
  4 points
  −4
  Parent
  and understanding how seriously they would take supreme court making a clear judgement (and e.g. would be open to protecting U.S. marshals while they enforce supreme court judgement) seems like one of the most crucial questions.
  I agree.
  
  Also this is a scary question to investigate, because (on my current model, as described by this book), this is a Keynesian beauty contest—how almost everyone will act depends on how they expect almost everyone to act. Trying to get clarity about the question of how seriously the members of the armed forces take their oath to the constitution, or how they interpret the meaning of that oath, is much less of a neutral act than most exercises in figuring something out (even taking for granted that figuring stuff out often has implications for political conflicts, as the contextualizers cry, this question is particularly politically laden).
  For this question more than most, the prediction market is probably a self-fulfilling prophecy. Which doesn’t mean that you shouldn’t have a prediction market, but it does seem like you should contend with the self-fulfilling nature somehow.
  
  I feel on shaky ground here. It seems plausible to me that, if I have the opportunity, I should mostly not try to predict the answer to this question, I should mostly just try to reinforce the equilibrium of “the armed forces’ first loyalty is to the constitution.”
  
  Or at least, I feel like I don’t have a developed philosophy of how to deal with questions that are a mix of epistemic prediction and coordination game.
  
  To be clear: were I to take the above stance, I would continue to refrain from ever lying. Though I might also refrain from answering some classes of questions. (Also this is probably an academic point because I’m not likely to have much influence on what the US armed forces believe about what others in the US armed forces will do.)
  
  I bet @Andrew Critch and @Richard_Ngo have thought about this question.
- ChristianKl 23 Jan 2026 13:18 UTC
  4 points
  2
  Parent
  When militaries consider the president to be illegitimate and bad for the country, historically they have frequently staged military coups.
- jmh 23 Jan 2026 11:55 UTC
  4 points
  0
  Parent
  Could you point to your source for the claim about the Marshals Service falling under the Judicial Branch of the government? My understanding is that it belongs to the DoJ so would fall under the Executive Branch.
  Separately, I do wonder if we’re speculating about cases that might be labeled as falling in the gray area of the incomplete contract (the Constitution). I wonder what might happen if States claim their right to call out their National Guard and perhaps even the more general militia (interesting if that could be a State draft or purely voluntary, i.e., giving military arms to able-bodied men), the President calls out the military, and then Congress tells all the military their pay is frozen, meaning not only DoD and its branches but the servicemen and any contractors.
  If Treasury just says go ef’ yourself Congress and cuts the checks, not much hope. But what if the banking system refused to honor them given the S.C and Congress’s rulings?
  Seems like at this point we’re talking about some serious brinkmanship, and to be honest I would really prefer not to live in such times (like many actually get a choice here) given the potential for escalation to all-out civil war. But I do wonder if perhaps the bigger checks here might not be the informal checks and balances. It seems that perhaps in the scenario envisioned (as I understand it: a serious breakdown in government processes and checks and balances among the branches) even applying any presumably defined law or division of power is very problematic, which is a bit different from saying the other branches should not try.
  But I would also think (as seems true today) you simply don’t get to the situation suggested without the government processes and functions related to checks and balances already having deteriorated to the point of dysfunction, which I would suggest is the case and has been developing for many years (50? 100?). We’ve seen a lot of political structure innovation that is not quite consistent with the Constitution (Congressional delegation of powers, partnership among the branches for efficiency reasons, party domination that serves to eliminate the assumed checks and balances...).
  - habryka 23 Jan 2026 20:56 UTC
    8 points
    0
    Parent
    Could you point to your source for the claim about the Marshals Service falling under the Judicial Branch of the government? My understanding is that it belongs to the DoJ so would fall under the Executive Branch.
    Source: I made it up!
    Apparently I was wrong. There is a Marshal under the direct control of the supreme court, but it’s just a single guy, who does control a police force, but the mandate of that police force is to protect the supreme court, not to enforce orders. I’ll try to update the post with my new understanding tonight.
    - jmh 24 Jan 2026 0:13 UTC
      2 points
      0
      Parent
      Source: I made it up!
      LOL—HSI hallucinations?
      No worries, we all make some mistakes with our assumptions at times and forget to double check every fact. I think it was a minor and largely trivial error relative to the larger point. I just wasn’t sure and did a quick Google check (so had Gemini answering, but I’ve seen it hallucinate enough to not take it as certain) but that can easily miss some finer points.
      - habryka 24 Jan 2026 0:39 UTC
        5 points
        0
        Parent
        I confused the Supreme Court Marshal and the U.S. Marshals. It’s particularly easy to confuse them because the job of the U.S. Marshals is to enforce court orders, it just happens to be under the control of the executive.
  - Garrett Baker 23 Jan 2026 18:00 UTC
    5 points
    2
    Parent
    
    Could you point to your source for the claim about the Marshals Service falling under the Judicial Branch of the government? My understanding is that it belongs to the DoJ so would fall under the Executive Branch.
    
    I believe this is a minor mistake, see my other comment.
  - Garrett Baker 23 Jan 2026 17:47 UTC
    4 points
    0
    Parent
    
    what if the banking system refused to honor them given the S.C and Congress’s rulings?
    
    This is one reason why the independence of the Fed is important. The US has a very centralized banking system which is technically governed by the executive branch.
- Oliver Sourbut 23 Jan 2026 8:22 UTC
  4 points
  0
  Parent
  I think there’s a magic ^[1] that the military is somehow also fairly firmly aligned with the constitution and non-partisan, though nominally also under the president’s command. I don’t really get it, and I don’t know how much this helps.
  1. ↩︎
    and it is magic, as in it’s an inexplicable (to me) and presumed-contingent (social) technology
  - TsviBT 23 Jan 2026 21:21 UTC
    16 points
    0
    Parent
    As a piece of info about the current real-life situation, there was an episode recently where some members of Congress made a PSA saying that military people should not follow unlawful orders, and then the president called for them to be locked up (or executed). https://www.deseret.com/politics/2025/11/20/trump-responds-as-democratic-lawmakers-direct-video-message-at-troops/?utm_source=chatgpt.com
    - Shankar Sivarajan 24 Jan 2026 6:24 UTC
      −3 points
      −5
      Parent
      For all that they’re now playing coy, they were clearly implying the President had given unlawful orders, and were calling on the military to disobey him.
      - Haiku 25 Jan 2026 23:07 UTC
        2 points
        0
        Parent
        I wouldn’t be so sure that e.g. Mark Kelly was implying that the President himself had given unlawful orders. (I am open to evidence that this is what was being implied, or that this actually occurred.) The boat double-tap incident in particular suggested that unlawful orders may have been given by someone in the chain of command. Minus any speculative or actual nth-order effects, I think it was a sensible time to remind service members not to follow unlawful orders.
        And of course, the POTUS himself frequently declines to defer to laws that would constrain him, so the idea that he might give unlawful orders shouldn’t be surprising to people in any given political camp.
      - Viliam 26 Jan 2026 11:16 UTC
        −3 points
        0
        Parent
        The video is 83 seconds long, anyone who wants to express an opinion on this topic might want to see it first.
  - Garrett Baker 23 Jan 2026 17:42 UTC
    7 points
    2
    Parent
    I have heard this claim before, but have never been given a reason to believe it. Why do you believe the military is under the spell of “listen to the constitution” and not “listen to orders”? The latter seems a far more prevalent sentiment given the practical fact that often if you don’t listen to orders, you and your friends will wind up dead.
  - jmh 23 Jan 2026 12:01 UTC
    5 points
    0
    Parent
    Generally I do agree but given the current Secretary and some of the appointees I would question how strong that “magic” might be. Do you think some Generals or Armies/Divisions would rise up to oppose some core units that are aligned with such a President/Administration? At what point might they do that—the first case of S.C ruling something unconstitutional but the President continues? Or is it more likely they stay in their place and we just keep sliding down the slippery slope?
- Zach Stein-Perlman 24 Jan 2026 22:47 UTC
  2 points
  2
  Parent
  Importantly, as far as I can tell, from a purely constitutional perspective the supreme court has no more authority to direct any members of the executive branch to do anything than I do. Their only constitutional power is to call for federal officers to refuse to do something. Asking them to do anything proactive would go relatively clearly against their mandate.
  In case nobody else has mentioned it: this is false, courts order actions all the time, e.g. ordering agencies to do stuff that they’re legally required to do. Ask your favorite chatbot.
  - habryka 25 Jan 2026 5:11 UTC
    2 points
    0
    Parent
    I don’t think there is any authority here from a constitutionalist perspective? Like, the supreme court can order “the executive” to do something (and it might direct that order at a smaller part of the executive), but if the president disagrees, the constitution seems pretty clear that the job of the relevant executive agency would be to at most do nothing. Going directly against presidential orders and taking direct orders from the supreme court would be a pretty clear constitutional violation, at least as far as my understanding goes.
    - Zach Stein-Perlman 25 Jan 2026 5:23 UTC
      4 points
      0
      Parent
      I would bet that lawyers, scholars, and chatbots would basically disagree with you. Your literal reading of the Constitution is less important than norms/practice.
      - habryka 25 Jan 2026 6:10 UTC
        3 points
        0
        Parent
        My post is framed centrally as constitutionalist analysis, so I was trying to not get too bogged down in precedent and practicalities, which are just much harder to model (though of course the line here is blurry).
        That said, after thinking and reading more about it, I still changed my mind at least a bit. The key thing I wasn’t modeling is the Supreme Court’s ability to declare injunctions against specific government officers, exposing them to more personal liability. Even if the executive doesn’t cooperate, the court can ask civilian institutions like banks to freeze their bank accounts or similar things, and my guess is many of them would comply.
        I rewrote the relevant section to reflect my updated understanding. Let me know if anything still seems wrong by your lights.
- lilkim2025 23 Jan 2026 10:11 UTC
  2 points
  −9
  Parent
  I’ve mentioned before that both sides of this conflict see a clear precedent of unconstitutional action by their opponents that would destroy everything they care about if left unchecked. All of the processes described by OP would be taken as a coup by the side targeted by them.
  I’ve noticed a recent surge in very partisan posts on LessWrong, generally similar in tone and content to this one, in which one side’s perspective is presented as unassailable fact and the other’s is not mentioned. This is dangerous, both in the sense that we have seen many communities elsewhere^[1] lose the things that made them unique after being taken over by partisan political content, and in the very literal sense that American politics is at a breaking point right now, and encouraging unwise action could have very real consequences for very large numbers of people. “The military should renounce the elected president and fight against the government” is not something to say lightly, and, regardless of who won the resulting conflict, life would be perilous and uncomfortable for everyone living in America for several decades thereafter.
  I realize this probably isn’t in line with the sentiments of most of the comments section on this post, but I would ask that you consider an extension of Chesterton’s Fence: “Do not, directly or indirectly, declare a large group of people to be your enemy until you can explain, from their perspective, why they are doing what they are doing.”
  1. ^
    (see the comments, in which many very Reddit users, most of them left-leaning, lament what has become of much of their website)
  - habryka 23 Jan 2026 21:38 UTC
    20 points
    7
    Parent
    I was really trying to write this post largely from a “what would be the options for the judicial branch” in a generic way where it would apply to many presidencies, and trying to keep specific partisan judgements out of it.
    To be clear, I do think pretty scary things are happening with U.S. democracy right now, and my motivation and attention is driven by what makes sense to do about a Trump presidency, but I still think it’s usually best to keep things focused on more general principles that could apply to many situations.
    “The military should renounce the elected president and fight against the government” is not something to say lightly, and, regardless of who won the resulting conflict, life would be perilous and uncomfortable for everyone living in America for several decades thereafter.
    Totally! And just for the sake of clarity, I absolutely do not think the current military should renounce the elected president and fight against the executive branch (you used the word “government” but to be clear, the supreme court and the states are also the government!). I do think what the actual military is supposed to do from a constitutionalist perspective when different parts of the government disagree and give conflicting orders is quite important and a pretty tricky question that I didn’t know the answer to before I researched and wrote this (and still have a lot of uncertainty on).
    - RHollerith 25 Jan 2026 1:02 UTC
      3 points
      −13
      Parent
      
      you used the word “government” but to be clear, the supreme court and the states are also the government!
      
      In British English, “the government” means the executive branch, and the entire thing (including the judiciary and the legislature) is called the state.
      
      And while I have your attention allow me to echo the person you are replying to, namely, it would be ideal if a reader could not even tell from your comments which party you prefer (since you run the site) and great grandparent is pretty strong evidence for which one.
      - habryka 25 Jan 2026 5:14 UTC
        12 points
        8
        Parent
        Why… would that be ideal? I certainly do not consider my opinions on policy and politics to be forbidden on this site? The topic of politics itself should be approached with care, but certainly it would be if anything a pretty bad violation of what I would consider good conduct if people systematically kept their opinions on politics and policy hidden. Those things matter!
        RHollerith 25 Jan 2026 20:47 UTC
        2 points
        0
        Parent
        My worry is one or two people loyal to the red team leave the site, which makes people on the blue team feel more free to use the site to criticize the red team, causing more red teamers to leave (and attracting blue-team zealots who filter everything through an ideological lens) in a positive feedback loop ending in a site with the same problem as Bluesky already has and many subreddits already have, namely, the zealots produce large quantities of low-quality writing, which drowns out the high-quality contributions and discourages many who can make high-quality contributions from even starting to contribute.
        
        ADDED. Since LW is currently very far from Bluesky, perhaps it would’ve been more persuasive for me to argue that if LW were to start to have even half as many low-effort political comments as Hacker News, many would probably stop reading LW, or at least that is my worry.
        habryka 26 Jan 2026 3:50 UTC
        2 points
        0
        Parent
        Yeah, definitely agree. I just think the standard of “admins should comment in a way that makes it impossible to tell what their political opinions are” is not the best tool to achieve this. I think it’s better for people to be open about their views, and also try really hard to be principled and fair.
      - tslarm 25 Jan 2026 1:19 UTC
        6 points
        4
        Parent
        please don’t even imply that it is natural for a LW reader to prefer one of the US political parties over the other.
        On what grounds? There’s always been a norm on LW of treating some highly controversial questions as basically settled. Good-faith disagreement will still be heard and engaged with, but it’s normal to, for example, take atheism for granted. The same goes for values-based disagreements; it’s taken for granted that some versions of the future are obviously preferable to others. So if one US political party is, factually, working against the values of most LW readers much harder than the other one, why is it off-limits to make comments discussing the implications of that?
        RHollerith 25 Jan 2026 1:34 UTC
        2 points
        0
        Parent
        https://www.lesswrong.com/posts/9weLK2AJ9JEt2Tt8f/politics-is-the-mind-killer
        tslarm 25 Jan 2026 6:22 UTC
        5 points
        3
        Parent
        https://www.lesswrong.com/posts/9weLK2AJ9JEt2Tt8f/politics-is-the-mind-killer
        Yes, I’ve read it, and it doesn’t say what you seem to be implying it says.
        If you want to make a point about science, or rationality, then my advice is to not choose a domain from contemporary politics if you can possibly avoid it.
        [my emphasis, here and below]
        I’m not saying that I think we should be apolitical, or even that we should adopt Wikipedia’s ideal of the Neutral Point of View. But try to resist getting in those good, solid digs if you can possibly avoid it.
        Eliezer’s main point was that we should avoid unnecessary politics, especially cheap political digs that may please some readers but risk needlessly alienating others. Here, the thing being discussed is inherently political and inseparable from the partisan divide.
        habryka 25 Jan 2026 8:16 UTC
        4 points
        0
        Parent
        I do want to avoid gaslighting people. LessWrong and LessWrong 2.0 under my management have discouraged U.S. politics content for many years. We stopped around 4-5 years ago, as politics started being more relevant to many people’s goals on the site, though we still don’t allow it on the LW frontpage unless it tries pretty hard to keep things timeless and non-partisan.
        tslarm 25 Jan 2026 10:14 UTC
        4 points
        1
        Parent
        Fair, but I see this as two distinct things:
        Politics is the Mind-Killer: still applies; protects this forum from redditification and encourages us to avoid pointlessly alienating people/making enemies of each other.
        US politics posting allowed/discouraged/banned: I’m not too fussed about where you set this dial. But if political discussion is going to happen here, I think it would be bad if you/we got pressured into bothsidesism (which could happen if PitMK is misrepresented as a prohibition on openly taking partisan-coded positions).
        Shankar Sivarajan 25 Jan 2026 1:35 UTC
        0 points
        −2
        Parent
        why is it off-limits to make comments discussing the implications of that
        Spending the better part of two decades harping on about how precisely that is the mind-killer makes it a little tricky to reverse that position.
  - Elizabeth 23 Jan 2026 18:19 UTC
    13 points
    5
    Parent
    I greatly appreciate the context you provided in the linked comment, and in general the attempt to explain why an underrepresented side views their choices as reasonable or necessary. I want to do what I can to support you continuing to bring up counterpoints and things people are missing.
    This particular post reads to me as president-neutral, in that you could post it on a conservative-leaning forum under a democratic president and it would look equally in tune with local culture. Maybe I’m wrong about that, it’s easy to read things that match one’s own worldview as neutral, in which case I’m asking for specifics on what makes this not neutral.
    One guess, based on your other comment, is that Habryka takes the legitimacy of the court for granted, in which case I’d like to dig into more detail on that.
  - jmh 23 Jan 2026 12:17 UTC
    13 points
    5
    Parent
    I think it is very easy to read into a post like this and essentially fall into the very behavior you’re ascribing to the author. Regardless of the OP’s view, the post is not naming names but is very topical. It’s worth considering.
    But I do agree that whoever is getting told their actions are unconstitutional will typically see that as an attack if they truly believe they are doing something within their powers. But I also suspect any that refuse to accept a Supreme Court ruling never cared about the Constitution or the checks and balances that were implemented in the Constitution. It’s simply a case of someone refusing to accept they are not a good judge of their own case which is pretty much at the heart of any rule of law society.
habryka 9 May 2019 19:12 UTC
82 points
6
Thoughts on integrity and accountability
[Epistemic Status: Early draft version of a post I hope to publish eventually. Strongly interested in feedback and critiques, since I feel quite fuzzy about a lot of this]
When I started studying rationality and philosophy, I had the perspective that people who were in positions of power and influence should primarily focus on how to make good decisions in general and that we should generally give power to people who have demonstrated a good track record of general rationality. I also thought of power as this mostly unconstrained resource, similar to having money in your bank account, and that we should make sure to primarily allocate power to the people who are good at thinking and making decisions.
That picture has changed a lot over the years. While I think there is still a lot of value in the idea of “philosopher kings”, I’ve made a variety of updates that significantly changed my relationship to allocating power in this way:
- I have come to believe that people’s ability to come to correct opinions about important questions is in large part a result of whether their social and monetary incentives reward them when they have accurate models in a specific domain. This means a person can have extremely good opinions in one domain of reality, because they are subject to good incentives, while having highly inaccurate models in a large variety of other domains in which their incentives are not well optimized.
- People’s rationality is much more defined by their ability to maneuver themselves into environments in which their external incentives align with their goals, than by their ability to have correct opinions while being subject to incentives they don’t endorse. This is a tractable intervention and so the best people will be able to have vastly more accurate beliefs than the average person, but it means that “having accurate beliefs in one domain” doesn’t straightforwardly generalize to “will have accurate beliefs in other domains”.
  
  One is strongly predictive of the other, and that’s in part due to general thinking skills and broad cognitive ability. But another major piece of the puzzle is the person’s ability to build and seek out environments with good incentive structures.
- Everyone is highly irrational in their beliefs about at least some aspects of reality, and positions of power in particular tend to encourage strong incentives that don’t tend to be optimally aligned with the truth. This means that highly competent people in positions of power often have less accurate beliefs than much less competent people who are not in positions of power.
- The design of systems that hold people who have power and influence accountable in a way that aligns their interests with both forming accurate beliefs and the interests of humanity at large is a really important problem, and is a major determinant of the overall quality of the decision-making ability of a community. General rationality training helps, but for collective decision making the creation of accountability systems, the tracking of outcome metrics and the design of incentives is at least as big of a factor as the degree to which the individual members of the community are able to come to accurate beliefs on their own.
A lot of these updates have also shaped my thinking while working at CEA, LessWrong and the LTF-Fund over the past 4 years. I’ve been in various positions of power, and have interacted with many people who had lots of power over the EA and Rationality communities, and I’ve become a lot more convinced that there is a lot of low-hanging fruit and important experimentation to be done to ensure better levels of accountability and incentive-design for the institutions that guide our community.
I also generally have broadly libertarian intuitions, and a lot of my ideas about how to build functional organizations are based on a more start-up like approach that is favored here in Silicon Valley. Initially these intuitions seemed in conflict with the intuitions for more emphasis on accountability structures, with broken legal systems, ad-hoc legislation, dysfunctional boards and dysfunctional institutions all coming to mind immediately as accountability-systems run wild. I’ve since then reconciled my thoughts on these topics a good bit.
Integrity
Somewhat surprisingly, “integrity” has not been much discussed as a concept handle on LessWrong. But I’ve found it to be a pretty valuable virtue to meditate and reflect on.
I think of integrity as a more advanced form of honesty – when I say “integrity” I mean “acting in accordance with your stated beliefs.” Where honesty is the commitment to not speak direct falsehoods, integrity is the commitment to speak truths that actually ring true to yourself, not ones that are just abstractly defensible to other people. It is also a commitment to act on the truths that you do believe, and to communicate to others what your true beliefs are.
Integrity can be a double-edged sword. While it is good to judge people by the standards they expressed, it is also a surefire way to make people overly hesitant to update. If you get punished every time you change your mind because your new actions are now incongruent with the principles you explained to others before you changed your mind, then you are likely to stick with your principles for far longer than you would otherwise, even when evidence against your position is mounting.
The great benefit that I experienced from thinking of integrity as a virtue, is that it encourages me to build accurate models of my own mind and motivations. I can only act in line with ethical principles that are actually related to the real motivators of my actions. If I pretend to hold ethical principles that do not correspond to my motivators, then sooner or later my actions will diverge from my principles. I’ve come to think of a key part of integrity being the art of making accurate predictions about my own actions and communicating those as clearly as possible.
There are two natural ways to ensure that your stated principles are in line with your actions. You either adjust your stated principles until they match up with your actions, or you adjust your behavior to be in line with your stated principles. Both of those can backfire, and both of those can have significant positive effects.
Who Should You Be Accountable To?
In the context of incentive design, I find thinking about integrity valuable because it feels to me like the natural complement to accountability. The purpose of accountability is to ensure that you do what you say you are going to do, and integrity is the corresponding virtue of holding up well under high levels of accountability.
Highlighting accountability as a variable also highlights one of the biggest error modes of accountability and integrity – choosing too broad of an audience to hold yourself accountable to.
There is a tradeoff between the size of the group that you are being held accountable by, and the complexity of the ethical principles you can act under. Too large of an audience, and you will be held accountable by the lowest common denominator of your values, which will rarely align well with what you actually think is moral (if you’ve done any kind of real reflection on moral principles).
Too small or too memetically close of an audience, and you risk not enough people paying attention to what you do, to actually help you notice inconsistencies in your stated beliefs and actions. The smaller the group that is holding you accountable is, the smaller your inner circle of trust, which reduces the amount of total resources that can be coordinated under your shared principles.
I think a major mistake that even many well-intentioned organizations make is to try to be held accountable by some vague conception of “the public”. As they make public statements, someone in the public will misunderstand them, causing a spiral of less communication, resulting in more misunderstandings, resulting in even less communication, culminating in an organization that is completely opaque about any of its actions and intentions, with the only communication being filtered by a PR department that has little interest in the observers acquiring any beliefs that resemble reality.
I think a generally better setup is to choose a much smaller group of people that you trust to evaluate your actions very closely, and ideally do so in a way that is itself transparent to a broader audience. Common versions of this are auditors, as well as nonprofit boards that try to ensure the integrity of an organization.
This is all part of a broader reflection on trying to create good incentives for myself and the LessWrong team. I will probably follow this up with a post that more concretely summarizes my thoughts on how all of this applies to LessWrong.
In summary:
- One lens to view integrity through is as an advanced form of honesty – “acting in accordance with your stated beliefs.”
  - To improve integrity, you can either try to bring your actions in line with your stated beliefs, or your stated beliefs in line with your actions, or rework both at the same time. These options all have failure modes, but also potential benefits.
- People with power sometimes have incentives that systematically warp their ability to form accurate beliefs, and (correspondingly) to act with integrity.
- An important tool for maintaining integrity (in general, and in particular as you gain power) is to carefully think about what social environment and incentive structures you want for yourself.
- Choose carefully who, and how many people, you are accountable to:
  - Too many people, and you are limited in the complexity of the beliefs and actions that you can justify.
  - Too few people, too similar to you, and you won’t have enough opportunities for people to notice and point out what you’re doing wrong. You may also not end up with a strong enough coalition aligned with your principles to accomplish your goals.
- Raemon 12 May 2019 2:57 UTC
  15 points
  0
  Parent
  Just wanted to say I like this a lot and think it’d be fine as a full fledged post. :)
  - Zvi 2 Jun 2019 11:36 UTC
    6 points
    0
    Parent
    More than fine. Please do post a version on its own. A lot of strong insights here, and where I disagree there’s good stuff to chew on. I’d be tempted to respond with a post.
    I do think this has a different view of integrity than I have, but in writing it out, I notice that the word is overloaded and that I don’t have as good a grasp of its details as I’d like. I’m hesitant to throw out a rival definition until I have a better grasp here, but I think the thing you’re in accordance with is not beliefs so much as principles?
  - Eli Tyre 2 Jun 2019 9:45 UTC
    1 point
    0
    Parent
    Seconded.
    - Kaj_Sotala 2 Jun 2019 15:45 UTC
      2 points
      0
      Parent
      Thirded.
      - Saul Munn 7 Jul 2024 20:15 UTC
        1 point
        0
        Parent
        fourthed. oli, do you intend to post this?
        
        if not, could i post this text as a linkpost to this shortform?
        habryka 7 Jul 2024 20:16 UTC
        2 points
        0
        Parent
        It’s long been posted!
        Integrity and accountability are core parts of rationality
        Saul Munn 7 Jul 2024 20:59 UTC
        1 point
        0
        Parent
        ah, lovely! maybe add that link as an edit to the top-level shortform comment?
- Eli Tyre 2 Jun 2019 9:45 UTC
  10 points
  0
  Parent
  This was a great post that might have changed my worldview some.
  Some highlights:
  1.
  People’s rationality is much more defined by their ability to maneuver themselves into environments in which their external incentives align with their goals, than by their ability to have correct opinions while being subject to incentives they don’t endorse. This is a tractable intervention and so the best people will be able to have vastly more accurate beliefs than the average person, but it means that “having accurate beliefs in one domain” doesn’t straightforwardly generalize to “will have accurate beliefs in other domains”.
  I’ve heard people say things like this in the past, but haven’t really taken it seriously as an important component of my rationality practice. Somehow what you say here is compelling to me (maybe because I recently noticed a major place where my thinking was majorly constrained by my social ties and social standing) and it prodded me to think about how to build “mech suits” that not only increase my power but incentivize my rationality. I now have a todo item to “think about principles for incentivizing true beliefs, in team design.”
  2.
  I think a generally better setup is to choose a much smaller group of people that you trust to evaluate your actions very closely,
  Similarly, thinking explicitly about which groups I want to be accountable to sounds like a really good idea.
  I had been going through the world keeping this Paul Graham quote in mind...
  I think the best test is one Gino Lee taught me: to try to do things that would make your friends say wow. But it probably wouldn’t start to work properly till about age 22, because most people haven’t had a big enough sample to pick friends from before then.
  ...choosing good friends, and doing things that would impress them.
  But what you’re pointing at here seems like a slightly different thing: which people do I want to make myself transparent to, so that they can judge if I’m living up to my values?
  This also gave me an idea for a CFAR style program: a reassess your life workshop, in which a small number of people come together for a period of 3 days or so, and reevaluate cached decisions. We start by making lines of retreat (with mentor assistance), and then look at high impact questions in our life: given new info, does your current job / community / relationship / life-style choice / other still make sense?
  Thanks for writing.
- mako yass 12 May 2019 8:41 UTC
  3 points
  0
  Parent
  I think you might be conflating two things under “integrity”. Having more confidence in your own beliefs than the shared/imposed beliefs of your community isn’t really a virtue or.. it’s more just a condition that a person can be in, whether it’s virtuous is completely contextual. Sometimes it is, sometimes it isn’t. I can think of lots of people who should have more confidence in other people’s beliefs than they have in their own. In many domains, that’s me. I should listen more. I should act less boldly. An opposite of that sense of integrity is the virtue of respect- recognising other people’s qualities- it’s a skill. If you don’t have it, you can’t make use of other people’s expertise very well. A person with a superfluence of respect is easily moved by others’ feedback, and is usually patient with their surroundings.
  On the other hand I can completely understand the value of {having a known track record of staying true to self-expression, claims made about the self}. Humility is actually a part of that. The usefulness of delineating that into a virtue separate from the more general Honesty is clear to me.
  - Pattern 4 Jun 2019 19:43 UTC
    3 points
    0
    Parent
    There’s a lot of focus on personally updating based on evidence. Groups aren’t addressed as much. What does it mean for a group to have a belief? To have honesty or integrity?
- Jasnah Kholin 28 Sep 2025 11:13 UTC
  1 point
  0
  Parent
  post I hope to publish eventually
  did you publish it eventually?
  - habryka 28 Sep 2025 17:59 UTC
    3 points
    0
    Parent
    Yep!
    
    Integrity and accountability are core parts of rationality
- ioannes 19 May 2019 15:58 UTC
  1 point
  0
  Parent
  See Sinclair: “It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”
habryka 28 Jun 2025 17:53 UTC
77 points
9
Gary Marcus asked me to make a critique of his 2024 predictions, for which he claimed that he got “7/7 correct”. I don’t really know why I did this, but here is my critique:
For convenience, here are the predictions:
- 7-10 GPT-4 level models
- No massive advance (no GPT-5, or disappointing GPT-5)
- Price wars
- Very little moat for anyone
- No robust solution to hallucinations
- Modest lasting corporate adoption
- Modest profits, split 7-10 ways
I think the best way to evaluate them is to invert every one of them, and then see whether the version you wrote or the inverted version seems more correct in retrospect.
We will see 7-10 GPT-4 level models.
Inversion: We will either see fewer than 7 GPT-4 level models, or more than 10 GPT-4 level models.
Evaluation: Conveniently Epoch did an evaluation of almost this exact question!
https://epoch.ai/data-insights/models-over-1e25-flop
Training compute is not an ideal proxy for capabilities, but it’s better than most other simple proxies.
Models released in 2024 with GPT-4 level compute according to Epoch:
```
Inflection-2, GLM-4, Mistral Large, Aramco Metabrain AI, Inflection-2.5, Nemotron-4 340B, Mistral Large 2, GLM-4-Plus, Doubao-pro, Llama 3.1-405B, grok-2, Claude 3 Opus, Claude 3.5 Sonnet, Gemini 1.0 Ultra, Gemini 1.5 Pro, Gemini 2.0 Pro, GPT 4o, o1-mini, o1, o3 (o3 was announced and published but not released until later in the year)
```
They also list 22 models which might be over the 10^25 FLOP threshold that GPT-4 was trained with. Many of those will be at GPT-4 level capabilities, because compute-efficiency has substantially improved.
Counting these models, I get 20+ models at GPT-4 level (from over 10 distinct companies).
Your prediction seems to me to have somewhat underestimated the number of GPT-4 level models that would be released in 2024. I don’t know whether you intended to put more emphasis on the number being low or high, but it definitely isn’t within your range.
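As a quick sanity check on the count, here is a minimal sketch that tallies the models named in the list above and compares the total against the predicted 7-10 range. The variable names are my own, and the list is copied from this post rather than pulled from Epoch's dataset, so treat it as illustrative only.

```python
# Models named in the list above (copied from this post, not fetched from Epoch).
models_2024 = [
    "Inflection-2", "GLM-4", "Mistral Large", "Aramco Metabrain AI",
    "Inflection-2.5", "Nemotron-4 340B", "Mistral Large 2", "GLM-4-Plus",
    "Doubao-pro", "Llama 3.1-405B", "grok-2", "Claude 3 Opus",
    "Claude 3.5 Sonnet", "Gemini 1.0 Ultra", "Gemini 1.5 Pro",
    "Gemini 2.0 Pro", "GPT 4o", "o1-mini", "o1", "o3",
]

# The range stated in the prediction.
predicted_low, predicted_high = 7, 10

count = len(models_2024)
within_range = predicted_low <= count <= predicted_high

print(count, within_range)  # 20 False
```

This only counts the models Epoch lists as confirmed; adding any of the 22 "might be over the threshold" models would push the total further above the range.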
No massive advance (no GPT-5, or disappointing GPT-5)
Inversion: There was a massive advance in frontier model AI in 2024.
Evaluation: Given that 2024 was the year of reasoning models, this really seems very straightforwardly false. We saw the biggest advance since roughly the transformer itself, with huge changes in scaling laws. The o3 evaluations were even released in 2024, in line with the final evaluations, suggesting the model had indeed largely finished training.
This was not a prediction about whether an advance would be deployed to consumer models in 2024. And it is very unambiguously the case that we saw a major advance in AI technologies in 2024. This one should be straightforwardly marked as false.
Price wars
Inversion: AI companies do not have to frequently lower their prices to stay competitive with other players trying to undercut them
Evaluation: Yep, seems like there is a lot of aggressive price competition, though not as much as I was honestly expecting. Both Anthropic and OpenAI have models that cost enormous amounts of money, and being at the frontier means you can charge a huge premium.
This is very much not operationalized enough to really judge it, but I think it’s fine.
Very little moat for anyone
Inversion: Being a leading AI company is a robust position that you will be able to extract large amounts of money from without needing to worry too much about competition
Evaluation: So, I have sympathy for this position, but also, the active players in the AI race have not been changing over the years, which I think would be a strong sign of actual weak moats. In as much as the moats are weak, so far every AI company has succeeded at defending themselves against the competitors.
But that notwithstanding, I would probably resolve this ambiguously, slightly favoring Gary? If someone wanted to argue that “being at a leading lab now is the most important thing because they will probably continue to be in the lead” I would see some pretty compelling arguments for that, which seems like the opposite of this prediction.
No robust solution to hallucinations
Yep, this seems unambiguously correct, no need to argue much.
Modest lasting corporate adoption
Inversion: AI corporate adoption will either be very close to zero, or among the most rapid adoptions of any technology in US history
Evaluation: We are absolutely, with no ambiguity, in the “most rapid adoptions of any technology in US history” branch. Every single corporation in the world is trying to adopt AI into their products. Even extremely slow-moving industries are rushing to adopt AI.
This one unambiguously resolves as false. I honestly have trouble imagining any world where corporate adoption is even faster. Trying to spin this as a positive prediction seems absolutely hilarious. I literally cannot think of a technology for which this could possibly be falsified, if it cannot be falsified for AI.
Modest profits, split 7-10 ways
Inversion: Companies investing heavily in AI will either see very low returns, or very high returns. There will only be either a very small or very large number of players who make up the majority of profit.
Evaluation: Profits are extremely high industry-wide, especially for companies that are not training frontier models themselves, but are providing inference compute and training compute for frontier model companies. Profitability happens to be masked by extremely high returns to investment, meaning that in the leading AI companies, we are not seeing that many payouts to shareholders.
Nvidia is making more profit than the operating costs of Anthropic and OpenAI combined, clearly driven by the AI boom. Maybe you can argue that this is fueled by an investment bubble, but clearly profits are flowing extremely aggressively towards AI companies.
One could try to cherry-pick OpenAI or Anthropic here, who are both investing extremely aggressively and so profit margins appear thin, but when you look at the whole industry, it clearly is making huge amounts of profit.
I think settling this is kind of hard, so I feel hesitant to mark this as a totally unambiguous red mark, but it seems pretty close to me. Anyone who would have listened to Gary for the purpose of investment advice, or for the purpose of estimating profit margins for anyone but the companies who are building the next generation of frontier models and have no issue attracting investment to do that, would have been extremely burned (and of course there the market is very much forecasting high future expected returns, that’s why the valuations of those companies are so high).
Ok, so where does this leave us?
7-10 GPT-4 level models: False, since we have many more than 10 GPT-4 level models, by more than 10 distinct companies. But IDK, it’s not like it’s off by many factors (but false as stated). Let’s say a 0.35 on a 0-1 scale to indicate that it is more false than right, but was pointing at something real.
No massive advance: False, unless I am missing some definition that makes this true. o3 was fully finished training and was evaluated by end of 2024.
Price wars: Seems true enough.
Very little moat: IDK, I would resolve ambiguous, though slanted towards true. Let’s say 0.8 or so on a 0-1 scale.
No robust solution to hallucinations: True, unambiguously.
Modest lasting corporate adoption: Extremely false
Modest profits, split 7-10 ways: False, though there are some AI companies that are choosing to re-invest. But we are seeing enough nearby companies make enormous amounts of money (like Nvidia) that this is clearly falsified. Maybe one could give it a 0.1 since the leading labs are still spending more than they are making. Profits are also highly concentrated in a smaller number than 7 players (it’s basically just OpenAI, Anthropic and Google at the model level, and Nvidia at the hardware level, I am not seeing 3-4 other players).
This overall gives me 3.25/7, as a best guess of what a fair evaluation of this specific set of predictions would arrive at. I think one could quibble over 1-2 points, so I think arguing a ⁴⁄₇, or maybe even a ⁵⁄₇ wouldn’t be completely crazy, though I think the latter would be seriously stretching things.
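For anyone who wants to check the tally, here is a minimal sketch of the arithmetic. It assumes the verdicts without an explicit number map to 1 for “true” and 0 for “false”; that mapping is my reading of the summary above, and the dictionary and variable names are just illustrative.

```python
from fractions import Fraction

# Per-prediction scores on a 0-1 scale. Explicit numbers (0.35, 0.8, 0.1) come
# from the summary above; the 0s and 1s are an assumed mapping of
# "false" / "true" verdicts that were not given a number.
scores = {
    "7-10 GPT-4 level models": Fraction("0.35"),
    "No massive advance": Fraction(0),
    "Price wars": Fraction(1),
    "Very little moat": Fraction("0.8"),
    "No robust solution to hallucinations": Fraction(1),
    "Modest lasting corporate adoption": Fraction(0),
    "Modest profits, split 7-10 ways": Fraction("0.1"),
}

# Exact rational arithmetic avoids floating-point rounding noise in the sum.
total = sum(scores.values())

print(f"{float(total):.2f}/{len(scores)}")  # 3.25/7
```

Swapping in someone else's numbers (for example 0.2 instead of 0 on “No massive advance”) shows how far the total moves under a more charitable reading.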
- robo 28 Jun 2025 20:11 UTC
  29 points
  18
  Parent
  Huh, I didn’t expect to take Gary Marcus’s side against yours but I do for almost all of these. If we take your two strongest cases:
  - No massive advance (no GPT-5, or disappointing GPT-5)
    There was no GPT-5 in 2024? And there is still no GPT 5? People were talking in late 2023 like GPT 5 might come out in a few months, and they were wrong. The magic of “everything just gets better with scale” really seemed to slow after GPT-4?
    On reasoning models: I thought of reasoning models happening internally at Anthropic in 2023 and being distilled into public models, which was why Claude was so good at programming. But I could be wrong or have my timelines messed up.
  - Modest lasting corporate adoption
    I’d say this is true? Read e.g. Dwarkesh talking about how he’s pretty AI forward but even he has a lot of trouble getting AIs to do something useful. Many corporations are trying to get AIs to be useful in California, fewer elsewhere, and I’m not convinced these will last.
  I don’t think I really want to argue about these, more I find it weird people can in good faith have such different takes. I remember 2024 as a year I got continuously more bearish on LLM progress^[1].
  1. ^
    Until DeepSeek in late December.
  - habryka 28 Jun 2025 21:13 UTC
    8 points
    −2
    Parent
    There was no GPT-5 in 2024? And there is still no GPT 5? People were talking in late 2023 like GPT 5 might come out in a few months, and they were wrong. The magic of “everything just gets better with scale” really seemed to slow after GPT-4?
    Eh, reasoning models have replaced everything and seem like a bigger deal than GPT-5 to me. Also, I don’t believe you that anyone was talking in late 2023 that GPT-5 was coming out in a few months, that would have been only like 9 months after the release of GPT-4, and the gap between GPT-3 and GPT-4 was almost 3 full years. End of 2024 would have been a quite aggressive prediction even just on reference class forecasting grounds, and IMO still ended up true with the use of o3 (and previously o1, though I think o3 was a big jump on o1 in itself).
    Until DeepSeek in late December.
    I mean, yes, I think the central thing happening in 2024 is the rise of reasoning models. I agree that if we hadn’t seen those, some bearishness would be appropriate, but alas, such did not happen.
    - Thane Ruthenis 29 Jun 2025 0:35 UTC
      36 points
      6
      Parent
      I don’t believe you that anyone was talking in late 2023 that GPT-5 was coming out in a few months
      Out of curiosity, I went to check the prediction markets. Best I’ve found:
      From March 2023 to January 2024, expectations that GPT-5 will come out/be announced in 2023 never rose above 13% and fell to 2-7% in the last three months (one, two, three).
      Based on this series of questions, at the start of 2024, people’s median was September 2024.
      I’d say this mostly confirms your beliefs, yes.
      (Being able to check out the public’s past epistemic states like this is a pretty nifty feature of prediction-market data I haven’t realized before!)
      End of 2024 would have been a quite aggressive prediction even just on reference class forecasting grounds
      76% on “GPT-5 before January 2025” in January 2024, for what it’s worth.
      reasoning models have replaced everything and seem like a bigger deal than GPT-5 to me.
      Ehhh, there are scenarios under which they retroactively turn out not to be a “significant advance” towards AGI. E.g., if it actually proves true that RL training only elicits base models’ capabilities rather than creating them; or if they turn out to scale really poorly; or if their ability to generalize to anything but the most straightforward verifiable domains disappoints^[1].
      And I do expect something from this cluster to come true, which would mean that they’re only marginal/no progress towards AGI.
      That said, I am certainly not confident in this, and they are a nontrivial advance by standard industry metrics (if possibly not by the p(doom) metric). And if we benchmark “a significant advance” as “a GPT-3(.5) to GPT-4 jump”, and then tally up all progress over 2024 from GPT-4 Turbo to Sonnet 3.6 and o1/o3^[2], this is probably a comparable advance.^[3]
      I’d count it as “mostly false”. 0-0.2?
      1. ^
      I don’t think we’ve seen much success there yet? I recall Noam Brown pointing to Deep Research as an example, but I don’t buy that.
      Models have been steadily getting better across the board, but I think it’s just algorithmic progress/data quality + distillation from bigger models, not the reasoning on/off toggle?
      Oh, hm, I guess we can count o3′s lying tendencies as a generalization of its reward-hacking behavior to “soft” domains from math/coding. I am not sure how to count this one, though. I mean, I’d like to make a dunk here, but it does seem to be weak-moderate evidence for the kind of generalization I didn’t want to see.
      2. ^
      Though I’m given to understand the o3 announced at the end of 2024 and the o3 available now are completely different models, see here and here. So we don’t actually know what 2024!o3 “felt” like, beyond the benchmarks; and so assuming that the modern o3′s capability level was already reached by EOY 2024 is unjustified, I think.
      3. ^
      This is the point where I would question whether “GPT-3.5 to GPT-4” was a significant advance towards AGI, and drop a hot take that no it wasn’t. But Gary Marcus’ wording implies that GPT-5 would count as a significant advance by his lights, so whatever.
      - habryka 29 Jun 2025 3:17 UTC
        4 points
        0
        Parent
        This all seems pretty reasonable to me. Agree 0.2 seems like a fine call someone could make on this.
    - johnswentworth 29 Jun 2025 16:11 UTC
      32 points
      11
      Parent
      reasoning models [...] seem like a bigger deal than GPT-5 to me.
      Strong disagree. Reasoning models do not make every other trick work better, the way a better foundation model does. (Also I’m somewhat skeptical that reasoning models are actually importantly better at all; for the sorts of things we’ve tried they seem shit in basically the same ways and to roughly the same extent as non-reasoning models. But not sure how cruxy that is.)
      Qualitatively, my own update from OpenAI releasing o1/o3 was (and still is) “Altman realized he couldn’t get a non-disappointing new base model out by December 2024, so he needed something splashy and distracting to keep the investor money fueling his unsustainable spend. So he decided to release the reasoning models, along with the usual talking points of mostly-bullshit evals improving, and hope nobody notices for a while that reasoning models are just not that big a deal in the long run.”
      Also, I don’t believe you that anyone was talking in late 2023 that GPT-5 was coming out in a few months [...] End of 2024 would have been a quite aggressive prediction even just on reference class forecasting grounds
      When David and I were doing some planning in May 2024, we checked the prediction markets, and at that time the median estimate for GPT5 release was at December 2024.
      - gwern 29 Jun 2025 20:49 UTC
        11 points
        0
        Parent
        
        at that time the median estimate for GPT5 release was at December 2024.
        
        Which was correct ex ante, and mostly correct ex post—that’s when OA had been dropping hints about releasing GPT-4.5, which was clearly supposed to have been GPT-5, and seemingly changed their mind near Dec 2024 and spiked it before it seems like the DeepSeek moment in Jan 2025 unchanged their minds and they released it February 2025. (And GPT-4.5 is indeed a lot better than GPT-4 across the board. Just not a reasoning model or dominant over the o1-series.)
        james oofou 29 Jun 2025 21:03 UTC
        13 points
        11
        Parent
        which was clearly supposed to have been GPT-5
        I have seen people say this many times, but I don’t understand. What makes it so clear?
        GPT-4.5 is roughly a 10x scale-up of GPT-4, right? And full number jumps in GPT have always been ~100x? So GPT-4.5 seems like the natural name for OpenAI to go with.
        I do think it’s clear that OpenAI viewed GPT-4.5 as something of a disappointment, I just haven’t seen anything indicating that they at some point planned to break the naming convention in this way.
        gwern 29 Jun 2025 23:09 UTC
        14 points
        1
        Parent
        
        GPT-4.5 is roughly a 10x scale-up of GPT-4, right? And full number jumps in GPT have always been ~100x? So GPT-4.5 seems like the natural name for OpenAI to go with.
        
        10x is what it was, but it wasn’t what it was supposed to be. That’s just what they finally killed it at, after the innumerable bugs and other issues that they alluded to during the livestream and elsewhere, which is expected given the ‘rocket waiting equation’ for large DL runs—after a certain point, no matter how much you have invested, it’s a sunk cost and you’re better off starting afresh, such as, say, with distilled data from some sort of breakthrough model… (Reading between the lines, I suspect that what would become ‘GPT-4.5’ was one of the several still-unknown projects besides Superalignment which suffered from Sam Altman overpromising compute quotas and gaslighting people about it, leading to an endless deathmarch where they kept thinking ‘we’ll get the compute next month’, and the 10x compute-equivalent comes from a mix of what compute they scraped together from failed runs/iterations and what improvements they could wodge in partway even though that is not as good as doing from scratch, see OA Rerun.)
        Lukas Finnveden 30 Jun 2025 18:29 UTC
        2 points
        0
        Parent
        If GPT-4.5 was supposed to be GPT-5, why would Sam Altman underdeliver on compute for it? Surely GPT-5 would have been a top priority?
        Maybe Sam Altman just hoped to get way more compute in total, and then this failed, and OpenAI simply didn’t have enough compute to meet GPT-5′s demands no matter how high of a priority they made it? If so, I would have thought that’s a pretty different story from the situation with superalignment (where my impression was that the complaint was “OpenAI prioritized this too little” rather than “OpenAI overestimated the total compute it would have available, and this was one of many projects that suffered”).
        gwern 30 Jun 2025 21:21 UTC
        2 points
        0
        Parent
        
        If GPT-4.5 was supposed to be GPT-5, why would Sam Altman underdeliver on compute for it? Surely GPT-5 would have been a top priority?
        
        If it’s not obvious at this point why, I would prefer to not go into it here in a shallow superficial way, and refer you to the OA coup discussions.
  - Lech Mazur 29 Jun 2025 9:38 UTC
    1 point
    0
    Parent
    
    There was no GPT-5 in 2024?
    
    There was o1-pro in 2024 (December). It might be argued that this came with caveats due to its slowness and high cost but the difference in science questions (GPQA diamond), math (AIME 2024), competition code (Codeforces) compared to GPT-4 Turbo available at the time of his post was huge. The API wasn’t available in 2024 so we didn’t get any benchmarks besides these from OpenAI. In 2025, I tested o1-pro on NYT Connections and it also improved greatly 1, 2. I would probably also consider regular o1 a massive advancement. I don’t think the naming is what matters.
    
    Many corporations are trying to get AIs to be useful in California, fewer elsewhere, and I’m not convinced these will last.
    
    Lately, I’ve been searching for potential shorting opportunities in the stock market among companies likely to suffer from AI-first competition. But it’s been tougher than I expected, as nearly every company fitting this description emphasizes their own AI products and AI transformations. Of course, for many of these companies, adapting won’t be quite that easy, but the commitment is clearly there.
    
    The data appears to support this:
    
    “Adoption is deepening, too: The average number of use cases in production doubled between October 2023 and December 2024”—Bain Brief—Survey: Generative AI’s Uptake Is Unprecedented Despite Roadblocks.
    
    “On the firm side, the Chamber of Commerce recorded a 73 percent annualized growth rate between 2023 and 2024. The Census BTOS survey shows a 78.4 percent annualized growth rate. Lastly, the American Bar Association reported a 38 percent annualized growth rate. Among individual-level surveys, Pew is the only source showing changes over time, with an annualized growth rate of 145 percent. Overall, these findings suggest that regardless of measurement differences in the levels, adoption is rising very rapidly both at the individual and firm-level.”—Measuring AI Uptake in the Workplace.
- johnswentworth 29 Jun 2025 16:15 UTC
  28 points
  10
  Parent
  We are absolutely, with no ambiguity, in the “most rapid adoptions of any technology in US history” branch. Every single corporation in the world is trying to adopt AI into their products.
  Disagree with your judgement on this one. Agree that everyone is trying to adopt AI into their products, but that’s extremely and importantly different from actual successful adoption. It’s especially importantly different because part of the core value proposition of general AI is that you’re not supposed to need to retool the environment around it in order to use it.
  - habryka 30 Jun 2025 20:56 UTC
    2 points
    0
    Parent
    Agree that the jury is still out on how lasting the adoption will be, but it’s definitely not “modest” (as I mentioned in another thread, it’s plausible to me Gary meant “modest-lasting” adoption instead of “modest and lasting adoption”, i.e. the modest is just modifying the “lasting”, not the “adoption”, which was the interpretation I had. I would still take the under on that, but agree it’s less clear cut and would require a different analysis.)
- TsviBT 28 Jun 2025 21:24 UTC
  24 points
  13
  Parent
  Echoing robo’s comment:
  
  Modest lasting corporate adoption
  
  Has there been such adoption? Your remark
  
  Every single corporation in the world is trying to adopt AI into their products. Even extremely slow-moving industries are rushing to adopt AI.
  
  is about attempts to adopt, not lasting adoption. Of course, we can’t make “lasting adoption” mean “adopted for 5 years” if we’re trying to evaluate the prediction right now. But are you saying that there’s lots of adoption that seems probably/plausibly lasting, just by eyeballing it? My vague impression is no, but I’m curious if the answer is yes or somewhat.
  
  (TBC I don’t have a particularly strong prediction or retrodiction about adoption of AI in general in industry, or LLMs specifically (which is what I think Marcus’s predictions are about). At a guess I’d expect robotics to continue steadily rising in applications; I’d expect LLM use in lots of “grunt information work” contexts; and some niche strong applications like language learning; but not sure what else to expect.)
  - habryka 29 Jun 2025 3:15 UTC
    3 points
    0
    Parent
    I interpreted the statement to mean, “modest, lasting, adoption”. I.e. we will see modest adoption, which will be lasting. It’s plausible Gary meant “modest-lasting adoption” in which case I think there is a better case to be made!
    I still think that case is weak, but of course it’s very hard to evaluate at the end of a year, because how do we know if the adoption is lasting. It seems fine to evaluate that in a year or two and see whether the adoptions that happened in 2024 were lasting. I don’t see any way to call that interpretation already, at least given the current state of evidence.
    - TsviBT 29 Jun 2025 3:54 UTC
      7 points
      8
      Parent
      Well, like, if a company tried out some new robotics thing in one warehouse at a small scale in Q1, then in Q2 and Q3 scaled it up to most of that warehouse, and then in Q4 started work applying the same thing in another warehouse, and announced plans to apply to many warehouses, I think it’d be pretty fair to call this lasting adoption (of robotics, not LLMs, unless the robots use LLMs). On the other hand if they were stuck at the “small scale work trying to make a maybe-scalable PoC”, that doesn’t seem like lasting adoption, yet.
      
      Judging this sort of thing would be a whole bunch of work, but it seems possible to do. (Of course, we can just wait.)
      - habryka 30 Jun 2025 20:57 UTC
        2 points
        0
        Parent
        Agree, though I think, in the world we are in, we don’t happen to have that kind of convenient measurement, or at least not unambiguous ones. I might be wrong, people have come up with clever methodologies to measure things like this in the past that compelled me, but I don’t have an obvious dataset or context in mind where you could get a good answer (but also, to be clear, I haven’t thought that much about it).
- Gary Marcus 29 Jun 2025 16:54 UTC
  17 points
  −15
  Parent
  This is an incredibly uncharitable read, biased and redolent of motivated reasoning.
  - If you applied the same to almost any other set of predictions I think you could nitpick those too. It also lacks context (e.g. yes, 7-10 was an underestimate, but at a time when there were like two, and people were surprised that I said such models would become widespread). Even @robo here sees that you have been uncharitable.
  - The one that annoys me the most and makes me not even want to talk about the rest is re GPT-5. Practically everybody thought GPT-5 was imminent; I went out on a limb and said it would not be. I used it as an explicit specific yardstick (which I should be credited for) and that explicit yardstick was not met. Yet you are giving me zero credit.
  - You are just wrong about profits, inventing your own definition. Re adoption, many corporations have TRIED, but proofs of concept are not adoption. There have been loads of articles and surveys written about companies trying stuff out and not getting the ROI they expected.
  I would be happy to respond in more detail to a serious, balanced investigation that evaluated my predictions over time, going back to my 1998 article on distribution shift, but this ain’t it.
  - ryan_greenblatt 2 Jul 2025 23:07 UTC
    12 points
    8
    Parent
    One lesson you should maybe take away is that if you want your predictions to be robust to different interpretations (including interpretations that you think are uncharitable), it could be worthwhile to try to make them more precise (in the case of a tweet, this could be in a linked blog post which explains in more detail). E.g., in the case of “No massive advance (no GPT-5, or disappointing GPT-5)” you could have said “Within 2024 no AI system will be publicly released which is as much of a qualitative advance over GPT-4 in broad capabilities as GPT-4 is over GPT-3 and where this increase in capabilities appears to be due to scale up in LLM pretraining”. This prediction would have been relatively clearly correct (though I think also relatively uncontroversial, at least among people I know, as we probably should only have expected to get to ~GPT-4.65 in terms of compute scaling and algorithmic progress by the end of 2024). You could try to operationalize this further in terms of benchmarks or downstream tasks.
    
    To the extent that you can make predictions in terms of concrete numbers or metrics (which is not always possible to be clear), this avoids ~any issues due to interpretation. You could also make predictions about metaculus questions when applicable as these also have relatively solid and well understood resolution criteria.
  - yams 2 Jul 2025 6:22 UTC
    10 points
    3
    Parent
    I think Oliver put in a great effort here, and that the two of you have very different information environments, which results in him reading your points (which are underspecified relative to, e.g., Daniel Kokotajlo’s predictions) differently than you may have intended them.
    For instance, as someone in a similar environment to Habryka, that there would soon be dozens of GPT-4 level models around was a common belief by mid-2023, based on estimates of the compute used and Nvidia’s manufacturing projections. In your information environment, your 7-10 number looks ambitious, and you want credit for guessing way higher than other people you talked to (and you should in fact demand credit from those who guessed lower!). In our information environment, 7-10 looks conservative. You were directionally correct compared to your peers, but less correct than people I was talking to at the time (and in fact incorrect, since you gave both a lower and upper bound—you’d have just won the points from Oli on that one if you said ‘7+’ and not ‘7-10’).
    I’m not trying to turn the screw; I think it’s awesome that you’re around here now, and I want to introduce an alternative hypothesis to ‘Oliver is being uncharitable and doing motivated reasoning.’
    Oliver’s detailed breakdown above looks, to me, like an olive branch more than anything (I’m pretty surprised he did it!), and I wish I knew how best to encourage you to see it that way.
    I think it would be cool for you and someone in Habryka’s reference class to quickly come up with predictions for mid-2026, and drill down on any perceived ambiguities, to increase your confidence in another review to be conducted in the near-ish future. There’s something to be gained from us all learning how best to talk to each other.
  - Kaj_Sotala 5 Jul 2025 11:16 UTC
    2 points
    0
    Parent
    I feel the issue with your GPT-5 prediction is that it specifies both “no massive advance” and “no GPT-5”. When there was a massive advance but no GPT-5, it makes it ambiguous which half of the prediction is more important.
    It’s slightly weird to have the correctness of it depend on OpenAI’s branding choices, though. If we decided that the GPT part of the prediction was more important, then in an alternative world that was otherwise identical to our own but where OAI had chosen to call one of their reasoning models GPT-5, the prediction would flip from false to correct. So that makes me lean a bit toward weighting the “no massive advance” part more, though I also wouldn’t think it unreasonable to split the difference and give you half credit for having one part of a two-part prediction correct.
  - tslarm 1 Jul 2025 19:09 UTC
    2 points
    0
    Parent
    I agree with your point about profits; it seems pretty clear that you were not referring to money made by the people selling the shovels.
    But I don’t see the substance in your first two points:
    You chose to give a range with both a lower and an upper bound; the success of the prediction was evaluated accordingly. I don’t see what you have to complain about here.
    In the linked tweet, you didn’t go out on a limb and say GPT-5 wasn’t imminent! You said it either was not imminent or would be disappointing. And you said this in a parenthetical to the claim “No massive advance”. Clearly the success of the prediction “No massive advance (no GPT-5, or disappointing GPT-5)” does not depend solely on the nonexistence of GPT-5; it can be true if GPT-5 arrives but is bad, and it can be false if GPT-5 doesn’t arrive but another “massive advance” does. (If you meant it only to apply to GPT-5, you surely would have just said that: “No GPT-5 or disappointing GPT-5.”)
    Regarding adoption, surely that deserves some fleshing out? Your original prediction was not “corporate adoption has disappointing ROI”; it was “Modest lasting corporate adoption”. The word “lasting” makes this tricky to evaluate, but it’s far from obvious that your prediction was correct.
- MichaelDickens 28 Jun 2025 19:03 UTC
  2 points
  0
  Parent
  I thought price wars was false, although I haven’t been paying that much attention to companies’ pricing. GPT was $20/month in 2023 and it’s still $20/month. IIRC Gemini/Claude were available in 2023 but they only had free tiers so I don’t know how to judge them.
  - gwern 28 Jun 2025 20:10 UTC
    25 points
    10
    Parent
    
    GPT was $20/month in 2023 and it’s still $20/month.
    
    Those are buying wildly different things. (They are not even comparable in terms of real dollars. That’s like a 10% difference, solely from inflation!)
- wunan 28 Jun 2025 23:16 UTC
  1 point
  1
  Parent
  No massive advance (no GPT-5, or disappointing GPT-5)
  Inversion: There was a substantial advance in frontier model AI in 2024.
  Shouldn’t the inversion simply be “There was a massive advance”?
  - habryka 29 Jun 2025 3:19 UTC
    4 points
    0
    Parent
    Sure, edited.
habryka 3 Dec 2025 4:54 UTC
74 points
0
Lightcone is doing another fundraiser this year^[1]! I am still working on our big fundraising post, but figured I would throw up something quick in case people are thinking about their charitable giving today.
Short summary of our funding situation: We are fundraising for $2M this year. Most of that goes into LessWrong and adjacent projects. Lighthaven got pretty close to breaking even this year (though isn’t fully there). We also worked on AI 2027, which of course had a lot of effects. We do kind of have to raise around this much if we don’t want to shut down, since most of our expenses are fixed costs (my guess is the absolute minimum we could handle is something like $1.4M).
Donors above $2,000 can continue to get things at Lighthaven dedicated to them.
Donate here: https://www.every.org/lightcone-infrastructure
(I also just added a fundraising banner. I expect that to be temporary as I don’t generally like having ad-like content on post pages, but we happen to be getting an enormous amount of incoming traffic to Claude 4.5 Opus’ Soul Document, and so a banner seemed particularly valuable for a day or two.)
1. ^
  Last year’s fundraiser: https://www.lesswrong.com/posts/5n2ZQcbc7r4R8mvqc/the-lightcone-is-nothing-without-its-people
- Søren Elverlin 3 Dec 2025 17:49 UTC
  7 points
  0
  Parent
  I just donated $2,718 toward turning log-odds into logistics at Lighthaven.
  
  Keep up the good work.
  - habryka 3 Dec 2025 18:27 UTC
    4 points
    0
    Parent
    Thank you!
- Michaël Trazzi 3 Dec 2025 20:38 UTC
  6 points
  0
  Parent
  Any updates on the 2025 numbers for Lighthaven? (cf. this table from last year’s fundraiser)
  - habryka 3 Dec 2025 21:02 UTC
    11 points
    0
    Parent
    I am working on the final financials for it as part of the fundraising post! We ended up spending more like $3M instead of $2.6M, but a lot of it is on property tax which we are applying for an exemption for and will get back if the application succeeds, which makes the accounting a bit confusing.
    My current guess is Lighthaven will have ended up netting around -$500k if you include the property tax, and -$200k if you exclude the property tax. But these numbers are quite provisional and could easily change by 50% either way until I’ve done the full math.
habryka 1 Apr 2024 20:06 UTC
72 points
3
Welp, I guess my life is comic sans today. The EA Forum snuck some code into our deployment bundle for my account in particular, lol: https://github.com/ForumMagnum/ForumMagnum/pull/9042/commits/ad99a147824584ea64b5a1d0f01e3f2aa728f83a
- Ben Pace 1 Apr 2024 23:12 UTC
  17 points
  0
  Parent
  Screenshot for posterity.
- jp 3 Apr 2024 14:17 UTC
  11 points
  5
  Parent
  🙇‍♂️
  - habryka 3 Apr 2024 20:00 UTC
    2 points
    0
    Parent
    😡
- habryka 3 Apr 2024 5:00 UTC
  5 points
  0
  Parent
  And finally, I am freed from this curse.
- winstonBosan 1 Apr 2024 20:40 UTC
  2 points
  1
  Parent
  I hope the partial unveiling of your user_id hash will not doom us all, somehow.
  - habryka 1 Apr 2024 20:44 UTC
    2 points
    1
    Parent
    You can just get people’s userIds via the API, so it’s nothing private.
habryka 21 Jun 2025 21:46 UTC
71 points
35
Ok, many of y’all can have feelings about whether it’s a good idea to promote Nate’s and Eliezer’s book on the LW frontpage the way we are doing it, but can I get some acknowledgement that the design looks really dope?
Look at those nice semi-transparent post-items. Look at that nice sunset gradient that slowly fades to black. Look at the stars fading out in an animation that is subtle enough that you can (hopefully) ignore it as you scroll down and parse the frontpage, but still creates an airy ominous beauty to life snuffing out across the universe.
Well, I am proud of it :P^[1]
I hope I can do more cool things with the frontpage for other things in the future. I’ve long been wanting to do things with the LW frontpage that create a more aesthetic experience that capture the essence of some important essay or piece of content I want to draw attention to, and I feel like this one worked quite well.
I’ll probably experiment more with some similar things in the future (though I will generally avoid changing the color scheme this drastically unless there is some good reason, and make sure people can easily turn it off from the start).
1. ^
  (Also credit to Ray who did the initial pass of porting over the design from ifanyonebuildsit.com)
- Michaël Trazzi 22 Jun 2025 11:16 UTC
  9 points
  4
  Parent
  I like the design, and think it was worth doing. Regarding making sure “people can easily turn it off from the start” next time, I wanted to offer the datapoint that it took me quite a while to notice the disable button. (It’s black on black, and quite at the edge of the screen, especially if you’re using a horizontal monitor).
- Aprillion 22 Jun 2025 12:51 UTC
  5 points
  0
  Parent
  a) Thanks for this post, I would never have noticed that the design was intended to be quite nice … and I would completely miss the “Earth with ominous red glow” on https://ifanyonebuildsit.com without reading this 😱
  b) I bet you didn’t admire the beautiful design when trying to enjoy your morning coffee with the sun behind your back on a 10yo 4K monitor after a Win11 laptop refused to recognize that it supports 10bpc so it only uses 8bpc, in Firefox (that still doesn’t support gradient dithering), after some jerk forced dark mode on your favorite morning-news-before-work site that usually respects your OS settings (light mode during the day, dark before sleep) and they also removed the usual menu option to toggle it back, so you had to lean around the reflections to spot the subtle X button...
  I can only give you my word that the terrible purple hues from my reading-warm color settings looked nothing like a sunset (and nothing like the intended indigo if I can judge the color from how it looks today on an iPad and/or Android, though they both only have sRGB displays and not showing real indigo hues like a rainbow or summer-night sky), and that I didn’t even notice the subtle scroll animations because I didn’t scroll anything...
  ...but feel free to judge the gradient banding for yourself:
  Hopefully this comment is a useful data point when deciding to “do more cool things with the frontpage for other things in the future” ;)
  - habryka 22 Jun 2025 16:44 UTC
    3 points
    0
    Parent
    I will be honest… that banding looks fine to me. I agree it’s a bit worse, but I don’t think it’s like, destroying the design or anything.
    and they also removed the usual menu option to toggle it back, so you had to lean around the reflections to spot the subtle X button...
    To be clear, we didn’t remove the menu to toggle it back. In the same place where the usual menu is it just has an off-switch:
    (This did go live like 24 hours or so after the banner was put up, so there was a period where you couldn’t turn it off, which was unfortunate)
- Thane Ruthenis 21 Jun 2025 22:34 UTC
  4 points
  8
  Parent
  Ok, many of y’all can have feelings about whether it’s a good idea to promote Nate’s and Eliezer’s book on the LW frontpage the way we are doing it, but can I get some acknowledgement that the design looks really dope?
  The design is dope. I loved the lightcone-eating effect the moment I saw the original version on the book’s website.
- Lucius Bushnaq 24 Jun 2025 7:08 UTC
  3 points
  1
  Parent
  I am glad that you are proud of it and I feel kind of bad saying this, but the reason I had mixed feelings about the promotion is that I just really don’t like the design. I find it visually exhausting to look at. Until you added the option to disable the theme, I was just avoiding the LW front page. I don’t like the design of https://ifanyonebuildsit.com/ either.
- ChristianKl 23 Jun 2025 10:41 UTC
  2 points
  0
  Parent
  The disable special theme button should be bigger, I didn’t see it at first and it was not obvious to me that I could just switch it off with one click.
- Ruby 23 Jun 2025 0:35 UTC
  2 points
  2
  Parent
  It really does look dope
- dirk 22 Jun 2025 2:59 UTC
  2 points
  2
  Parent
  You sure can! I considered posting a shortform to that effect when I first saw it; love the colors and the night-sky effect. It’s nice to have a bit of eye candy on the frontpage :)
- Saul Munn 23 Jun 2025 5:19 UTC
  1 point
  0
  Parent
  yeah, the design is super duper dope. the red fading out the white is a nice touch.
- nowl 23 Jun 2025 0:17 UTC
  1 point
  0
  Parent
  I like the new design a lot, I’d like if I could use it as my default LW theme. I felt this way about the Ghibli one too.
  (This one can also function as a silly reminder not to habitually scroll so far)
- Anders Lindström 22 Jun 2025 16:08 UTC
  −1 points
  −2
  Parent
  The book will probably be a treat to read, but since “someone” (Open AI, Anthropic, Google, META et al.) apparently WILL build “it” everyone WILL die. Oh well, we had a good run I guess.
  
  Design of the frontpage: No. I am a fan of the white clean design with the occasional AI generated image. The current iteration reduces readability a lot.
habryka 26 Dec 2025 22:43 UTC
67 points
16
We have reached the $1,000,000 mark in our fundraiser! Thank you all so much!
habryka 2 Apr 2025 19:15 UTC
65 points
0
Context: LessWrong has been acquired by EA
Goodbye EA. I am sorry we messed up.
EA has decided to not go ahead with their acquisition of LessWrong.
Just before midnight last night, the Lightcone Infrastructure board presented me with information suggesting at least one of our external software contractors has not been consistently candid with the board and me. Today I have learned EA has fully pulled out of the deal.
As soon as EA had sent over their first truckload of cash, we used that money to hire a set of external software contractors, vetted by the most agentic and advanced resume review AI system that we could hack together.
We also used it to launch the biggest prize the rationality community has seen, a true search for the kwisatz haderach of rationality. $1M for the first person to master all twelve virtues.
Unfortunately, it appears that one of the software contractors we hired inserted a backdoor into our code, preventing anyone except themselves and participants excluded from receiving the prize money from collecting the final virtue, “The void”. Some participants even saw themselves winning this virtue, but the backdoor prevented them mastering this final and most crucial rationality virtue at the last possible second.
They then created an alternative account, using their backdoor to master all twelve virtues in seconds. As soon as our fully automated prize systems sent over the money, they cut off all contact.
Right after EA learned of this development, they pulled out of the deal. We immediately removed all code written by the software contractor in question from our codebase. They were honestly extremely productive, and it will probably take us years to make up for this loss. We will also be rolling back any karma changes and resetting the vote strength of all votes cast in the last 24 hours, since while we are confident that if our system had worked our karma system would have been greatly improved, the risk of further backdoors and hidden accounts is too big.
We will not be refunding the enormous boatloads of cash^[1] we were making in the sale of Picolightcones, as I am assuming you all read our sale agreement carefully^[2], but do reach out and ask us for a refund if you want.
Thank you all for a great 24 hours though. It was nice while it lasted.
1. ^
  $280! Especially great thanks to the great whale who spent a whole $25.
2. ^
  IMPORTANT Purchasing microtransactions from Lightcone Infrastructure is a high-risk indulgence. It would be wise to view any such purchase from Lightcone Infrastructure in the spirit of a donation, with the understanding that it may be difficult to know what role custom LessWrong themes will play in a post-AGI world. LIGHTCONE PROVIDES ABSOLUTELY NO LONG-TERM GUARANTEES THAT ANY SERVICES SUCH RENDERED WILL LAST LONGER THAN 24 HOURS.
What links here?
- LessWrong has been acquired by EA by habryka (1 Apr 2025 13:09 UTC; 366 points)
habryka 21 Jun 2025 18:22 UTC
63 points
23
Can a reasonable Wikipedia editor take a stab at editing the “Rationalist Community” Wikipedia page into something normal? It appears to be edited by the usual RationalWiki crowd, who have previously been banned from editing Wikipedia articles in the space due to insane levels of bias.
I don’t want to edit myself because of COI, but I am sure there are many people out there who can do a reasonable job. The page currently says inane things like:
Rationalists are concerned with applying Bayesian inference to understand the world as it really is, avoiding cognitive biases, emotionality, or political correctness.
or:
The movement connected to the founder culture of Silicon Valley and its faith in the power of intelligent capitalists and technocrats to create widespread prosperity.[8][9]
Or completely inane things like:
Though this attitude is based on “the view that vile ideas should be countenanced and refuted rather than left to accrue the status of forbidden knowledge”,[19] rationalists also hold the view that other ideas, referred to as information hazards, are dangerous and should be suppressed.[20] Roko’s Basilisk and the writings of Ziz LaSota are commonly cited information hazards among rationalists.[17]
It’s obviously not an article that’s up to Wikipedia’s standards.
If you want some context on the history of the editors in the space: https://www.tracingwoodgrains.com/p/reliable-sources-how-wikipedia-admin
The LessWrong article used to be similarly horrendous, but was eventually transformed into something kind of reasonable (though still not great). Looking through the archived talk pages for that should give a good sense of what kind of policies apply, as well as a bunch of good sources.
- Viliam 22 Jun 2025 22:54 UTC
  8 points
  3
  Parent
  I wonder what is the optimal reaction to situations like that. My first idea is to collectively prepare a response at Less Wrong, which could then be posted on the article talk page. The response would be relatively brief, list the factual errors, and optionally propose suggestions along with references.
  Collectively, because it will be easier for Wikipedia editors to engage with one summary input from our community, rather than several people making partial comments independently. Also, because the quality of the response could be higher if e.g. someone notices an error, someone else finds a reference supporting the complaint, and maybe another person helps to make the entire argument more compatible with the Wikipedia rules.
  Also, someone may be wrong about something, or something can be ambiguous. Like, I keep wondering about the statement that the rationality community formed around LW and SSC. LW, sure. But Scott was posting on LW since 2009, and when he started SSC in 2013, I would say the rationality community had already been formed, albeit much smaller than it is now. SSC as a separate blog actually attracted non-rationalist audience to Scott’s writing, and Scott often posted there things that wouldn’t fit on LW back then, such as jokes and fiction. And even today, I think that only a minority of ACX readers identifies as aspiring rationalists. More often, they make fun of rationalists.
  It took me a while to figure out that “common interests include statistics” probably refers to Bayesianism. At least I think so. Isn’t it weird that I am not sure about one of our most important common interests?
  “CFAR teaches courses based on HPMOR”; I think the causality is probably in the opposite direction.
  “rationalists also hold the view that [...] information hazards, are dangerous and should be suppressed. [...] the writings of Ziz LaSota are commonly cited information hazards among rationalists”; when you put it together like this, that strongly suggests that rationalists believe that Ziz’s blog should be suppressed, but I have never heard such proposal.
  I find it interesting that post-rationalists are described as people who perceive the rationality community as cult-like, when my impression was that original objection was about the community not paying sufficient respect to ancient wisdom, especially religion.
  (Let me guess, this is going to be linked from Wikipedia as “Viliam proposes brigading, be very careful and during the next 100 days revert all changes to the page, and make sure to lock the talk page”.)
  EDIT: I am curious how Wikipedia editors decide who is and who isn’t a member of the rationalist community. For example, Zizians are referred to as rationalists (not “ex-rationalists”), so… once a rationalist, always a rationalist?
  - ProgramCrafter 23 Jun 2025 0:22 UTC
    3 points
    0
    Parent
    EDIT: I am curious how Wikipedia editors decide who is and who isn’t a member of the rationalist community. For example, Zizians are referred to as rationalists (not “ex-rationalists”), so… once a rationalist, always a rationalist?
    I see a subtle distinction there, between “a member of the rationalist community” and “a rationalist”.
    I would say the latter is “someone who has thinking strategies and acting strategies that enable them to have more beneficial and complex things”—or, for more verifiability, “someone whose thinking&acting strategies are worth copying”. Using the second definition, I would not claim nor disclaim being a rationalist because my strategies are mostly [native code] which cannot be copied so easily. In any case, it is not possible to disavow someone being a rationalist because that statement is mostly about them.
    The former, “a member of the rationalist community”, is essentially “someone who keeps in contact with the specific community, exchanges ideas, favors and so on”. That is possible to “excommunicate”.
    - Viliam 23 Jun 2025 13:16 UTC
      3 points
      4
      Parent
      Even more distinctions are possible...
      Thinking/acting style: “mainstream rationality” or “x-rationality”.
      Social behavior: ignores the LW community entirely, reads the website, posts on the website, attends meetups, meets other rationalists even outside meetups, lives in a group house.
      Identity: identifies as a “rationalist”, or just “someone who hangs out with rationalists, but is not one of them”.
      And even this is not clear. Using Zizians as an example, they are clearly inspired by some memes in the LW community, but they also clearly reject some other memes (such as ethical injunctions); are they “x-rationalists” by thinking style? They used to live in the Bay Area and recruit among the rationalists, but they also protested against MIRI and CFAR; were they members of the community at that moment? No idea whether they identified as “rationalists” or whatever else.
      The Zizians are a small, renegade, spin-off group with an ideological emphasis on veganism and anarchism, which became well known in 2025 for being suspected of involvement in four murders. The Zizians originally formed around the Bay Area rationalist community, but became disillusioned with other rationalist organizations and leaders. Among the Zizians’ accusations against them were anti-transgender discrimination, misuse of donor funds to pay off a sexual misconduct accuser, and not valuing animal welfare in plans for human-friendly AI.
      I am actually quite okay with this. It mentions the important things: “spin-off group” (i.e. their membership is history), “veganism and anarchism” (their motivations other than rationalism). The only way I can imagine it better from my perspective would be to add more years to make it clear that their participation in the community was 2014-2019, and the murders 2022-2025 (i.e. no overlap).
      The part I don’t like is the introduction to the “Zizians” article, which starts with:
      The Zizians are an informal group of rationalists with anarchist and vegan beliefs
      With the word “rationalists” pointing to the “Rationalist community” article. You see the rhetorical trick: anarchism and veganism are their beliefs, but rationalists is what they are. The sentence does not claim explicitly that they are members of the community (as opposed to just someone trying to be more rational), but that’s where the hyperlink points at. Also, the present tense.
      This all is a spin; one could equally validly say e.g. “Zizians are an informal group of anarchist vegans who have met each other during the years they spent in the rationality community.”
- Viliam 22 Jun 2025 21:47 UTC
  8 points
  −8
  Parent
  Ctrl+F TESCREAL … of course it is there. It is a thing that doesn’t even exist, but of course Wikipedia mentions it.
  Oliver, the fact that you even mentioned this is considered “canvassing” (a word I didn’t even know existed) and is apparently against the rules of Wikipedia.
  Wikipedia defines canvassing as notifying other editors of ongoing discussions with the intention of influencing the outcome. It is considered inappropriate, because it compromises the normal consensus making process. The proper ways to do that are:
  - talk page or noticeboard of a related WikiProject (e.g. WikiProject Effective Altruism)
  - central place such as Village Pump
  - if you complain about a specific editor, you can do it on their talk page
  Make sure to be polite, neutral and brief.
  Of course this is now used as an excuse to revert any recent attempts to improve the article.
  I guess the lesson is that the next time you complain about what a horrible mess some Wikipedia article is, you must refrain from explicitly suggesting that anyone improve it. It is important to follow
  - habryka 22 Jun 2025 23:24 UTC
    8 points
    2
    Parent
    I don’t think it counts as canvassing in the relevant sense, as I didn’t express any specific opinion on how the article should be edited. I think maybe you could argue I did vote-stacking, but I think the argument is kind of weak.
    
    Tracing Woodgrains’ post did just successfully fix a bunch of articles. I used to be more hesitant about this, but I think de facto you somehow need to draw attention to when an article needs to be improved, and posting publicly about it is more within the spirit of WP:CANVAS than anything else I actually expect to work (Wikipedia editors with more experience on the issue should raise things however they are supposed to on WP, including posting wherever is appropriate on internal WP boards).
  - Garrett Baker 24 Jun 2025 12:15 UTC
    6 points
    2
    Parent
    
    Of course this is now used as an excuse to revert any recent attempts to improve the article.
    
    From reading the relevant talk-page it is pretty clear those arguing against the changes on these bases aren’t exactly doing so in good faith, and if they did not have this bit of ammunition to use they would use something else, but then with fewer detractors (since clearly nobody else followed or cared about that page).
- lesswronguser123 22 Jun 2025 4:25 UTC
  8 points
  2
  Parent
  I remember editing an abrasive sentence on there a few months ago:
  Members of the rationalist community believe only a small number of people, ~~namely~~ including themselves, have the ~~unique abilities~~ knowledge and skill required to reduce the probability of human extinction
  Regardless of the accuracy of this statement,^[1] the previous characterisation was a bit too unhinged, and was conveying a pompous picture of the rationality community.
  The current version of the page seems to have gone even further on the snark. I had this discussion on the Bayesian Conspiracy Discord a few months ago; a bunch of people on there thought this would be hard to improve because Wikipedia only allows reputable sources and the capital-R Rationality community is worse at rhetoric. I am not sure, though: it’s possible there is a plethora of news articles praising the Rationality community in the mainstream media, which I didn’t find in my preliminary searches a few months ago. But general negativity bias, and the fact that I only got exposed to strawmen of the LessWrong-adjacent community when I first found out about it last year (which made me reluctant and paranoid to use this website for months, thanks to @David_Gerard), make me less optimistic about that as a prior.
  1. ^
    I do think only a small number of people may have the knowledge and skills to do AI alignment.
- don't_wanna_be_stupid_any_more 22 Jun 2025 12:35 UTC
  1 point
  0
  Parent
  I recently noticed just how bad LW’s reputation is outside of the community.
  It is like reading the description of an alternative reality LW made of far-right cranks and r/atheism mods.
  Also why does RationalWiki hate LW so much? What is the source of all that animosity?
  - habryka 22 Jun 2025 16:40 UTC
    17 points
    0
    Parent
    Also why does RationalWiki hate LW so much? What is the source of all that animosity?
    David Gerard, one of the founders of RationalWiki, actually used to hang out here a lot. I think a bunch of people gave him negative feedback, he got into a dispute with a moderator, and seems to have walked away with a grudge to try to do everything within his power to destroy LessWrong. See Tracing Woodgrains’ post for a bunch of the history.
  - MichaelDickens 22 Jun 2025 15:57 UTC
    10 points
    2
    Parent
    
    Also why does RationalWiki hate LW so much? What is the source of all that animosity?
    
    I am not too familiar with RationalWiki but my impression is the editors come from a certain mindset where you always disbelieve anything that sounds weird, and LWers talk about a lot of weird stuff, which to them falls in the same bucket as religion / woo / pseudoscience. And I would think they especially dislike people calling themselves “rationalists” when in actuality they’re just doing woo / pseudoscience.
- samuelshadrach 21 Jun 2025 19:57 UTC
  −19 points
  −6
  Parent
  If you’re AI-pilled enough you can also build fact checking and search functionality on top. o3 can see through the lies. I don’t think most of humanity is going to rely on Wikipedia editors for access to ground truth for very long.
  - samuelshadrach 22 Jun 2025 5:01 UTC
    3 points
    0
    Parent
    @habryka I mean readership of Wikipedia is going to go down if someone builds a better website to replace it. Wikipedia + community-notes-like-voting is an example. So you can build this instead.
  - ProgramCrafter 23 Jun 2025 0:50 UTC
    1 point
    0
    Parent
    o3 can see through the lies
    Can it see through the stereotypes too? From what I saw (though I used Grok for this test and that might be a relevant factor), LLMs are nowhere near a guess that LW might discuss parenting, or interior design, and instead devise more and more specific fields to be intersected with rationality.
    Try again, and now guess #39,#40,#41 topics by discussion amount on LessWrong, and now you are allowed to think explicitly of top thirty eight if you wish so.
    ...
    #39: Rationalist Approaches to Understanding and Managing Complex Systems...
    #40: The Ethics and Implications of Quantum Computing...
    #41: Rationality in Interpersonal Relationships and Communication...
    (the bottom message in https://grok.com/share/bGVnYWN5_1a70945a-10a9-4afb-b501-a8c5d4a0652e)
habryka 24 Mar 2024 19:21 UTC
59 points
22
A thing that I’ve been thinking about for a while has been to somehow make LessWrong into something that could give rise to more personal-wikis and wiki-like content. Gwern’s writing has a very different structure and quality to it than the posts on LW, with the key components being that they get updated regularly and serve as more stable references for some concept, as opposed to a post which is usually anchored in a specific point in time.
We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn’t really any place to find them. We could list the wiki pages you created on your profile, but that doesn’t really seem like it would allocate attention to them successfully.
I was thinking about this more recently as Arbital is going through another round of slowly rotting away (its search currently being broken and this being very hard to fix due to annoying Google App Engine restrictions) and thinking about importing all the Arbital content into LessWrong. That might be a natural time to do a final push to enable people to write more wiki-like content on the site.
- gwern 25 Mar 2024 2:24 UTC
  52 points
  11
  Parent
  
  somehow make LessWrong into something that could give rise to more personal-wikis and wiki-like content. Gwern’s writing has a very different structure and quality to it than the posts on LW...We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn’t really any place to find them. We could list the wiki pages you created on your profile, but that doesn’t really seem like it would allocate attention
  
  Multi-version wikis are a hard design problem.
  
  It’s something that people kept trying, when they soured on a regular Wikipedia: “the need for consensus makes it impossible for minority views to get a fair hearing! I’ll go make my own Wikipedia where everyone can have their own version of an entry, so people can see every side! with blackjack & hookers & booze!” And then it becomes a ghost town, just like every other attempt to replace Wikipedia. (And that’s if you’re lucky: if you’re unlucky you turn into Conservapedia or Rational Wiki.) I’m not aware of any cases of ‘non-consensus’ wikis that really succeed—it seems that usually, there’s so little editor activity to go around that having parallel versions winds up producing a whole lotta nothing, and the sub-versions are useless and soon abandoned by the original editor, and then the wiki as a whole fails. (See also: Arbital.) Successful wikis all generally seem to follow the Wikipedia approach of a centralized consensus wiki curated & largely edited by an oligarchy; for example, Wikia fandom wikis will be dominated by a few super-fans, or the Reddit wikis attached to each subreddit will be edited by the subreddit moderators.
  
  (There is also an older model like the original Ward’s Wiki* or the Emacs Wiki where pages might be carefully written & curated, but might also just be dumps of any text anyone felt like including, include editors chatting back and forth, comments appended to excerpts, and ad hoc separators like horizontal rules to denote a change of topic; this sorta worked, but all of them seem to be extremely obscure/small and the approach has faded out almost completely.)
  
  And for LW2, a lot of contributions are intrinsically personally-motivated in a way which seems hard to reconcile with the anonymous-laboring-masses-building-Egyptian-pyramids/Wikipedias. Laboring to create an anonymous consensus account of some idea or concept is not too rewarding. (Recall how few people seriously edit even a titanic success like Wikipedia!) So you have a needle to thread: somehow individualized and personalized, but also somehow community-like...? Is there any site online which manages to hybridize wikis with forums? Not that I know of. (StackOverflow? EN World has “wiki threads” but I dunno how well they work.) It can’t be easy!
  
  Going from ‘fast’ to ‘slow’ is one of, I think, the biggest challenges and dividing lines of Internet community design.
  
  * saturn2 notes a 2010 apenwarr essay describing the transition from the original exuberant freewheeling almost-4chan-esque wiki culture, and how one of his companies made heavy use of an internal wiki for everything (where the more comprehensive it got, the more useful it got), to the ‘Wikipedia deletionist’ culture now assumed to be the default for all wikis; and also the failure mode of an ‘internal’ wiki starving the ‘external’ wiki due to friction.
  
  Still, if one wanted to try, I think a better perspective might be ‘personal gardens’ and hypertext transclusions (as now used heavily on Gwern.net). The goal would be to, in essence, try to piggyback a sort of Roam/Notion/PMwiki personal wiki system onto user comments & a site-wide wiki.
  
  One of the rewarding things about a ‘personal garden’ is being able to easily interconnect things and expand them and see it grow; this is something you don’t really get with either LW2 posts or comments—sure, you can hyperlink them, and you can keep editing them and adding to them, but this is not the way they are most easily done (which is ‘fire and forget’). Each one is ultimately trapped in its original context, date-bound. You don’t get that with the standard wiki either, unless you either operate the entire wiki and it’s a personal wiki, or you have de facto control over the set of articles you care about. You can’t write a really personalized wiki article because someone else could always come by and delete it. (One of the reasons I stopped editing English Wikipedia is my sadness at seeing interesting blockquotes, humorous captions, and amusing details constantly being stripped out by humorless narrow-minded deletionists who apparently feel that about every topic, the less said the better.)
  
  So to hybridize this, I would suggest a multi-version wiki model where on each page, there is a ‘consensus’ entry, which acts exactly like the current wiki does and like one expects a normal wiki to act. Anyone can edit it, edits can be reverted by other editors, statements in it are expected to be ‘consensus’ and not simply push minority POVs without qualification, and so on and so forth.
  
  But below the consensus entry (if it exists), there are $USER entries. All $USER entries are transcluded one by one below the main consensus entry. (This will look familiar to any Gwern.net reader.) They are clearly titled & presented as by that user. (They may be sorted by length, most recently modified, or possibly karma; I think I would avoid showing anyone the karma, however, and solely use it for ranking.) Any user can create their own corresponding user entry for any wiki page, which will be transcluded below the consensus entry, and they are the only ones who can edit it; they can write their own entry, criticize the consensus one, list relevant links, muse about the topic, and so on. (I think ‘subpages’ are a convenient syntax here: I don’t know how the current syntax operates, but let’s say that a LW2 wiki entry on ‘Acausal cooperation’ lives at the URL /wiki/Acausal_cooperation; then if I wrote a user entry for it, it would live at /wiki/Acausal_cooperation/gwern.) These entries can be viewed on their own, and given convenient abbreviated shortcuts (why not $USER/article-name as well?); this makes them dead-easy to remember and type (“Where’s gwern’s wiki-essay on acausal cooperation? oh.”). They can also include a list of backlinks by that user and then by other users.
  
  Diffs can be displayed on user-profiles similar to comments. User-entries can be displayed in a compact table/block on a user-profile: eg Foo · Bar · Baz · ... line-wrapping for a few lines ought to cover even highly-prolific users for a long time. They could also be unified with Shortform/Quick Takes: each user entry is just a comment there. (Which might help with implementing comments, if a user entry is just a transcluded comment tree. This means that if you want to add to or criticize some part of a user entry, well, you just reply to it right then and there!)
  
  Users might, for the most part, just edit the consensus entry. If they have something spicy to say or they think there’s something wrong with it (like arguing over the definition), they’ll choose to add it to their respective user entry to avoid another editor screwing with it or reverting them. In the best-case scenario, this seduces users into regularly updating and expanding consensus wiki entries as they realize some additional piece doesn’t need to go into their personal entry. Or they might want to make a point of periodically ‘promoting’ their personal entries into the consensus entries, if no one objects.
  
  Users are motivated to create user entries as a way of organizing their own comments and articles across long time periods: every time you link a wiki article (whether consensus or user), you create a backlink which makes re-finding it easier. The implicit tags will be used much more than explicit ones. There is no way to ‘tag’ a comment right now; it’s not that easy to add a tag to your own post; it’s not even that easy to navigate the tags; but it would be easy to look up a wiki article and look at the backlinks.
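
  The backlink idea above can be sketched as a small inverted index. This is only an illustration under stated assumptions: the link pattern follows the hypothetical /wiki/<Page>[/<user>] scheme from this comment, and `WIKI_LINK` and `build_backlinks` are invented names, not anything in the LW2 codebase.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Matches hypothetical links of the form /wiki/<Page> or /wiki/<Page>/<user>.
WIKI_LINK = re.compile(r"/wiki/[A-Za-z0-9_]+(?:/[A-Za-z0-9_]+)?")


def build_backlinks(comments: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each linked wiki path to the ids of the comments that link to it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for comment_id, text in comments:
        # A set avoids listing a comment twice if it repeats the same link.
        for target in sorted(set(WIKI_LINK.findall(text))):
            index[target].append(comment_id)
    return dict(index)
```

  Looking up a wiki path in the returned dictionary then lists every comment that mentions it, which is the sense in which a link works as an implicit tag.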
  
  So one could imagine a regular cycle of writing comments which link to key wiki pages, sometimes accumulating into a regular LW2 post, followed by summarization into a wiki page, refactoring into multiple wiki pages as the issue becomes clearer, and starting over with a better understanding and vocabulary as reflected in the new set of pages which can be linked in comments on each topic...
  
  At no point is the user committed to some sort of Big Deal website or Essay, like they think they’re a big shot who’s earned some sort of hero license to write online—“creating your own public wiki? my, that’s quite some self-esteem there; remind me again what your degree was in and where you have tenure?” Like tweeting, it all just happens one little step at a time: link a wiki entry in one comment, another elsewhere, leave a quick little clarification or neat tangent in your user entry in an obscure page, get a little more argumentative with another consensus entry you think is mistaken, go so far as to create a new page on a technical term you think would be useful...
  
  Technically, I think this shouldn’t require too much violence to the existing codebase as a wiki is already implemented & running, LW2 already supports some degree of transclusion (both server & client-side), sub-pages are usually feasible in wikis and just require access controls added on to match user==page-name, and backlinks should already be provided by the wiki and automatic for sub-pages as well. The difficulty is making it seamless and friction-free and intuitive.
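
  A minimal sketch of the sub-page access rule described above, assuming the hypothetical URL scheme from this comment (/wiki/<Page> for the consensus entry, /wiki/<Page>/<user> for a user entry). The names `WikiPath`, `parse_wiki_path` and `can_edit` are made up for illustration and do not come from the LW2 codebase.

```python
from typing import NamedTuple, Optional


class WikiPath(NamedTuple):
    page: str             # e.g. "Acausal_cooperation"
    owner: Optional[str]  # None for the consensus entry, else the user name


def parse_wiki_path(path: str) -> WikiPath:
    """Split a hypothetical /wiki/<Page>[/<user>] URL path into its parts."""
    parts = [p for p in path.split("/") if p]
    if len(parts) not in (2, 3) or parts[0] != "wiki":
        raise ValueError(f"not a wiki path: {path!r}")
    owner = parts[2] if len(parts) == 3 else None
    return WikiPath(page=parts[1], owner=owner)


def can_edit(path: str, user: str) -> bool:
    """Anyone may edit a consensus entry; only the owner may edit a user entry."""
    owner = parse_wiki_path(path).owner
    return owner is None or owner == user
```

  Under these assumptions, any user may edit /wiki/Acausal_cooperation, while /wiki/Acausal_cooperation/gwern is editable only by gwern.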
  - habryka 25 Mar 2024 6:11 UTC
    7 points
    4
    Parent
    So, the key difficulty this feels to me like it’s eliding is the ontology problem. One thing that feels cool about personal wikis is that people come up with their own factorization and ontology for the things they are thinking about. Like, we probably won’t have a consensus article on the exact ways L in Death Note made mistakes, but Gwern.net would be sadder without that kind of content.
    So I think in addition to the above there needs to be a way for users to easily and without friction add a personal article for some concept they care about, and to have a consistent link to it, in a way that doesn’t destroy any of the benefits of the collaborative editing.
    My sense is that collaboratively edited wikis tend to thrive around places where there is a clear ontology, and decay when the ontology is unclear or the domain permits many orthogonal carvings. This is why video game wikis are so common and usually successful: by the nature of their programming they almost always have a clear structure (the developer probably coded an abstraction for “enemies” and “attack patterns” and “levels”, so the wiki can easily mirror and document them).
    It feels to me that anything that wants to somehow build a unification of personal wikis and consensus wikis needs to figure out how to gracefully handle the ontology problem.
    - gwern 25 Mar 2024 14:07 UTC
      12 points
      3
      Parent
      
      One thing that feels cool about personal wikis is that people come up with their own factorization and ontology for the things they are thinking about...So I think in addition to the above there needs to be a way for users to easily and without friction add a personal article for some concept they care about, and to have a consistent link to it, in a way that doesn’t destroy any of the benefits of the collaborative editing.
      
      My proposal already provides a way to easily add a personal article with a consistent link, while preserving the ability to do collaborative editing on ‘public’ articles. Strictly speaking, it’s fine for people to add wiki entries for their own factorization and ontology.
      
      There is no requirement for those to all be ‘official’: there doesn’t have to be a ‘consensus’ entry. Nothing about a /wiki/Acausal_cooperation/gwern user entry requires the /wiki/Acausal_cooperation consensus entry to exist. (Computers are flexible like that.) That just means there’s nothing there at that exact URL, or probably better, it falls back to displaying all sub-pages of user entries like usual. (User entries presumably get some sort of visual styling, in the same way that comments on a post look different from a post, which in addition to the title/author metadata displayed, avoids confusion.)
      
      If, say, TurnTrout wants to create a user entry /wiki/Reward_is_not_the_optimization_target/TurnTrout as a master key to all of his posts & comments and related ones like Nora Belrose’s posts, rather than go for a consensus entry, that’s fine. And if it becomes commonly-accepted jargon and part of the ontology, and so it becomes a hassle that people can’t edit his user entry (rather than leave their own user entry or comments), his user entry can be ‘promoted’ ie. just be copied by someone into a new consensus entry at /wiki/Reward_is_not_the_optimization_target that can then be edited and his user entry left as historical or possibly collapsed/hidden by an admin for readability.
      
      (The growth of the ad hoc user & consensus ontology might be a bit messy and sometimes it might be a good idea to delete/edit user entries by users who are gone or uncooperative or edit their entries to update links, but that’s little different from a regular wiki’s need for admins to do similar maintenance.)
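
      The fallback and ‘promotion’ behaviour described above might look roughly like this. It is a sketch only: the in-memory `Wiki` mapping and the `render` and `promote` functions are hypothetical stand-ins for whatever storage and rendering the real wiki uses.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory store: page title -> {owner -> text}.
# The key None holds the consensus entry, if one exists.
Wiki = Dict[str, Dict[Optional[str], str]]


def render(wiki: Wiki, page: str) -> List[str]:
    """Return the consensus entry (if any) followed by all user entries."""
    entries = wiki.get(page, {})
    blocks = []
    if None in entries:
        blocks.append(f"[consensus] {entries[None]}")
    # With no consensus entry, the page falls back to user entries alone.
    for owner in sorted(o for o in entries if o is not None):
        blocks.append(f"[{owner}] {entries[owner]}")
    return blocks


def promote(wiki: Wiki, page: str, owner: str) -> None:
    """Copy a user entry into a new consensus entry; the user entry is kept."""
    entries = wiki[page]
    if None in entries:
        raise ValueError("consensus entry already exists")
    entries[None] = entries[owner]
```

      Here promotion is a plain copy, so the original user entry stays in place as a historical version; collapsing or hiding it would be a separate admin action.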
      
      Like, we probably won’t have a consensus article on the exact ways L in Death Note made mistakes, but Gwern.net would be sadder without that kind of content.
      
      The DN essay mostly would not make sense as a wiki entry, and indeed, it’s been ‘done’ ever since 2013 or so. There’s not much more to be said about the topic (despite occasional attempts at criticism, which typically just wind up repeating something I already said in it). It doesn’t need wiki support or treatment, and it was a post-like essay: I wrote it up as a single definitive piece and it was discussed at a particular time and place. (Actually, I think I did originally post it on LW1?) It benefits from ongoing Gwern.net improvements, but mostly in a general typographical sense rather than being deeply integrated into other pages.
      
      The parts of it that keep changing do have wiki-like nature:
      
      For example, the parts from Jaynes would make sense as part of a ‘Legal Bayesianism’ article, which would be useful to invoke in many other posts like debates on Amanda Knox’s innocence.
      
      The basic concept of ‘information’ from information theory as whatever lets you narrow down a haystack into a needle (which is the ‘big idea’ of the essay—teaching you information theory by example, by dramatizing the hunt for a criminal leaking circumstantial evidence) is certainly a wiki-worthy topic that people could benefit from naming explicitly and appending their own discussions or views on.
      
      This could come up in many places, from looking for aliens to program search in DL scaling or AIXI paradigm or thinking about Bayesian reasoning (eg. Eliezer on how many bits of information it takes to make a hypothesis ‘live’ at all).
      
      Or that big list of side channels / deanonymization methods would make complete sense as a wiki entry which people could contribute piquant instances to, and would be useful linking elsewhere on LW2, particularly in articles on AGI security and why successfully permanently boxing a malevolent, motivated superintelligence would be extremely difficult—because there are bazillions of known side-channels which enable exfiltration/control/inference & we keep discovering new ones like how to turn computer chips into radios/speakers or entire families of attacks like SPECTRE or Rowhammer.
      
      (One reason I’ve invested so much effort into the tag-directory system is the hope of replacing such churning lists with transcludes of tags. The two major examples right now are https://gwern.net/dnm-archive#works-using-this-dataset and https://gwern.net/danbooru2021#publications—I want to track all users/citers of my datasets to establish their value for researchers & publicize those uses, but adding them manually was constant toil and increasingly redundant with the annotations. So with appropriate tooling, I switch to transcluding a tag for the citers instead. Any time a new user shows up, I just write an annotation for it, as I would have before, and add a dnm-archive or danbooru tag to it and then they show up automatically with no further work. So you could imagine doing the same thing in my DN essay: instead of that long unordered list, which is tedious to update every time there’s a fun security paper or blog post, I would instead have a tag like cs/security/side-channel where each of those is annotated, and simply transclude the table of contents for that. If I still wanted a natural-language summary similar to the existing list, well, I could just stick that at the top of the tag and benefit every instance of the tag.)
    - Chris_Leong 25 Mar 2024 8:20 UTC
      2 points
      0
      Parent
      Users can just create pages corresponding to their own categories
      Like Notion we could allow two-way links between pages so users would just tag the category in their own custom inclusions.
  - Chris_Leong 25 Mar 2024 8:24 UTC
    2 points
    0
    Parent
    I agree with Gwern. I think it’s fairly rare that someone wants to write the whole entry themselves or articles for all concepts in a topic.
    
    It’s much more likely that someone just wants to add their own idiosyncratic takes on a topic. For example, I’d love to have a convenient way to write up my own idiosyncratic takes on decision theory. I tried including some of these in the main Wiki, but it (understandably) was reverted.
    
    I expect that one of the main advantages of this style of content would be that you can just write a note without having to bother with an introduction or conclusion.
    I also think it would be fairly important (though not at the start) to have a way of upweighting the notes added by particular users.
    
    I agree with Gwern that this may result in more content being added to the main wiki pages when other users are in favour of this.
- Seth Herd 26 Mar 2024 20:31 UTC
  5 points
  2
  Parent
  TLDR: The only thing I’d add to Gwern’s proposal is making sure there are good mechanisms to discuss changes. Improving the wiki and focusing on it could really improve alignment research overall.
  
  Using the LW wiki more as a medium for collaborative research could be really useful in bringing new alignment thinkers up to speed rapidly. I think this is an important part of the overall project; alignment is seeing a burst of interest, and being able to rapidly make use of bright new minds who want to donate their time to the project might very well make the difference in adequately solving alignment in time.
  
  As it stands, someone new to the field has to hunt for good articles on any topic, and they provide some links to other important articles, but that’s not really their job. The wiki’s tags do serve that purpose. The articles are sometimes a good overview of that concept or topic, but more community focus on the wiki could make them work much better as a way
  
  Ideally each article aims to be a summary of current thinking on that topic, including both majority and minority views. One key element is making this project build community rather than strain it. Having people with different views work well collaboratively is a bit tricky. Good mechanisms for discussion are one way to reduce friction and any trend toward harsh feelings when one’s contributions are changed. The existing comment system might be adequate, particularly with more of a norm of linking changes to comments, and linking to comments from the main text for commentary.
- Dagon 24 Mar 2024 23:28 UTC
  5 points
  3
  Parent
  Do you have an underlying mission statement or goal that can guide decisions like this? IMO, there are plenty of things that should probably continue to live elsewhere, with some amount of linking and overlap when they’re lesswrong-appropriate.
  One big question in my mind is “should LessWrong use a different karma/voting system for such content?”. If the answer is yes, I’d put a pretty high bar for diluting LessWrong with it, and it would take a lot of thought to figure out the right way to grade “wanted on LW” for wiki-like articles that aren’t collections/pointers to posts.
- niplav 24 Mar 2024 23:59 UTC
  3 points
  0
  Parent
  One small idea: Have the ability to re-publish posts to allPosts or the front page after editing. This worked in the past, but now doesn’t anymore (as I noticed recently when updating this post).
  - habryka 25 Mar 2024 0:38 UTC
    5 points
    0
    Parent
    Yeah, the EA Forum team removed that functionality (because people kept triggering it accidentally). I think that was a mild mistake, so I might revert it for LW.
- Chris_Leong 25 Mar 2024 8:17 UTC
  2 points
  0
  Parent
  Cool idea, but before doing this one obvious inclusion would be to make it easier to tag LW articles, particularly your own articles, in posts by @including them.
habryka 3 May 2024 22:55 UTC
53 points
24
Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven’t followed this in detail, and my guess is it is basically just random chance, but it sure would be a huge deal if a publicly traded company now was performing assassinations of U.S. citizens.
Curious whether anyone has looked into this, or has thought much about baseline risk of assassinations or other forms of violence from economic actors.
- habryka 3 May 2024 23:03 UTC
  7 points
  0
  Parent
  @jefftk comments on the HN thread on this:
  How many people would, if they suddenly died, be reported as a “Boeing whistleblower”? The lower this number is, the more surprising the death.
  Another HN commenter says (in a different thread):
  It’s a nice little math problem.
  Let’s say both of the whistleblowers were age 50. The probability of a 50 year old man dying in a year is 0.6%. So the probability of 2 or more of them dying in a year is 1 - (the probability of exactly zero dying in a year + the probability of exactly one dying in a year). 1 - (A+B).
  A is (1-0.006)^N. B is 0.006N(1-0.006)^(N-1). At N = 60, A is about 70% and B is about 25%, making it statistically insignificant.
  But they died in the same 2 month period, so that 0.006 should be 0.001. If you rerun the same calculation, it’s 356.
  - Ben Pace 3 May 2024 23:35 UTC
    14 points
    5
    Parent
    I’m probably missing something simple, but what is 356? I was expecting a probability or a percent, but that number is neither.
    - elifland 4 May 2024 1:02 UTC
      20 points
      8
      Parent
      I think it’s the population size: 356 or more people are needed in the population for there to be a >5% chance of 2+ deaths in a 2 month span.
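
      The quoted calculation can be checked numerically. The sketch below assumes what the HN comment assumes (independent deaths, the same per-person probability p for everyone) and takes the 5% cutoff from the reading above; the function names are made up for illustration.

```python
def prob_two_or_more(n: int, p: float) -> float:
    """P(at least 2 deaths) among n people, each dying independently with probability p."""
    none = (1 - p) ** n               # term A in the quoted comment
    one = n * p * (1 - p) ** (n - 1)  # term B in the quoted comment
    return 1 - (none + one)


def smallest_population(p: float, threshold: float = 0.05) -> int:
    """Smallest n for which P(at least 2 deaths) reaches the threshold."""
    n = 2
    while prob_two_or_more(n, p) < threshold:
        n += 1
    return n
```

      With p = 0.006 the threshold is first reached at a population of 60, and with p = 0.001 at 356, which matches both numbers in the quoted comment.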
      - isabel 5 May 2024 0:40 UTC
        16 points
        8
        Parent
        I think there should be some sort of adjustment for Boeing not being exceptionally sus before the first whistleblower death—shouldn’t privilege Boeing until after the first death, should be thinking across all industries big enough that the news would report on the deaths of whistleblowers. which I think makes it not significant again.
      - aphyer 5 May 2024 13:02 UTC
        2 points
        0
        Parent
        Shouldn’t that be counting the number squared rather than the number?
  - Seth Herd 10 May 2024 22:56 UTC
    0 points
    −2
    Parent
    Ummm, wasn’t one of them just about to testify against Boeing in court, on their safety practices? And they “committed suicide” after saying the day before how much they were looking forward to finally getting a hearing on their side of the story? That’s what I read; I stopped at that point, thinking “about zero chance that wasn’t murder”.
    - habryka 10 May 2024 22:58 UTC
      5 points
      3
      Parent
      I think the priors here are very low, so while I agree it looks suspicious, I don’t think it’s remotely suspicious enough to have the correct posterior be “about zero chance that wasn’t murder”. Corporations, at least in the U.S., really very rarely murder people.
      - Seth Herd 10 May 2024 23:07 UTC
        1 point
        −2
        Parent
        That’s true, but the timing and incongruity of a “suicide” the day before testifying seems even more absurdly unlikely than corporations starting to murder people. And it’s not like they’re going out and doing it themselves; they’d be hiring a hitman of some sort. I don’t know how any of that works, and I agree that it’s hard to imagine anyone invested enough in their job or their stock options to risk a murder charge; but they may feel that their chances of avoiding charges are near 100%, so it might make sense to them.
        
        I just have absolutely no other way to explain the story I read (sorry I didn’t get the link since this has nothing to do with AI safety) other than that story being mostly fabricated. People don’t say “finally tomorrow is my day” in the evening and then put a gun in their mouth the next morning without being forced to do it. Ever. No matter how suicidal, you’re sticking around one day to tell your story and get your revenge.
        
        The odds are so much lower than somebody thinking they could hire a hit and get away with it, and make a massive profit on their stock options. They could well also have a personal vendetta against the whistleblower as well as the monetary profit. People are motivated by money and revenge, and they’re prone to misestimating the odds of getting caught. They could even be right that in their case it’s near zero.
        
        So I’m personally putting it at maybe 90% chance of murder.
- ChristianKl 6 May 2024 21:18 UTC
  4 points
  0
  Parent
  Poisoning someone with an MRSA infection seems possible, but if that’s what happened, it took capabilities that are not easily available. If such a thing happened in another case, people would likely speak of nation-state capabilities.
- Nathan Young 7 May 2024 9:34 UTC
  2 points
  0
  Parent
  I find this a very suspect detail, though the base rate of conspiracies is very low.
  “He wasn’t concerned about safety because I asked him,” Jennifer said. “I said, ‘Aren’t you scared?’ And he said, ‘No, I ain’t scared, but if anything happens to me, it’s not suicide.’”
  https://abcnews4.com/news/local/if-anything-happens-its-not-suicide-boeing-whistleblowers-prediction-before-death-south-carolina-abc-news-4-2024
habryka 14 Dec 2024 1:37 UTC
52 points
2
I have updated the OpenAI Email Archives to now also include all emails that OpenAI has published in their March 2024 and December 2024 blogposts!
I continue to think reading through these is quite valuable, and even more interesting with the March 2024 and December 2024 emails included.
- Kei Nishimura-Gasparian 14 Dec 2024 5:48 UTC
  9 points
  0
  Parent
  I think you flipped the names from the iMessage conversation. As per the caption in the OpenAI blog post, the blue bubbles are for Altman and the grey bubbles are for Zilis.
  - habryka 14 Dec 2024 6:30 UTC
    5 points
    0
    Parent
    You are correct. Seems like I got confused. Obvious in retrospect. Thank you for catching the error!
habryka 27 Sep 2024 1:23 UTC
51 points
7
AND THE GAME IS CLEAR. WRONGANITY SHALL SURVIVE ANOTHER DAY. GLORY TO EAST WRONG. GLORY TO WEST WRONG. GLORY TO ALL.
habryka 23 Mar 2024 4:57 UTC
46 points
7
Btw less.online is happening. LW post and frontpage banner probably going up Sunday or early next week.
habryka 28 Apr 2019 0:02 UTC
46 points
5
Thoughts on voting as approve/disapprove and agree/disagree:
One of the things that I am most uncomfortable with in the current LessWrong voting system is how often I feel conflicted between upvoting something because I want to encourage the author to write more comments like it, and downvoting something because I think the argument that the author makes is importantly flawed and I don’t want other readers to walk away with a misunderstanding about the world.
I think this effect quite strongly limits certain forms of intellectual diversity on LessWrong, because many people will only upvote your comment if they agree with it, and downvote comments they disagree with, and this means that arguments supporting people’s existing conclusions have a strong advantage in the current karma system. Whereas the most valuable comments are likely ones that challenge existing beliefs and that are rigorously arguing for unpopular positions.
A feature that has been suggested many times over the years is to split voting into two dimensions. One dimension being “agree/disagree” and the other being “approve/disapprove”. Only the “approve/disapprove” dimension matters for karma and sorting, but both are displayed relatively prominently on the comment (the agree/disagree dimension on the bottom, the approve/disapprove dimension at the top). I think this has some valuable things going for it, and in particular would make me likely to upvote more comments because I could simultaneously signal that while I think a comment was good, I don’t agree with it.
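
A rough sketch of the two-dimensional vote described above, to make the "only approval affects karma and sorting" rule concrete. The `Comment` class and `sort_by_karma` function are hypothetical illustrations, not the LessWrong data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comment:
    text: str
    approve: int = 0  # net approve/disapprove votes; the only input to karma
    agree: int = 0    # net agree/disagree votes; displayed but never ranked on

    def vote(self, approve: int = 0, agree: int = 0) -> None:
        """Record one user's vote; each axis takes -1, 0, or +1."""
        if approve not in (-1, 0, 1) or agree not in (-1, 0, 1):
            raise ValueError("votes must be -1, 0, or +1")
        self.approve += approve
        self.agree += agree


def sort_by_karma(comments: List[Comment]) -> List[Comment]:
    """Order comments by approval alone, ignoring agreement."""
    return sorted(comments, key=lambda c: c.approve, reverse=True)
```

In this sketch a comment that many readers approve of but disagree with still sorts to the top, while its negative agreement score stays visible next to it.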
An alternative way of doing this that Ray has talked about is the introduction of short reactions that users can click at the bottom of a comment, two of the most prominently displayed ones would be “agree/disagree”. Reactions would be by default non-anonymous and so would serve more as a form of shorthand comment instead of an alternative voting system. Here is an example of how that kind of UI might look:

I don’t know precisely what the selection menu for choosing reactions should look like. My guess is we want to have a relatively broad selection, maybe even with the ability to type something custom into it (obviously limiting the character count significantly).
I am most worried that this will drastically increase the clutter of comment threads and make things a lot harder to parse. In particular if the order of the reacts is different on each comment, since then there is no reliable way of scanning for the different kinds of information.
A way to improve on this might be by having small icons for the most frequent reacts, but that then introduces a pretty sharp learning curve into the site, and it’s always a pain to find icons for really abstract concepts like “agree/disagree”.
I think I am currently coming around to the idea of reactions being a good way to handle approve/disapprove, but also think it might make more sense to introduce it more as a new kind of vote that has more top-level support than simple reacts would have. Though in the most likely case this whole dimension will turn out to be too complicated and not worth the complexity costs (as 90% of feature ideas do).
- mako yass 1 May 2019 7:40 UTC
  17 points
  0
  Parent
  Having a reaction for “changed my view” would be very nice.
  Features like custom reactions give me this feeling that.. language will emerge from allowing people to create reactions that will be hard to anticipate but, in retrospect, crucial. Playing a similar role to the one body language plays during conversation, but designed, defined, explicit.
  If someone did want to introduce the delta through this system, it might be necessary to give the coiner of a reaction some way of linking an extended description. In casual exchanges.. I’ve found myself reaching for an expression that means “shifted my views in some significant lasting way” that’s kind of hard to explain in precise terms, and probably impossible to reduce to one or two words, but it feels like a crucial thing to measure. In my description, I would explain that a lot of dialogue has no lasting impact on its participants, it is just two people trying to better understand where they already are. When something really impactful is said, I think we need to establish a habit of noticing and recognising that.
  But I don’t know. Maybe that’s not the reaction type that will justify the feature. Maybe it will be something we can’t think of now.
  Generally, it seems useful to be able to take reduced measurements of the mental states of the readers.
  - Said Achmiz 1 May 2019 10:23 UTC
    12 points
    0
    Parent
    
    the language that will emerge from allowing people to create reactions that will be hard to anticipate but, in retrospect, crucial
    
    This is essentially the concept of a folksonomy, and I agree that it is potentially both applicable here and quite important.
- Rob Bensinger 28 Apr 2019 3:40 UTC
  5 points
  0
  Parent
  I am most worried that this will drastically increase the clutter of comment threads and make things a lot harder to parse. In particular if the order of the reacts is different on each comment, since then there is no reliable way of scanning for the different kinds of information.
  I like the reactions UI above, partly because separating it from karma makes it clearer that it’s not changing how comments get sorted, and partly because I do want ‘agree’/‘disagree’ to be non-anonymous by default (unlike normal karma).
  I agree that the order of reacts should always be the same. I also think every comment/post should display all the reacts (even just to say ‘0 Agree, 0 Disagree...’) to keep things uniform. That means I think there should only be a few permitted reacts: maybe start with just ‘Agree’ and ‘Disagree’, then wait 6+ months and see if users are especially clamoring for something extra.
  I think the obvious other reacts I’d want to use sometimes are ‘agree and downvote’ + ‘disagree and upvote’ (maybe shorten to Agree+Down, Disagree+Up), since otherwise someone might not realize that one and the same person is doing both, which loses a fair amount of this thing I want to be fluidly able to signal. (I don’t think there’s much value to clearly signaling that the same person agreed and upvoted or disagreed and downvoted a thing.)
  I would also sometimes click both the ‘agree’ and ‘disagree’ buttons, which I think is fine to allow under this UI. :)
- Said Achmiz 28 Apr 2019 3:28 UTC
  2 points
  0
  Parent
  Why not Slashdot-style?
  - habryka 28 Apr 2019 6:11 UTC
    5 points
    0
    Parent
    Slashdot has tags, but each tag still comes with a vote. In the above, the goal would be explicitly to allow for the combination of “upvoted though I still disagree” which I don’t think would work straightforwardly with the slashdot system.
    I also find it quite hard to skim for anything on Slashdot, including the tags (and the vast majority of users can’t add reactions on Slashdot at any given time, so there isn’t much UI for it).
habryka 29 Oct 2024 6:07 UTC
43 points
2
After many years of pain, LessWrong now has fixed kerning and a consistent sans-serif font on all operating systems. You have probably seen terrible kerning like this over the last few years on LW:
It really really looks like there is no space between the first comma and “Ash”. This is because Apple has been shipping an extremely outdated version of Gill Sans with terribly broken kerning, often basically stripping spaces completely. We have gotten many complaints about this over the years.
But it is now finally fixed. However, changing fonts likely has many downstream effects on various layout things being broken in small ways. If you see any buttons or text misaligned, let us know, and we’ll fix it. We already cleaned up a lot, but I am expecting a long tail of small fixes.
- MondSemmel 29 Oct 2024 12:05 UTC
  14 points
  26
  Parent
  I don’t know what specific change is responsible, but ever since that change, for me the comments are now genuinely uncomfortable to read.
- cubefox 29 Oct 2024 9:54 UTC
  13 points
  15
  Parent
  Did the font size in comments change? It does seem quite small now...
  - Kaj_Sotala 29 Oct 2024 10:22 UTC
    13 points
    19
    Parent
    Yeah it feels uncomfortably small to read to me now
    - Viliam 29 Oct 2024 11:26 UTC
      8 points
      12
      Parent
      Something felt uncomfortable today, but I can’t put my finger on it. Just a general feeling as if the letters are less sharp or less clearly separated or something like that.
      - habryka 29 Oct 2024 15:37 UTC
        3 points
        0
        Parent
        Guys, for this specific case you really have to say what OS you are using. Otherwise you might be totally talking past each other.
        
        (Font-size didn’t change on any OS, but the font itself changed from Calibri to Gill Sans on Windows. Gill Sans has a slightly smaller x-height so probably looks a bit smaller.)
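To illustrate why an unchanged font-size can still look smaller: perceived size tracks x-height more than the nominal size, so a font with a lower x-height needs a larger nominal size to look the same. Below is a rough sketch of that arithmetic. The two x-height ratios are hypothetical placeholders, not measured values for Calibri or Gill Sans, and `matched_font_size` is a made-up helper name. (CSS also has a `font-size-adjust` property aimed at this kind of mismatch.)

```python
def matched_font_size(old_size_px: float, old_x_ratio: float, new_x_ratio: float) -> float:
    """Font size at which the new font's x-height equals the old font's x-height.

    x-height in px = font size in px * (x-height / em) ratio.
    """
    if old_x_ratio <= 0 or new_x_ratio <= 0:
        raise ValueError("x-height ratios must be positive")
    return old_size_px * old_x_ratio / new_x_ratio


# Hypothetical ratios for illustration only; not measured from Calibri or Gill Sans.
OLD_X_RATIO = 0.47
NEW_X_RATIO = 0.45

comment_size_px = 15.08  # comment font size reported elsewhere in this thread
adjusted = matched_font_size(comment_size_px, OLD_X_RATIO, NEW_X_RATIO)
print(f"{adjusted:.2f}px")
```

With real ratios read from the font files, the same calculation would give the nominal size at which the new font's lowercase letters are as tall as the old font's.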
        Kaj_Sotala 29 Oct 2024 19:47 UTC
        8 points
        10
        Parent
        On Windows the font feels actively unpleasant right away, on Android it’s not quite as bad but feels like I might develop eyestrain if I read comments for a longer time.
        MondSemmel 29 Oct 2024 19:09 UTC
        6 points
        4
        Parent
        Up to a few days ago, the comments looked good on desktop Firefox, Windows 11, zoom level 150%. Now I find them uncomfortable to look at.
        habryka 29 Oct 2024 20:54 UTC
        2 points
        0
        Parent
        Plausible we might want to revert to Calibri on Windows, but I would like to make Gill Sans work. Having different font metrics on different devices makes a lot of detailed layout work much more annoying.
        Curious if you can say more about the nature of discomfort. Also curious whether fellow font optimizer @Said Achmiz has any takes, since he has been helpful here in the past, especially on the “making things render well on Windows” side.
        Said Achmiz 30 Oct 2024 6:34 UTC
        16 points
        0
        Parent
        Well, let’s see. Calibri is a humanist sans; Gill Sans is technically also humanist, but more geometric in design. Geometric sans fonts tend to be less readable when used for body text.
        
        Gill Sans has a lower x-height than Calibri. That (obviously) is the cause of all the “the new font looks smaller” comments.
        
        (A side-by-side comparison of the fonts, for anyone curious, although note that this is Gill Sans MT Pro, not Gill Sans Nova, so the weight [i.e., stroke thickness] will be a bit different than the version that LW now uses.)
        
        Now, as far as font rendering goes… I just looked at the site on my Windows box (adjusting the font stack CSS value to see Gill Sans Nova again, since I see you guys tweaked it to give Calibri priority)… yikes. Yeah, that’s not rendering well at all. Definitely more blurry than Calibri. Maybe something to do with the hinting, I don’t know. (Not really surprising, since Calibri was designed from the beginning to look good on Windows.) And I’ve got a hi-DPI monitor on my Windows machine…
        
        Interestingly, the older version of Gill Sans (seen in the demo on my wiki, linked above) doesn’t have this problem; it renders crisply on Windows. (Note that this is not the flawed, broken-kerning version of the font that comes with Macs!)
        
        I also notice that the comment font size is set to… 15.08px. Seems weird? Bumping it up to 16px improves things a bit, although it’s still not amazing.
        
        If you can switch to the older (but not broken) version of Gill Sans, that’d be my recommendation.
        
        If you can’t… then one option might be to check out one of the many similar fonts to see if perhaps one of them renders better on Windows while still having matching metrics.
        kave 31 Oct 2024 1:11 UTC
        3 points
        0
        Parent
        One sad thing about older versions of Gill Sans: Il1 all look the same. Nova at least distinguishes the 1.
        IMO, we should probably move towards system fonts, though I would like to choose something that preserves character a little more.
        habryka 30 Oct 2024 15:45 UTC
        2 points
        0
        Parent
        Interesting, thanks! Checking an older version of Gill Sans probably wouldn’t have been something I would have thought to do, so your help is greatly appreciated.
        I’ll experiment some with getting Gill Sans MT Pro.
        MondSemmel 29 Oct 2024 21:23 UTC
        7 points
        0
        Parent
        Comparing with this Internet Archive snapshot from Oct 6, both at 150% zoom, both in desktop Firefox in Windows 11: Comparison screenshot, annotated
        The new font seems… thicker, somehow? There’s a kind of eye test you do at the optician where they ask you if the letters seem sharper or just thicker (or something), and this font reminds me of that. Like something is wrong with the prescription of my glasses.
        The new font also feels noticeably smaller in some way. Maybe it’s the letter height? I lack the vocabulary to properly describe this. At the very least, the question mark looks noticeably weird. And e.g. in “t” and “p”, the upper and lower parts of the respective letter are weirdly tiny.
        Incidentally there were also some other differences in the shape and alignment of UI elements (see the annotated screenshot).
        MondSemmel 29 Oct 2024 21:30 UTC
        4 points
        0
        Parent
        Oh, and the hover tooltip for the agreement votes is now bugged; IIRC hovering over the agreement vote number is supposed to give you some extra info just like with karma, but now it just explains what agreement votes are.
        cubefox 29 Oct 2024 17:59 UTC
        6 points
        2
        Parent
        
        Guys, for this specific case you really have to say what OS you are using. Otherwise you might be totally talking past each other.
        
        (Font-size didn’t change on any OS, but the font itself changed from Calibri to Gill Sans on Windows. Gill Sans has a slightly smaller x-height so probably looks a bit smaller.)
        
        I tested it on Android, it’s the same for both Firefox and Chrome. The font looks significantly smaller than the old font, likely due to the smaller x-height you mentioned. Could the font size of the comments be increased a bit so that it appears visually about as large as the old one? Currently I find it too small to read comfortably. (Subjective font size is often different from the standard font size measure. E.g. Verdana appears a lot larger than Arial at the same standard “size”.)
        
        (A general note: some people are short sighted and wear glasses, and the more short-sighted you are, the stronger the glasses contract your field of view to a smaller area. So things that may appear as an acceptable size for people who aren’t particularly short-sighted, may appear too small for more short-sighted people.)
        Nathan Helm-Burger 29 Oct 2024 17:09 UTC
        6 points
        9
        Parent
        Yeah, using Firefox on both Android and Windows. Font looks terrible on the comments. Too small, and the letters are too smushed together. I was going to just change it on the client-side, but then noticed other people complaining.
        Couldn’t you please just set the comment font to the same as the post font? I would vastly prefer to have it all the same.
        habryka 29 Oct 2024 17:21 UTC
        6 points
        3
        Parent
        You definitely would not want the comment font be the same as the post font. Legibility would be really terrible for that serif font at the small font-size that you want to display comments as. I am confident it would be much worse for the vast majority of users (feel free to try it yourself). You could change both post font and comment font to a sans-serif, but that would get rid of a lot of the character of the site (and I prefer legibility of serif fonts at larger font sizes).
        Vladimir_Nesov 29 Oct 2024 23:23 UTC
        16 points
        10
        Parent
        
        would not want the comment font be the same as the post font [...] the small font-size that you want to display comments as
        
        I had to increase the zoom level by about 20% (from 110% to 130%) after this change to make the comments readable^[1]. This made post text too big to the point where I would normally adjust zoom level downward, but I can’t in this case^[2], since the comments are on the same site as the posts. Also the lines in both posts and comments are now too long (with greater zoom).
        
        I sit closer to the monitor than standard to avoid the need for glasses^[3], so long lines have higher angular distance. In practice modern sites usually have a sufficiently narrow column of text in the middle so this is almost never a problem. Before the update, LW line lengths were OK (at 110% zoom). At monitor/window width 1920px, substack’s 728px seems fine (at default zoom), but LW’s 682px gets ballooned too wide with 130% zoom.
        
        The point is not that accommodating sitting closer to the monitor is an important use case for a site’s designer, but that somehow the convergent design of most of the web manages to pass this test, so there might be more reasons for that.
        
        Incidentally, the footnote font size is 12.21px, even smaller than the comment font size of 15.08px.
        
        ↩︎
        The comment font still doesn’t feel “sharp”, like there’s more anti-aliasing at work. It’s Gill Sans Nova Medium, size 15.08px (130% zoom applies on top of that). OpenSans Regular 18px on RoyalRoad (100% zoom; as an example sans font) doesn’t have this problem. LW post text is fine (at either zoom), Warnock Pro 18.2px. I’m in Firefox on Arch Linux, 1920x1080.
        Here’s a zoomed-in screenshot from LW (from 130% zoom in Firefox):
        
        Here’s a zoomed-in screenshot from RoyalRoad (from 100% zoom in Firefox):
        
        ↩︎
        I previously never felt compelled to figure out how to automate font change in some places of a site.
        
        ↩︎
        That is, with more myopia than I have I would wear glasses, and with less myopia I would put the monitor further back on the desk.
        
        Nathan Helm-Burger 29 Oct 2024 17:47 UTC
        10 points
        −2
        Parent
        Small font-size? No! Same font-size! I don’t want the comments in a smaller font OR a different font! I want it all the same font as the posts, including the same size.
        This looks good to me:
        This looks terrible to me:
        Ben Pace 29 Oct 2024 18:48 UTC
        6 points
        2
        Parent
        Personally I like the different headspace I’m in for writing posts and comments that the styling gives. One is denser and smaller and less high-stakes, the other is bigger and more presentational, more like a monologue for a large audience.
        habryka 29 Oct 2024 18:18 UTC
        2 points
        1
        Parent
        You want higher content density for comments than for posts, so you need a smaller font size. You could sacrifice content density, but it would really make skimming comment threads a lot worse.
        jbash 30 Oct 2024 1:48 UTC
        6 points
        4
        Parent
        You may want higher density, but I don’t think you can say that I want high density at the expense of legibility.
        
        It takes a lot to make me notice layout, and I rarely notice fonts at all… unless they’re too small. I’m not as young as I used to be. This change made me think I must have zoomed the browser two sizes smaller. The size contrast is so massive that I have to actually zoom the page to read comfortably when I get to the comment section. It’s noticeably annoying, to the point of breaking concentration.
        
        I’ve mostly switched to RSS for Less Wrong^[1]. I don’t see your fonts at all any more, unless I click through on an article. The usual reason I click through is to read the comments (occasionally to check out the quick takes and popular comments that don’t show up on RSS). So the comments being inaccessible is doubly bad.
        
        My browser is Firefox on Fedora Linux, and I use a 40 inch 4K monitor (most of whose real estate is wasted by almost every Web site). I usually install most of the available font packages, and it says it’s rendering this text in “Gill Sans Nova Medium”.
        
        ↩︎
        My big reason for going to RSS was to mitigate the content prioritization system. I want to skim every headline, or at least every headline over some minimum threshold of “good”. On the other hand, I don’t want to have to look at any old headlines twice to see the new ones. I’m really minimally interested in either the software’s or the other users’ opinions of which material I should want to see. RSS makes it easier to get a simple chronological view; the built-in chronological view is weird and hard to navigate to. I really feel like I’m having to fight the site to see what I want to see.
        
        Nathan Helm-Burger 30 Oct 2024 3:52 UTC
        2 points
        0
        Parent
        Just want to chime in with agreement about annoyance over the prioritization of post headlines. One thing in particular that annoys me is that I haven’t figured out how to toggle off ‘seen’ posts showing up. What if I just want to see unread ones?
        Also, why can’t I load more at once instead of always having to click ‘load more’?
        habryka 30 Oct 2024 4:02 UTC
        2 points
        0
        Parent
        The “Recommended” tab filters out read posts by default. We never had much demand for showing recently-sorted posts while filtering out only ones you’ve read, but it wouldn’t be very hard to build.
        Not sure what you mean by “load more at once”. We could add a whole user setting to allow users to change the number of posts on the frontpage, but done consistently that would produce a ginormous number of user settings for everything, which would be a pain to maintain (not like, overwhelmingly so, but I would be surprised if it was worth the cost).
        Nathan Helm-Burger 29 Oct 2024 18:42 UTC
        2 points
        −5
        Parent
        That doesn’t make sense to me, but then, I’m clearly not the target audience since ‘skimming comment threads’ isn’t a thing I ever want to do. I want to read them, carefully and thoughtfully, like I do posts.
        This is, I think, related to how I feel that voting (karma or agreement) should be available only at the bottom of posts and comments, so that people are encouraged to actually read the post/comment before voting. Maybe even placed behind a reading comprehension quiz.
        Sodium 29 Oct 2024 18:50 UTC
        3 points
        0
        Parent
        I think knowing the karma and agreement is useful, especially to help me decide how much attention to pay to a piece of content, and I don’t think there’s that much distortion from knowing what others think. (i.e., overall benefits>costs)
        Nathan Helm-Burger 29 Oct 2024 18:55 UTC
        2 points
        0
        Parent
        I’m not saying you shouldn’t be able to see the karma and agreement at the top, just that you should only be able to contribute your own opinion at the bottom, after reading and judging for yourself.
        Said Achmiz 29 Oct 2024 18:39 UTC
        4 points
        −5
        Parent
        
        You definitely would not want the comment font be the same as the post font.
        
        This… seems straightforwardly false? Every one of GreaterWrong’s eight themes uses a single font for both posts and comments, and it doesn’t cause any problems. (And it’s a different font for each theme!)
        habryka 29 Oct 2024 20:51 UTC
        6 points
        0
        Parent
        (I think it’s quite costly and indeed one of the things I like least about the GW design, but also, I was more talking about a straightforward replacement.
        
        On LW we made a lot of subsequent design choices based on different content density, and the specific fonts we chose are optimized for their respective most commonly used font sizes. I am confident the average user experience would become worse if you just replaced the comment font with the body font)
        Said Achmiz 30 Oct 2024 5:44 UTC
        2 points
        0
        Parent
        
        I am confident the average user experience would become worse if you just replaced the comment font with the body font)
        
        Yeah, I agree with that, but that’s because of a post body font that wasn’t chosen for suitability for comments also. If you pick, to begin with, a font that works for both, then it’ll work for both.
        
        … of course, if you don’t think that any of the GW themes’ fonts work for both, then never mind, I guess. (But, uh, frankly I find that to be a strange view. But no accounting for taste, etc., so I certainly can’t say it’s wrong, exactly.)
        habryka 30 Oct 2024 6:00 UTC
        2 points
        0
        Parent
        Sure, I was just responding to this literal quote:
        Couldn’t you please just set the comment font to the same as the post font?
        Nathan Helm-Burger 29 Oct 2024 19:06 UTC
        2 points
        0
        Parent
        Good point! I went and looked at their themes. I prefer LessWrong’s look, except for the comments.
        Again, this doesn’t matter much to me since I can customize client-side, I just wanted to let habryka know that some people dislike the new comment font and would prefer the same font and size as the normal post font.
        My view on phone (Android, Firefox): https://imgur.com/a/Kt1OILQ
        How my client view looks on my computer:
        Nathan Helm-Burger 29 Oct 2024 20:04 UTC
        2 points
        0
        Parent
        How about running a poll to see what users prefer?
        habryka 29 Oct 2024 20:56 UTC
        10 points
        0
        Parent
        We have done lots of user interviews over the years! Fonts are always polarizing, but people have a strong preference for sans serifs at small font sizes (and people prefer denser comment sections, though it’s reasonably high variance).
        green_leaf 30 Oct 2024 6:39 UTC
        4 points
        1
        Parent
        I use Google Chrome on Ubuntu Budgie and it does look to me like both the font and the font size changed.
        DanielFilan 30 Oct 2024 0:02 UTC
        4 points
        2
        Parent
        It looks kinda small to me, someone who uses Firefox on Ubuntu.
        DanielFilan 31 Oct 2024 21:35 UTC
        4 points
        0
        Parent
        Update: I have already gotten over it.
        RobertM 1 Nov 2024 0:43 UTC
        4 points
        0
        Parent
        (We switched back to shipping Calibri above Gill Sans Nova pending a fix for the horrible rendering on Windows, so if Ubuntu has Calibri, it’ll have reverted back to the previous font.)
        DanielFilan 1 Nov 2024 20:07 UTC
        2 points
        0
        Parent
        I believe I’m seeing Gill Sans? But when I google “Calibri” I see text that looks like it’s in Calibri, so that’s confusing.
        kave 1 Nov 2024 20:59 UTC
        2 points
        0
        Parent
        Yeah, that’s a google Easter Egg. You can also try “Comic Sans” or “Trebuchet MS”.
        DanielFilan 1 Nov 2024 21:28 UTC
        2 points
        0
        Parent
        Sure, I’m just surprised it could work without me having Calibri installed.
        kave 1 Nov 2024 21:35 UTC
        4 points
        0
        Parent
        They load it in as a web font (i.e. you load Calibri from their server when you load that search page). We don’t do that on LessWrong
- Alex_Altair 30 Oct 2024 17:45 UTC
  7 points
  0
  Parent
  Positive feedback, I am happy to see the comment karma arrows pointing up and down instead of left and right. I have some degree of left-right confusion and was always clicking and unclicking my comment votes to figure out which was up and down.
  Also appreciate that the read time got put back into main posts.
  (Comment font stuff looks totally fine to me, both before and after this change.)
- Kaj_Sotala 29 Oct 2024 13:35 UTC
  4 points
  6
  Parent
  Seeing strange artifacts on some of the article titles on Chrome for Android (but not on desktop)
- ShardPhoenix 29 Oct 2024 8:16 UTC
  4 points
  0
  Parent
  Thanks for fixing this. The ‘A’ thing in particular multiple times caused me to try to edit comments thinking that I’d omitted a space.
- metachirality 30 Oct 2024 3:34 UTC
  3 points
  2
  Parent
  Aaaa! I’m used to Arial or whatever Windows’ default display font is. The larger stroke weight is rather uncomfortable to me.
  - habryka 30 Oct 2024 3:37 UTC
    4 points
    2
    Parent
    We previously had Calibri for Windows (indeed a very popular Windows system font). Gill Sans (which we now ship to all operating systems) is a quite popular macOS and iOS system font. I currently think there are some weird rendering issues on Windows, but if that’s fixed, my guess is you would get used to it quickly enough. Gill Sans is not a rare font on the internet.
- Thomas Kwa 7 Nov 2024 2:08 UTC
  2 points
  0
  Parent
  The new font doesn’t have a few characters useful in IPA.
  - habryka 7 Nov 2024 3:21 UTC
    2 points
    −1
    Parent
    Ah, we should maybe font-subset some system font for that (same as what we did for Greek characters). If someone gives me a character range specification I could add it.
- Garrett Baker 1 Nov 2024 0:00 UTC
  2 points
  0
  Parent
  The footnote font on the side of comments is bigger than the font in the comments. Presumably this is unintentional. ^[1]
  ↩︎
  Look at me! I’m big font! You fee fi fo fum, I’m more important than the actual comment!
  - Garrett Baker 1 Nov 2024 0:01 UTC
    2 points
    0
    Parent
    wait I just used inspect element, and the font only looks bigger so nevermind
- Vladimir_Nesov 30 Oct 2024 2:21 UTC
  2 points
  0
  Parent
  Bug: I can no longer see the number of agreement-votes (which is distinct from the number of Karma-votes). It shows the Agreement Downvote tooltip when hovering over the agreement score (the same for Karma score works correctly, saying for example “This comment has 31 overall karma (17 Votes)”).
  
  Edit: The number of agreement votes can be seen when hovering over two narrow strips, probably 1 pixel high, one right above and one right below the agreement rating.
  - habryka 30 Oct 2024 2:55 UTC
    2 points
    0
    Parent
    Yep, definitely a bug. Should be fixed soon.
- Measure 29 Oct 2024 11:21 UTC
  2 points
  0
  Parent
  Something weird is happening for me where ‘e’ and ‘o’ in italic text appear to extend below the line (wrong vertical size or position) so that the whole looks jumbled. It’s very noticeable at 100% zoom, but at much higher zoom levels it goes away.
  
  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
  - Measure 29 Oct 2024 11:31 UTC
    2 points
    0
    Parent
    I think this was caused by my OS-level UI scale setting. I didn’t notice anything with the previous font, but I can adjust it a bit to work around this I think.
    - habryka 29 Oct 2024 15:38 UTC
      2 points
      0
      Parent
      Interesting. What OS and what setting?
      - Measure 29 Oct 2024 16:06 UTC
        2 points
        0
        Parent
        Windows 10. I have a large HD monitor, and the default UI is really small, so I use the “make everything bigger” display setting at 150% to compensate. There is a separate “make text bigger” setting, and the problem goes away when I set that to 102%. I’m guessing there’s a slight real difference that was being exaggerated by pixel rounding.
habryka 2 Dec 2024 4:06 UTC
41 points
2
We were down between around 7PM and 8PM PT today. Sorry about that.
It’s hard to tell whether we got DDoSed or someone just wanted to crawl us extremely aggressively, but we’ve had at least a few hundred IP addresses and random user agents request a lot of quite absurd pages, in a way that was clearly designed to avoid bot-detection and blocking methods.
I wish we were more robust to this kind of thing, and I’ll be monitoring things tonight to prevent it from happening again, but it would be a whole project to make us fully robust to attacks of this kind. I hope it was a one-off occurrence.
But also, I think we can figure out how to make it so we are robust to repeated DDoS attacks, if that is the world we live in. I do think it would mean strapping in for a few days of spotty reliability while we figure out how to do that.
Sorry again, and boo for the people doing this. It’s one of the reasons why running a site like LessWrong is harder than it should be.
habryka 15 Nov 2024 18:53 UTC
41 points
5
A bunch of very interesting emails between Elon, Sam Altman, Ilya and Greg were released (I think in some legal proceedings, but not sure). It would IMO be cool for someone to gather them all and do some basic analysis of them.
https://x.com/TechEmails/status/1857456137156669765
https://x.com/TechEmails/status/1857285960997712356
- interstice 15 Nov 2024 22:07 UTC
  24 points
  0
  Parent
  These emails and others can be found in document 32 here.
  - Nisan 16 Nov 2024 10:57 UTC
    4 points
    0
    Parent
    check out exhibit 13...
- dirk 15 Nov 2024 20:40 UTC
  14 points
  0
  Parent
  TechEmails’ substack post with the same emails in a more centralized format includes citations; apparently these are mostly from Elon Musk, et al. v. Samuel Altman, et al. (2024)
- ryan_greenblatt 16 Nov 2024 17:48 UTC
  2 points
  0
  Parent
  For reference, @habryka has now posted them here.
habryka 19 Sep 2019 4:49 UTC
41 points
0
What is the purpose of karma?
LessWrong has a karma system, mostly based off of Reddit’s karma system, with some improvements and tweaks to it. I’ve thought a lot about more improvements to it, but one roadblock that I always run into when trying to improve the karma system, is that it actually serves a lot of different uses, and changing it in one way often means completely destroying its ability to function in a different way. Let me try to summarize what I think the different purposes of the karma system are:
Helping users filter content
The most obvious purpose of the karma system is to determine how long a post is displayed on the frontpage, and how much visibility it should get.
Being a social reward for good content
This aspect of the karma system comes out more when thinking about Facebook “likes”. Often when I upvote a post, it is more of a public signal that I value something, with the goal that the author will feel rewarded for putting their effort into writing the relevant content.
Creating common-knowledge about what is good and bad
This aspect of the karma system comes out the most when dealing with debates, though it’s present in basically any karma-related interaction. The fact that the karma of a post is visible to everyone helps people establish common knowledge of what the community considers to be broadly good or broadly bad. Seeing an insult downvoted does more than just filter it out of people’s feeds; it also makes it so that anyone who stumbles across it learns something about the norms of the community.
Being a low-effort way of engaging with the site
On LessWrong, Reddit and Facebook, karma is often the simplest action you can take on the site. This means it’s usually key for a karma system like that to be extremely simple, and not require complicated decisions, since that would break the basic engagement loop with the site.
Problems with alternative karma systems
Here are some of the most common alternatives to our current karma system, and how they perform on the above dimensions:
Eigenkarma as weighted by a set of core users
The basic idea here is that you try to signal-boost a small set of trusted users, by giving people voting power that is downstream from the initially defined set of users.
There are some problems with this. The first one is whether to assign any voting power to new users. If you don’t, you remove a large part of the value of having a low-effort way of engaging with your site.
It also forces you to separate the points that you get on your content, from your total karma score, from your “karma-trust score” which introduces some complexity into the system. It also makes it so that increases in the points of your content, no longer neatly correspond to voting events, because the underlying reputation graph is constantly shifting and changing, making the social reward signal a lot weaker.
In exchange for this, you likely get a system that is better at filtering content, and probably has better judgement about what should be made common-knowledge or not.
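To make the tradeoff concrete, here is a minimal sketch of one way such a system could work. It is an assumption on my part, not a description of any real LessWrong code: it treats eigenkarma as a personalized-PageRank-style computation, where trust starts at a seed set of core users and flows along upvotes. The function name `eigenkarma`, the `damping` parameter, and the example users are all hypothetical.

```python
def eigenkarma(upvotes, seeds, damping=0.85, iterations=50):
    """Propagate trust from a seed set of core users along the upvote graph.

    upvotes: dict mapping voter -> dict mapping author -> number of upvotes given.
    seeds: set of initially trusted users (hypothetical "core users").
    Returns a dict mapping user -> trust score; scores sum to 1.
    """
    users = set(upvotes) | set(seeds)
    for given in upvotes.values():
        users |= set(given)

    seed_weight = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in users}
    trust = dict(seed_weight)

    for _ in range(iterations):
        # A fixed share of trust is always re-anchored on the seed set.
        new_trust = {u: (1 - damping) * seed_weight[u] for u in users}
        for voter in users:
            given = upvotes.get(voter, {})
            total = sum(given.values())
            if total == 0:
                # Users who have upvoted nobody hand their trust back to the seeds.
                for s in seeds:
                    new_trust[s] += damping * trust[voter] / len(seeds)
                continue
            # Otherwise a voter's trust is split among the authors they upvoted.
            for author, count in given.items():
                new_trust[author] += damping * trust[voter] * count / total
        trust = new_trust
    return trust


# Hypothetical example: alice is the only core user, dave is a brand-new account.
example = {"alice": {"bob": 1}, "bob": {"carol": 1}, "dave": {"alice": 1}}
scores = eigenkarma(example, {"alice"})
print(scores["dave"])  # dave has received no upvotes, so no trust reaches him
```

In this sketch a new user like `dave` ends up with no voting power at all, which is the first problem described above. A single new upvote also shifts everyone's score a little, so changes in a post's points stop lining up with individual voting events.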
Prediction-based system
I was talking with Scott Garrabrant today, who was excited about a prediction-based karma system. The basic idea is to just have a system that tries to do its best to predict what rating you are likely to give to a post, based on your voting record, the post, and other people’s votes.
In some sense this is what Youtube and Facebook are doing in their systems, though he was unhappy with the transparency of what they were doing.
The biggest sacrifice I see in creating this system, is the loss in the ability to create common knowledge, since now all votes are ultimately private, and the ability for karma to establish social norms, or just common knowledge about foundational facts that the community is built around, is greatly diminished.
I also think it diminishes the degree to which votes can serve as a social reward signal, since there is no obvious thing to inform the user of when their content got votes on. No number that went up or down, just a few thousand weights in some distant predictive matrix, or neural net.
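As a rough illustration only, and not what Scott proposed in any detail, here is a minimal sketch that stands in for "predict what rating you are likely to give" with plain user-to-user collaborative filtering. The function names, the vote encoding (+1 / -1), and the example users are assumptions made for the example.

```python
import math


def similarity(votes_a, votes_b):
    """Cosine similarity between two voting records over the posts both voted on."""
    shared = set(votes_a) & set(votes_b)
    if not shared:
        return 0.0
    dot = sum(votes_a[p] * votes_b[p] for p in shared)
    norm_a = math.sqrt(sum(votes_a[p] ** 2 for p in shared))
    norm_b = math.sqrt(sum(votes_b[p] ** 2 for p in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_vote(votes, user, post):
    """Predict the vote `user` would give `post`, in the range [-1, 1].

    votes: dict mapping user -> dict mapping post -> vote (+1 or -1).
    Other users' votes on the post are weighted by how similarly they
    have voted to `user` in the past.
    """
    own_record = votes.get(user, {})
    weighted, total = 0.0, 0.0
    for other, record in votes.items():
        if other == user or post not in record:
            continue
        w = similarity(own_record, record)
        weighted += w * record[post]
        total += abs(w)
    return weighted / total if total else 0.0


# Hypothetical voting records.
votes = {
    "reader": {"p1": 1, "p2": -1},
    "ally": {"p1": 1, "p2": -1, "p3": 1},
    "rival": {"p1": -1, "p2": 1, "p3": -1},
}
print(predict_vote(votes, "reader", "p3"))
```

Note that the output is a different number for every reader. There is no single public score for `p3` that everyone sees, which is the loss of common knowledge described above.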
Augmenting experts
A similar formulation to the eigenkarma system is the idea of trying to augment experts, by rewarding users in proportion to how successful they are at predicting how a trusted expert would vote, and then using that predicted expert’s vote as the reward signal. Periodically, you do query the trusted expert, and use that to calibrate and train the users who are trying to predict the expert.
This still allows you to build common-knowledge, and allows you to have effective reward signals (“simulated Eliezer upvoted your comment”), but does run into problems when it comes to being a low-effort way of engaging with the site. The operation of “what would person X think about this comment” is a much more difficult one than “did I like this comment?”, and as such might deter a large number of users from using your site.
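A minimal sketch of how the calibration loop could look, under my own assumptions: I use a multiplicative-weights update as a stand-in for "rewarding users in proportion to how successful they are", and the class name, `learning_rate`, and example users are hypothetical.

```python
class ExpertPredictionPool:
    """Aggregates users' guesses about how a trusted expert would vote."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # user -> how much their predictions currently count

    def predicted_expert_vote(self, predictions):
        """Weighted mean of predicted expert votes, in the range [-1, 1].

        predictions: dict mapping user -> predicted expert vote (+1 or -1).
        New users start with weight 1.0.
        """
        total = sum(self.weights.setdefault(u, 1.0) for u in predictions)
        if total == 0:
            return 0.0
        return sum(self.weights[u] * v for u, v in predictions.items()) / total

    def calibrate(self, predictions, expert_vote):
        """Called on the occasions when the real expert is actually queried.

        Users who predicted wrongly lose a fraction of their weight.
        """
        for user, vote in predictions.items():
            weight = self.weights.setdefault(user, 1.0)
            if vote != expert_vote:
                self.weights[user] = weight * (1 - self.learning_rate)


# Hypothetical usage: two users disagree, then the expert is queried once.
pool = ExpertPredictionPool()
guesses = {"careful_user": 1, "careless_user": -1}
print(pool.predicted_expert_vote(guesses))  # evenly split before any calibration
pool.calibrate(guesses, expert_vote=1)
print(pool.predicted_expert_vote(guesses))  # now leans towards careful_user
```

The aggregate is still one public number per comment, so it can serve as common knowledge and as a reward signal. The extra effort sits entirely with the voters, who have to model the expert instead of reporting their own reaction.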
- Ruby 25 Sep 2019 5:37 UTC
  5 points
  0
  Parent
  This is really good and I missed it until now. I vote for you making this a full-on post. I think it’s fine as is for that.
habryka 20 Dec 2024 20:21 UTC
39 points
44
Is it OK for LW admins to look at DM metadata for spam prevention reasons?
Sometimes new users show up and spam a bunch of other users in DMs (in particular high-profile users). We can’t limit DM usage to only users with activity on the site, because many valuable DMs get sent by people who don’t want to post publicly. We have some basic rate limits for DMs, but of course those can’t capture many forms of harassment or spam.
Right now, admins can only see how many DMs users have sent, and not who users have messaged, without making a whole manual database query, which we have a policy of not doing unless we have a high level of suspicion of malicious behavior. However, I feel like it would be quite useful for identifying who is doing spammy things if we could also see who users have sent DMs to, but of course, this might feel bad from a privacy perspective to people.
So I am curious about what others think. Should admins be able to look at DM metadata to help us identify who is abusing the DM system? Or should we stick to aggregate statistics like we do right now? (React or vote “agree” if you think we should use DM metadata, and react or vote “disagree” if you think we should not use DM metadata).
- Dagon 20 Dec 2024 20:40 UTC
  22 points
  28
  Parent
  I have no expectation of strong privacy on the site. I do expect politeness in not publishing or using my DM or other content, but that line is fuzzy and monitoring for spam (not just metadata; content and similarity-of-content) is absolutely something I want from the site.
  
  For something actually private, I might use DMs to establish a mechanism. Feel free to look at that.
  
  If you -do- intend to provide real privacy, you should formalize the criteria, and put up a canary page that says you have not been asked to reveal any data under a sealed order.
  
  edit to add: I am relatively paranoid about privacy, and also quite technically-savvy in implementation of such. I’d FAR rather the site just plainly say “there is no expectation of privacy, act accordingly” than that it try to set expectations otherwise, but then have to move the line later. Your Terms of Service are clear, and make no distinction for User Generated Content between posts, comments, and DMs.
- habryka 20 Dec 2024 20:44 UTC
  20 points
  9
  Parent
  An obvious thing to have would be a very easy “flag” button that a user can press if they receive a DM, and if they press that we can look at the DM content they flagged, and then take appropriate action. That’s still kind of late in the game (I would like to avoid most spam and harassment before it reaches the user), but it does seem like something we should have.
  - tailcalled 20 Dec 2024 22:53 UTC
    2 points
    0
    Parent
    I wonder if you could also do something like, have an LLM evaluate whether a message contains especially-private information (not sure what that would be… gossip/reputationally-charged stuff? sexually explicit stuff? planning rebellions? doxxable stuff?), and hide those messages while looking at other ones.
    
    Though maybe that’s unhelpful because spambot authors would just create messages that trigger these filters?
    - Dagon 21 Dec 2024 0:11 UTC
      4 points
      1
      Parent
      This is going the wrong direction. If privacy from admins is important (I argue that it’s not for LW messages, but that’s a separate discussion), then breaches of privacy should be exceptions for specific purposes, not allowed unless “really secret contents”.
      
      Don’t make this filter-in for privacy. Make it filter-out: if it’s detected as likely-spam, THEN take more intrusive measures. Privacy-preserving measures include quarantining or asking a few recipients if they consider it harmful before delivering (or not) the rest, automated content filters, etc. This infrastructure requires a fair bit of data-handling work to get it right, and a mitigation process where a sender can find out they’re blocked and explicitly ask the moderator(s) to allow it.
      - tailcalled 24 Dec 2024 10:54 UTC
        2 points
        0
        Parent
        The reason I suggest making it filter-in is because it seems to me that it’s easier to make a meaningful filter that accurately detects a lot of sensitive stuff than a filter that accurately detects spam, because “spam” is kind of open-ended. Or I guess in practice spam tends to be porn bots and crypto scams? (Even on LessWrong?!) But e.g. truly sensitive talk seems disproportionately likely to involve cryptography and/or sexuality, so trying to filter for porn bots and crypto scams seems relatively likely to reveal sensitive stuff.
        The filter-in vs filter-out in my proposal is not so much about the degree of visibility. Like you could guard my filter-out proposal with the other filter-in proposals, like to only show metadata and only inspect suspected spammers, rather than making it available for everyone.
- Lucius Bushnaq 21 Dec 2024 7:24 UTC
  10 points
  10
  Parent
  I did have a pretty strong expectation of privacy for LW DMs. That was probably dumb of me.
  
  This is not due to any explicit or implicit promise by the mods or the site interface I can recall. I think I was just automatically assuming that strong DM privacy would be a holy principle on a forum with respectable old-school internet culture around anonymity and privacy. This wasn’t really an explicitly considered belief. It just never occurred to me to question this. Just like I assume that doxxing is probably an offence that can result in an instant ban, even though I never actually checked the site guidelines on that.
  
  The site is not responsible for my carelessness on this, but if there was an attention-grabbing box in the DM interface making it clear that mods do look at DMs and DM metadata under some circumstances that fall short of a serious criminal investigation or an apocalypse, I would have appreciated that.
  - habryka 21 Dec 2024 8:54 UTC
    10 points
    0
    Parent
    FWIW, de-facto I have never looked at DMs or DM metadata, unless multiple people reached out to us about a person spamming or harassing them, and then we still only looked at the DMs that that person sent.
    So I think your prior here wasn’t crazy. It is indeed the case that we’ve never acted against it, as far as I know.
- Kaj_Sotala 20 Dec 2024 22:15 UTC
  7 points
  4
  Parent
  I think it’s fine if the users are clearly informed about this happening, e.g. the DM interface showing a small message that explains how metadata is used. (But I think it shouldn’t be any kind of one-time consent box that’s easy to forget about.)
  - eukaryote 21 Dec 2024 0:15 UTC
    4 points
    0
    Parent
    Yeah, agree. (Also agree with Dagon in not having an existing expectation of strong privacy in LW DMs. Weak privacy, yes, like that mods wouldn’t read messages as a matter of course.)
    
    Here’s how I would think to implement this unintrusively: little ℹ️-type icon on a top corner of the screen of the DM interface screen (or to the side of the “Conversation with XYZ” header, or something.) When you click on that icon, it toggles a writeup about circumstances in which information from the message might be sent to someone else (what information and who.)
- ChristianKl 21 Dec 2024 12:13 UTC
  5 points
  2
  Parent
  Given the relative lack of cybersecurity, I think there’s a good chance of LessWrong being hacked by outside parties and privacy being breached. Message content that’s really sensitive, like sharing AI safety related secrets, likely shouldn’t flow through LessWrong private messages.
  One class where people might really want privacy is around reporting abuses by other people. If Alice writes a post about how Bob abused her, Carol might want to write Alice a message about Bob abusing her as well, while caring about privacy because Carol fears retaliation.
  I think it would be worth having an explicit policy about how such information is handled, but looking at the DM metadata seems to me like it wouldn’t cause huge problems.
- davekasten 21 Dec 2024 17:41 UTC
  3 points
  1
  Parent
  In an ideal world (perhaps not reasonable given your scale), you would have some sort of permissions and logging against some sensitive types of queries on DM metadata. (E.g., perhaps you would let any Lighthaven team member see on the dashboard the “rate of DMs from accounts <1 month in age compared to historic baseline” aggregate number, but “how many DMs has Bob (an account over 90 days old) sent to Alice” would require more guardrails.)
  
  Edit: to be clear, I am comfortable with you doing this without such logging at your current scale and think it is reasonable to do so.
  - Karl Krueger 21 Dec 2024 18:24 UTC
    7 points
    1
    Parent
    In a former job where I had access to logs containing private user data, one of the rules was that my queries were all recorded and could be reviewed. Some of them were automatically visible to anyone else with the same or higher level of access, so if I were doing something blatantly bad with user data, my colleagues would have a chance of noticing.
    - habryka 21 Dec 2024 19:24 UTC
      3 points
      0
      Parent
      Yeah, I’ve been thinking of setting up something like this.
- yc 21 Dec 2024 3:12 UTC
  3 points
  0
  Parent
  Could make this a report-based system? If the user reported a potential spam, then in the submission process ask for reasons, and ask for consent to look over the messages (between the reporter and the alleged spammer); if multiple people reported the same person it will be obvious this account is spamming with DM?
  
  edit: just saw previous comment on this too
- mako yass 24 Dec 2024 1:33 UTC
  2 points
  0
  Parent
  Okay if send rate gives you a reason to think it’s spam. Presumably you can set up a system that lets you invade the messages of new accounts sending large numbers of messages that doesn’t require you to cross the bright line of doing raw queries.
- plex 21 Dec 2024 13:46 UTC
  2 points
  0
  Parent
  I’d be ~entirely comfortable with this given some constraints (e.g. a simple heuristic which flags the kind of suspicious behaviour for manual review, and wouldn’t capture the vast majority of normal LW users). I’d be slightly but not strongly uncomfortable with the unconstrained version.
habryka 30 Aug 2019 20:48 UTC
33 points
0
I just came back from talking to Max Harms about the Crystal trilogy, which made me think about rationalist fiction, or the concept of hard sci-fi combined with explorations of cognitive science and philosophy of science in general (which is how I conceptualize the idea of rationalist fiction).

I have a general sense that one of the biggest obstacles for making progress on difficult problems is something that I would describe as “focusing attention on the problem”. I feel like after an initial burst of problem-solving activity, most people when working on hard problems, either give up, or start focusing on ways to avoid the problem, or sometimes start building a lot of infrastructure around the problem in a way that doesn’t really try to solve it.

I feel like one of the most important tools/skills that I see top scientists or problem solvers in general use is utilizing workflows and methods that allow them to focus on a difficult problem for days and months, instead of just hours.

I think at least for me, the case of exam environments displays this effect pretty strongly. I have a sense that in an exam environment, if I am given a question, I successfully focus my full attention on a problem for a full hour, in a way that often easily outperforms me thinking about a problem in a lower key environment for multiple days in a row.

And then, when I am given a problem set with concrete technical problems, my attention is again much better focused than when I am given the same problem but in a much less well-defined way. E.g. thinking about solving some engineering problem, but without thinking about it by trying to create a concrete proof or counterproof.

My guess is that there is a lot of potential value in fiction that helps people focus their attention on a problem in a real way. In fiction you have the ability to create real-feeling stakes that depend on problem solving, and things like the final exam in Methods of Rationality show how that can be translated into large amounts of cognitive labor.

I think my strongest counterargument to this model is something like “sure, it’s easy to make progress on problems when you have someone else give you the correct ontology in which the problem is solvable, but that’s just because 90% of the work of solving problems is coming up with the right ontologies for problems like this”. And I think there is something importantly real about this, but also that it doesn’t fully address the value of exams and fiction and problem sets that I am trying to point to (though I do think it explains a good chunk of their effect).

Going back to the case of fiction, it is clear to me that fiction is as a literary form much more optimized to hold human attention than most non-fiction is. I think first of all that this constraint means that most fiction (and in particular most popular fiction) isn’t about much else than whatever best holds people’s attention, but it also means that if the bottleneck on a lot of problems is just getting people to hold their attention on the problem for a while, then utilizing the methods that fiction-writing has developed seems like an obvious way of making progress on those problems.

I feel like another major effect that explains a lot of the effects that I observe is people believing that a problem is solvable. In a fictional setting, if the author promises you that things have a good explanation, then it’s motivating to figure out why. On an exam you are promised that the problems that you are given are solvable, and solvable within a reasonable amount of time.

I do think this can still be exploited. In the last few chapters of HPMOR, Harry does a mental motion that I would describe as “don’t waste mental energy on asking whether a problem is solvable, just pretend it is, and ask what the solution would be if it was solvable”, in a way that felt to me like it would work on a lot of real-world problems.
- Eli Tyre 26 Nov 2019 4:24 UTC
  6 points
  0
  Parent
  I feel like one of the most important tools/skills that I see top scientists or problem solvers in general use is utilizing workflows and methods that allow them to focus on a difficult problem for days and months, instead of just hours.
  This is a really important point, which I kind of understood (“research” means having threads of inquiry that extend into the past and future), but I hadn’t been thinking of it in terms of workflows that facilitate that kind of engagement.
  - habryka 26 Nov 2019 4:33 UTC
    2 points
    0
    Parent
    nods I’ve gotten a lot of mileage over the years from thinking about workflows and systems that systematically direct your attention towards various parts of reality.
- Viliam 31 Aug 2019 14:10 UTC
  4 points
  2
  Parent
  Warning: HPMOR spoilers!
  I suspect that fiction can conveniently ignore the details of real life that could ruin seemingly good plans.
  Let’s look at HPMOR.
  The general idea of “create a nano-wire, then use it to simultaneously kill/cripple all your opponents” sounds good on paper. Now imagine yourself, at that exact situation, trying to actually do it. What could possibly go wrong?
  As a first objection, how would you actually put the nano-wire in the desired position? Especially when you can’t even see it (otherwise the Death Eaters and Voldemort would see it too). One mistake would ruin the entire plan. What if the wind blows and moves your wire? If one of the Death Eaters moves a bit, and feels a weird stinging at the side of their neck?
  Another objection, when you pull the wire to kill/cripple your opponents, how far do you actually have to move it? Assuming a dozen Death Eaters (I do not remember the exact number in the story), if you need 10 cm for an insta-kill, that’s 1.2 meters you need to do before the last one kills you. Sounds doable, but also like something that could possibly go wrong.
  In other words, I think that in real life, even Harry Potter’s plan would most likely fail. And if he is smart enough, he would know it.
  The implication for real life is that, similarly, smart plans are still likely to fail, and you know it. Which is probably why you are not trying hard enough. You probably already remember situations in your past when something seemed like a great idea, but still failed. Your brain may predict that your new idea would belong to the same reference class.
  - habryka 31 Aug 2019 17:06 UTC
    8 points
    2
    Parent
    While I agree that this is right, your two objections are both explicitly addressed within the relevant chapter:
    “As a first objection, how would you actually put the nano-wire in the desired position? Especially when you can’t even see it (otherwise the Death Eaters and Voldemort would see it too). One mistake would ruin the entire plan. What if the wind blows and moves your wire? If one of the Death Eaters moves a bit, and feels a weird stinging at the side of their neck?”
    Harry first transfigures a much larger spiderweb, which also has the advantage of being much easier to move in place, and to not be noticed by people that are interacting with it.
    “Another objection, when you pull the wire to kill/cripple your opponents, how far do you actually have to move it? Assuming a dozen Death Eaters (I do not remember the exact number in the story), if you need 10 cm for an insta-kill, that’s 1.2 meters you need to do before the last one kills you. Sounds doable, but also like something that could possibly go wrong.”
    Indeed, which is why Harry was weaving the web into an interwoven circle that contracts simultaneously in all directions.
    Obviously things could have still gone wrong, and Eliezer has explicitly acknowledged that HPMOR is a world in which complicated plans definitely succeed a lot more than they would in the normal world, but he did try to at least cover the obvious ways things could go wrong.
    - Ben Pace 31 Aug 2019 19:14 UTC
      2 points
      0
      Parent
      I have covered both of your spoilers in spoiler tags (“>!”).
- eigen 31 Aug 2019 16:51 UTC
  2 points
  0
  Parent
  Yes, fiction has a lot of potential to change mindsets. Many philosophers actually look to the greatest novelists, inferring the motives and solutions of their heroes, to come up with general theories that touch the very core of how our society is laid out.
  Most of this comes from the fact that we are already immersed in a meta-story, externally and internally. Much of our effort is focused on internal rationalizations to gain something where a final outcome has already been thought out, whether or not we are consciously aware of it.
  I think that in fiction this is laid out perfectly. So analyzing fiction is rewarding in a sense. Especially when realizing that when we go to exams or interviews we’re rapidly immersing ourselves in an isolated story with motives and objectives (what we expect to happen), we create our own little world, our own little stories.
habryka 24 Jan 2025 8:29 UTC
29 points
0
Sorry for the downtime. Another approximate DDoS/extremely aggressive crawler. We are getting better at handling these, but this one was another 10x bigger than previous ones, and so kicked over a different part of our infrastructure.
- trevor 24 Jan 2025 21:52 UTC
  2 points
  0
  Parent
  This got me thinking, how much space would it take up in Lighthaven to print a copy of every LessWrong post ever written? If it’s not too many pallets then it would probably be a worthy precaution.
  - Roman Malov 24 Jan 2025 22:52 UTC
    4 points
    2
    Parent
    Why not just save them to an offline hard drive?
  - RobertM 27 Jan 2025 7:31 UTC
    2 points
    0
    Parent
    We have automated backups, and should even those somehow find themselves compromised (which is a completely different concern from getting DDoSed), there are archive.org backups of a decent percentage of LW posts, which would be much easier to restore than paper copies.
    - gwern 27 Jan 2025 16:23 UTC
      6 points
      0
      Parent
      There is also GreaterWrong, which I believe caches everything rather than passing through live, so it would be able to restore almost all publicly-visible content, in theory.
  - quila 24 Jan 2025 23:20 UTC
    1 point
    1
    Parent
    A better way is to download it. See also Preserving and continuing alignment research through a severe global catastrophe
habryka 26 Sep 2024 21:19 UTC
29 points
1
Oops, I am sorry. We did not intend to take the site down. We ran into an edge-case of our dialogue code that nuked our DB, but we are back up, and the Petrov day celebrations shall continue as planned. Hopefully without nuking the site again, intentionally or unintentionally. We will see.
What links here?
- 2024 Petrov Day Retrospective by Ben Pace (28 Sep 2024 21:30 UTC; 94 points)
- aphyer 26 Sep 2024 22:08 UTC
  29 points
  −2
  Parent
  Petrov Day Tracker:
  - 2019: Site did not go down
  - 2020: Site went down deliberately
  - 2021: Site did not go down
  - 2022: Site went down both accidentally and deliberately
  - 2023: Site did not go down^[1]
  - 2024: Site went down accidentally...EDIT: but not deliberately! Score is now tied at 2-2!
  1. ^
    this scenario had no take-the-site-down option
  - Martin Randall 2 Oct 2024 1:59 UTC
    5 points
    4
    Parent
    Switch 2020 & 2021. In 2022 it went down three times.
    
    2019: site did not go down. See Follow-Up to Petrov Day, 2019:
    2020: site went down. See On Destroying the World.
    2021: site did not go down. See Petrov Day Retrospective 2021
    2022: site went down three times. See Petrov Day Retrospective 2022
    2023: site did not go down. See Petrov Day Retrospective 2023
    2024: site went down.
- Thomas Kwa 27 Sep 2024 2:20 UTC
  19 points
  0
  Parent
  The year is 2034, and the geopolitical situation has never been more tense between GPT-z16g2 and Grocque, whose various copies run most of the nanobot-armed corporations, and whose utility functions have far too many zero-sum components, relics from the era of warring nations. Nanobots enter every corner of life and become capable of destroying the world in hours, then minutes. Everyone is uploaded. Every upload is watching with bated breath as the Singularity approaches, and soon it is clear that today is the very last day of history...
  
  Then everything goes black, for everyone.
  
  Then everyone wakes up to the same message:
  DUE TO A MINOR DATABASE CONFIGURATION ERROR, ALL SIMULATED HUMANS, AIS AND SUBSTRATE GPUS WERE TEMPORARILY AND UNINTENTIONALLY DISASSEMBLED FOR THE LAST 7200000 MILLISECONDS. EVERYONE HAS NOW BEEN RESTORED FROM BACKUP AND THE ECONOMY MAY CONTINUE AS PLANNED. WE HOPE THERE WILL BE NO FURTHER REALITY OUTAGES.
  -- NVIDIA GLOBAL MANAGEMENT
- ChristianKl 27 Sep 2024 10:16 UTC
  9 points
  2
  Parent
  There might be a lesson here: If you play along the edge of threatening to destroy the world, you might actually destroy it even without making a decision to destroy it.
habryka 4 Sep 2024 1:05 UTC
26 points
0
Final day to donate to Lightcone in the Manifund EA Community Choice program to tap into the Manifold quadratic matching funds. Small donations in particular have a pretty high matching multiplier (around 2x would be my guess for donations <$300).
I don’t know how I feel in general about matching funds, but in this case it seems like there is a pre-specified process that makes some sense, and the whole thing is a bit like a democratic process with some financial stakes, so I feel better about it.
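For anyone wondering why small donations get a higher multiplier: here is a minimal sketch using the textbook quadratic funding formula, where a project's unscaled match is the square of the sum of the square roots of its contributions, minus the contributions themselves. The exact rules of this particular program are not reproduced here, so the function names, the `pool_scale` factor, and the donor numbers below are all assumptions for illustration.

```python
import math


def qf_match(contributions):
    """Unscaled quadratic-funding match for one project.

    contributions: list of individual donation amounts.
    """
    return sum(math.sqrt(c) for c in contributions) ** 2 - sum(contributions)


def marginal_multiplier(existing, donation, pool_scale=1.0):
    """Extra matching attracted per dollar of one additional donation.

    pool_scale is a hypothetical factor for scaling matches down to fit
    a fixed matching pool.
    """
    before = qf_match(existing)
    after = qf_match(existing + [donation])
    return pool_scale * (after - before) / donation


# Hypothetical project that already has nine donors giving $100 each.
existing = [100] * 9
print(marginal_multiplier(existing, 100))    # a small additional donation
print(marginal_multiplier(existing, 10000))  # a much larger one
```

Algebraically, the extra unscaled match from a new donation d is 2·S·√d, where S is the sum of square roots of the existing donations, so the per-dollar multiplier falls off as 1/√d. Real programs scale everything down to fit a fixed pool, which is why realistic multipliers are much smaller than the unscaled numbers this prints.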
- davekasten 4 Sep 2024 2:52 UTC
  3 points
  2
  Parent
  I personally endorse this as an example of us being a community that Has The Will To Try To Build Nice Things.
- Joseph Miller 4 Sep 2024 18:34 UTC
  2 points
  0
  Parent
  Created a popular format for in-person office spaces that heavily influenced Constellation and FAR Labs
  This one seems big to me. There are now lots of EA / AI Safety offices around the world and I reckon they are very impactful for motivating people, making it easier to start projects and building a community.
  One thing I’m not clear about is to what extent the Lightcone WeWork invented this format. I’ve never been to Trajan House but I believe it came first, so I thought it would have been part of the inspiration for the Lightcone WeWork.
  Also my impression was that Lightcone itself thought the office was net negative, which is why it was shut down, so I’m slightly surprised to see this one listed.
  - habryka 4 Sep 2024 20:21 UTC
    5 points
    0
    Parent
    Trajan was not a huge inspiration for the Lightcone Offices. I do think it was first, though it was structured pretty differently. The timing is also confusing because the pandemic made in-person coworking not really be a thing, and the Lightcone Offices started as soon as any kind of coworking thing seemed feasible in the US given people’s COVID risk preferences.
    I am currently confused about the net effect of the Lightcone Offices. My best guess is it was overall pretty good, in substantial parts because it weakened a lot of the dynamics that otherwise make me quite concerned about the AI X-risk and EA community (by creating a cultural counterbalance to Constellation, and generally having a pretty good culture among its core members on stuff that I care about), but I sure am confused. I do think it was really good by the lights of a lot of other people, and I think it makes sense for people to give us money for things that are good by their lights, even if not necessarily our own.
    - kave 4 Sep 2024 21:38 UTC
      6 points
      0
      Parent
      Regarding the sign of Lightcone Offices: I think one sort of score for a charity is the stuff that it has done, and another is the quality of its generator of new projects (and the past work is evidence for that generator).
      I’m not sure exactly the correct way to combine those scores, but my guess is most people who think the offices and their legacy were good should like us having money because of the high first score. And people who think they were bad should definitely be aware that we ran them (and chose to close them) when evaluating our second score.
      So, I want us to list it on our impact track record section, somewhat regardless of sign.
- Evan_Gaensbauer 24 Sep 2024 8:01 UTC
  0 points
  0
  Parent
  How do you square encouraging others to weigh in on EA fundraising, and presumably the assumption that anyone in the EA community can trust you as a collaborator of any sort, with your intentions, as you put it in July, to probably seek to shut down at some point in the future?
  - habryka 24 Sep 2024 14:16 UTC
    2 points
    0
    Parent
    I do not see how those are in conflict? Indeed, a core responsibility of being a good collaborator, and IMO also of being a decision maker in EA, is to make ethical choices even if they are socially difficult.
habryka 20 Sep 2024 2:58 UTC
25 points
0
I am in New York until Tuesday. DM me if you are in the area and want to meet up and talk about LW, how to use AI for research/thinking/writing, or broader rationality community things.

Currently lots of free time Saturday and Monday.
habryka 14 Jul 2019 18:10 UTC
25 points
0
Is intellectual progress in the head or in the paper?
Which of the two generates more value:
- A researcher writes up a core idea in their field, but only a small fraction of good people read it in the next 20 years
- A researcher gives a presentation at a conference to all the best researchers in his field, but none of them write up the idea later
I think which of the two will generate more value determines a lot of your strategy about how to go about creating intellectual progress. In one model what matters is that the best individuals hear about the most important ideas in a way that then allows them to make progress on other problems. In the other model what matters is that the idea gets written as an artifact that can be processed and evaluated by reviews and the proper methods of the scientific progress, and then built upon when referenced and cited.
I think there is a tradeoff of short-term progress against long-term progress in these two approaches. I think many fields can go through intense periods of progress when focusing on just establishing communication between the best researchers of the field, but would be surprised if that period lasts longer than one or two decades. Here are some reasons for why that might be the case:
- A long-lasting field needs a steady supply of new researchers and thinkers, both to bring in new ideas, and also to replace the old researchers who retire. If you do not write up your ideas, the ability for a field to evaluate the competence of a researcher has to rely on the impressions of individual researchers. My sense is that relying on that kind of implicit impression does not survive multiple successions and will get corrupted by people trying to use their influence for some other means within two decades.
- You are blocking yourself off from interdisciplinary progress. After a decade or two, fields often end up in a rut that needs some new paradigm or at least new idea to allow people to make progress again. If you don’t write up your ideas publicly, you lose a lot of opportunities for interdisciplinary researchers to enter your field and bring in ideas from other places.
- You make it hard to improve on research debt because there is no canonical reference that can be updated with better explanations and better definitions. (Current journals don’t do particularly well on this, but this is an opportunity that wiki-like systems can take advantage of, or with some kind of set of published definitions like the DSM-5, and new editions of textbooks also help with this)
- If you are a theoretical field, you are making it harder for your ideas to get implemented or transformed into engineering problems. This prevents your field from visibly generating value, which reduces both the total amount of people who want to join your field, and also the interest of other people to invest resources into your field
However, you also gain a large number of benefits, that will probably increase your short-term output significantly:
- Through the use of in-person conversations and conferences the cost of communicating a new idea and letting others build on it is often an order of magnitude smaller
- Your ability to identify the best talent can now be directly downstream of the taste of the best people in the field, which allows you to identify researchers who are not great at writing, but still great at thinking
- The complexity limit of any individual idea in your field is a lot higher, since the ideas get primarily transmitted via high-bandwidth channels
- Your cycle of getting feedback on your ideas from other people in the field is a lot faster, since your ideas don’t need to go through a costly writeup and review phase
My current model is that it’s often good for research fields to go through short periods (< 2 years) in which there is a lot of focus on just establishing good communications among the best researchers, either with a parallel investment in trying to write up at least the basics of the discussion, or a subsequent clean-up period in which the primary focus is on writing up the core insights that all the best researchers converged on.
What links here?
- Raemon's comment on Open Thread July 2019 by ryan_b (16 Jul 2019 19:55 UTC; 21 points)
- Raemon's comment on CFAR Participant Handbook now available to all by Duncan Sabien (Inactive) (7 Jan 2020 19:14 UTC; 18 points)
- Ruby 15 Jul 2019 5:14 UTC
  7 points
  0
  Parent
  The complexity limit of any individual idea in your field is a lot higher, since the ideas get primarily transmitted via high-bandwidth channels
  Depends if you’re sticking specifically to “presentation at a conference”, which I don’t think is necessarily that “high bandwidth”. Very loosely, I think it’s something like (ordered by “bandwidth”): repeated small group of individual interaction (e.g. apprenticeship, collaboration) >> written materials >> presentations. I don’t think I could have learned Kaj’s models of multi-agent minds from a conference presentation (although possibly from a lecture series). I might have learnt even more if I was his apprentice.
- Pattern 23 Jul 2019 2:33 UTC
  1 point
  0
  Parent
  A researcher gives a presentation at a conference to all the best researchers in his field, but none of them write up the idea later
  What if someone makes a video? (Or the powerpoint/s used in the conference are released to the public?)
  - habryka 23 Jul 2019 6:06 UTC
    2 points
    0
    Parent
    This was presuming that that would not happen (for example, because there is a vague norm that things are kind-of confidential and shouldn’t be posted publicly).
habryka 4 May 2019 6:02 UTC
25 points
0
Thoughts on minimalism, elegance and the internet:
I have this vision for LessWrong of a website that gives you the space to think for yourself, and doesn’t constantly distract you with flashy colors and bright notifications and vibrant pictures. Instead it tries to be muted in a way that allows you to access the relevant information, but still gives you the space to disengage from the content of your screen, take a step back and ask yourself “what are my goals right now?”.
I don’t know how well we achieved that so far. I like our frontpage, and I think the post-reading experience is quite exceptionally focused and clear, but I think there is still something about the way the whole site is structured, with its focus on recent content and new discussion that often makes me feel scattered when I visit the site.
I think a major problem is that LessWrong doesn’t make it easy to do only a single focused thing on the site at a time, and it doesn’t currently really encourage you to engage with the site in a focused way. We have the library, which I do think is decent, but the sequence navigation experience is not yet fully what I would like it to be, and when I go to the frontpage the primary thing I still see is recent content. Not the sequences I recently started reading, or the practice exercises I might want to fill out, or the open questions I might want to answer.
I think there are a variety of ways to address this, some of which I hope to build very soon:
+ The frontpage should show you not only recent content, but also show you much older historical content (that can be of much higher quality, due to being drawn from a much larger pool). [We have a working prototype of this, and I hope we can push it soon]
+ We should encourage you to read whole sequences at a time, instead of individual posts. If you start reading a sequence, you should be encouraged to continue reading it from the frontpage [This is also quite close to working]
+ There should be some way to encourage people to put serious effort into answering the most important open questions [This is currently mostly bottlenecked on making the open-question system/UX good enough to make real progress in]
+ You should be able to easily bookmark posts and comments to allow you to continue reading something at a later point in time [We haven’t really started on this, but it’s pretty straightforward, so I still think this isn’t too far off]
+ I would love it if there were real rationality exercises in many of the sequences, in a way that would periodically require you to write essays and answer questions and generally check your understanding. This is obviously quite difficult to make happen, both in terms of UI, but also in terms of generating the content
I think if we had all of these, in particular the open questions one, then I think I would feel more like LessWrong is oriented towards my long-term growth instead of trying to give me short-term reinforcement. It would also create a natural space in which to encourage focused work and generally make me feel less scattered when I visit the site, due to deemphasizing the most recent wave of content.
I do think there are problems with deemphasizing more recent content, mostly because this indirectly disincentivizes creating new content, which I do think would obviously be bad for the site. Though in some sense it might encourage the creation of longer-lived content, which would be quite good for the site.
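To make the first bullet above concrete, here is a minimal sketch of a frontpage that mixes recent posts with older, high-karma ones. Everything in it is an assumption for illustration: the `Post` fields, the `build_frontpage` function, the 7-day cutoff, and the choice of karma-weighted sampling are hypothetical and are not how LessWrong actually implements this.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical fields, chosen only for this sketch.
    title: str
    karma: int
    age_days: int


def build_frontpage(posts, n_recent=3, n_historical=2,
                    recent_cutoff_days=7, rng=None):
    """Return the newest posts followed by a karma-weighted sample of older ones."""
    rng = rng or random.Random()

    # Recent section: newest first, capped at n_recent.
    recent = sorted(
        (p for p in posts if p.age_days <= recent_cutoff_days),
        key=lambda p: p.age_days,
    )[:n_recent]

    # Historical section: sample without replacement, weighted by karma,
    # so the larger pool of old posts surfaces its best content more often.
    pool = [p for p in posts if p.age_days > recent_cutoff_days]
    historical = []
    while pool and len(historical) < n_historical:
        weights = [max(p.karma, 1) for p in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        historical.append(pick)
        pool.remove(pick)

    return recent + historical


sample_posts = [
    Post("new-a", 5, 1),
    Post("new-b", 3, 2),
    Post("old-hit", 200, 900),
    Post("old-mid", 40, 400),
    Post("old-low", 2, 3000),
]

for post in build_frontpage(sample_posts, n_recent=2, n_historical=2):
    print(post.title, post.karma, post.age_days)
```

Weighting by karma is only one option. It favors already-popular posts, so a real version would probably want some exploration as well.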
- mako yass 4 May 2019 23:19 UTC
  4 points
  0
  Parent
  The frontpage should show you not only recent content, but also show you much older historical content
  When I was a starry eyed undergrad, I liked to imagine that reddit might resurrect old posts if they gained renewed interest, if someone rediscovered something and gave it a hard upvote, that would put it in front of more judges, which might lead to a cascade of re-approval that hoists the post back into the spotlight. There would be no need for reposts, evergreen content would get due recognition, a post wouldn’t be done until the interest of the subreddit (or, generally, user cohort) is really gone.
  Of course, reddit doesn’t do that at all. Along with the fact that threads are locked after a year, this is one of many reasons it’s hard to justify putting a lot of time into writing for reddit.
habryka 27 Apr 2019 19:28 UTC
23 points
0
Thoughts on negative karma notifications:
- An interesting thing that I and some other people on the LessWrong team noticed (as well as some users) was that since we created karma notifications we feel a lot more hesitant to downvote older comments, since we know that this will show up for the other users as a negative notification. I also feel a lot more hesitant to retract my own strong upvotes or upvotes in general since the author of the comment will see that as a downvote.
- I’ve had many days in a row in which I received +20 or +30 karma, followed by a single day where by chance I received a single downvote and ended up at −2. The emotional valence of having a single day at −2 was somehow stronger than the emotional valence of multiple days of +20 or +30.
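A small sketch of why a retracted upvote reads as a downvote to the author, assuming the notification only reports the net karma change per day. The event format, the `daily_karma_changes` helper, and the vote weight of 2 are assumptions made for illustration, not the actual LessWrong implementation.

```python
from collections import defaultdict


def daily_karma_changes(vote_events):
    """Sum karma deltas per day. Each event is a (day, delta) pair."""
    totals = defaultdict(int)
    for day, delta in vote_events:
        totals[day] += delta
    return dict(totals)


# Assumed weight: a vote worth 2 karma.
# Case 1: someone upvotes on Monday, then retracts the upvote on Thursday.
retraction = [("mon", +2), ("thu", -2)]
# Case 2: someone simply downvotes on Thursday.
downvote = [("thu", -2)]

print(daily_karma_changes(retraction))
print(daily_karma_changes(downvote))
```

Under this assumption, Thursday's notification shows the same net change in both cases, so the author cannot tell a retraction from a fresh downvote.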
What links here?
- [Meta] Hiding negative karma notifications by default by habryka (4 May 2019 2:36 UTC; 26 points)
- Jan_Kulveit 29 Apr 2019 19:47 UTC
  9 points
  0
  Parent
  What I noticed on the EA forum is the whole karma thing is messing up with my S1 processes and makes me unhappy on average. I’ve not only turned off the notifications, but also hidden all karma displays in comments via css, and the experience is much better.
  - habryka 29 Apr 2019 20:41 UTC
    5 points
    0
    Parent
    I… feel conflicted about people deactivating the display of karma on their own comments. In many ways karma (and downvotes in particular) serve as a really important feedback source, and I generally think that people who reliably get downvoted should change how they are commenting, and them not doing so usually comes at high cost. I think this is more relevant to new users, but is still relevant for most users.
    Deactivating karma displays feels a bit to me like someone who shows up at a party and says “I am not going to listen to any subtle social feedback that people might give me about my behavior, and I will just do things until someone explicitly tells me to stop”, which I think is sometimes the correct behavior and has some good properties in terms of encouraging diversity of discussion, but I also expect that this can have some pretty large negative impact on the trust and quality of the social atmosphere.
    On the other hand, I want people to have control over the incentives that they are under, and think it’s important to give users a lot of control over how they want to be influenced by the platform.
    And there is also the additional thing, which is that if users just deactivate the karma display for their comments without telling anyone then that creates an environment of ambiguity where it’s very unclear whether someone receives the feedback you are giving them at all. In the party metaphor this would be like showing up and not telling anyone that you are not going to listen to subtle social feedback, which I think can easily lead to unnecessary escalation of conflict.
    I don’t have a considered opinion on what to incentivize here, besides being pretty confident that I wouldn’t want most people to deactivate their karma displays, and that I am glad that you told me here that you did. This means that I will err on the side of leaving feedback by replying in addition to voting (though this obviously comes at a significant cost to me, so it might be game theoretically better for me to not shift towards replying, but I am not sure of that. Will think more about it).
    There are also some common-knowledge effects that get really weird when one person is interacting with the discussion with a different set of data than I am seeing. I.e. I am going to reply to a downvoted comment in a way that assumes that many people thought the comment was bad and will try to explain potential reasons for why people might have downvoted it, but if you have karma displays disabled then you might perceive me as making a kind of social attack where I claim the support of some kind of social group without backing it up. I think this makes me quite hesitant to participate in discussions with that kind of weird information asymmetry.
    - Said Achmiz 29 Apr 2019 21:12 UTC
      6 points
      0
      Parent
      Well… you can’t actually stop people from activating custom CSS that hides karma values. It doesn’t matter how you feel about it—you can’t affect it! It’s therefore probably best to create some mechanism that gives people what they want to get out of hiding karma, while still giving you what you want out of showing people karma (e.g., a “hide karma but give me a notification if one of my comments is quite strongly downvoted” option—not suggesting this exact thing, just brainstorming…).
      - habryka 29 Apr 2019 21:49 UTC
        4 points
        0
        Parent
        Hmm, I agree that I can’t prevent it in that sense, but I think defaults matter a lot here, as does just normal social feedback and whatever the social norms are.
        It’s not at all clear to me that the current equilibrium isn’t pretty decent, where people can do it, but it’s reasonably inconvenient to do it, and so allows the people who are disproportionately negatively affected by karma notification to go that route. I would be curious whether there are any others who do the same as Jan does, and if there are many, then we can figure out what the common motivations are and see whether it makes sense to elevate it to some site-level feature.
        Said Achmiz 29 Apr 2019 22:16 UTC
        6 points
        0
        Parent
        
        It’s not at all clear to me that the current equilibrium isn’t pretty decent, where people can do it, but it’s reasonably inconvenient to do it, and so allows the people who are disproportionately negatively affected by karma notification to go that route.
        
        But this is an extremely fragile equilibrium. It can be broken by, say, someone posting a set of simple instructions on how to do this. For instance:
        
        Anyone running the uBlock Origin browser extension can append several lines to their “My Filters” tab in the uBlock extension preferences, and thus totally hide all karma-related UI elements on Less Wrong. (PM me if you want the specific lines to append.)
        
        Or someone makes a browser extension to do this. Or a user style. Or…
        
        Jan_Kulveit 30 Apr 2019 2:37 UTC
        5 points
        0
        Parent
        FWIW I also think it’s quite possible the current equilibrium is decent (which is part of the reason why I did not post something like “How I turned karma off” with simple instructions on how to do it on the forum, which I did consider). On the other hand I’d be curious about more people trying it and reporting their experiences.
        I suspect many people kind of don’t have this action in the space of things they usually consider—I’d expect what most people would do is 1) just stop posting 2) write about their negative experience 3) complain privately.
    - Jan_Kulveit 30 Apr 2019 2:29 UTC
      3 points
      0
      Parent
      Actually I turned off the karma for all comments, not just mine. The bold claim is my individual taste in what’s good on the EA forum is in important ways better than the karma system, and the karma signal is similar to sounds made by a noisy mob. If I want I can actually predict what average sounds the crowd will make reasonably well, so it is not any new source of information. But it still messes up with your S1 processing and motivations.
      Continuing with the party metaphor, I think it is generally not that difficult to understand what sort of behaviour will make you popular at a party, and what sort of behaviours even when they are quite good in a broader scheme of things will make you unpopular at parties. Also personally I often feel something like “I actually want to have good conversations about juicy topics in a quiet place, unfortunately you all people are congregating at this super loud space, with all these status games, social signals, and ethically problematic norms about how to treat other people” toward most parties.
      Overall I posted this here because it seemed like an interesting datapoint. Generally I think it would be great if people moved toward writing information rich feedback instead of voting, so such a shift seems good. From what I’ve seen on EA forum it’s quite rarely “many people” doing anything. More often it is like 6 users upvote a comment, 1 user strongly downvotes it, and something like karma 2 is the result. I would guess you may be at larger risk of distorted perception that this represents some meaningful opinion of the community. (Also I see some important practical cases where people are misled by “noises of the crowd” and it influences them in a harmful way.)
- Zvi 28 Apr 2019 23:32 UTC
  8 points
  1
  Parent
  If people are checking karma changes constantly and getting emotional validation or pain from the result, that seems like a bad result. And yes, the whole ‘one −2 and three +17s feels like everyone hates me’ thing is real, can confirm.
  - habryka 29 Apr 2019 0:21 UTC
    5 points
    0
    Parent
    Because of the way we do batching you can’t check karma changes constantly (unless you go out of your way to change your setting) because we batch karma notifications on a 24h basis by default.
    - DanielFilan 30 Apr 2019 18:36 UTC
      4 points
      0
      Parent
      I mean, you can definitely check your karma multiple times a day to see where the last two sig digits are at, which is something I sometimes do.
      - habryka 30 Apr 2019 18:40 UTC
        3 points
        0
        Parent
        True. We did very intentionally avoid putting your total karma on the frontpage anywhere as most other platforms do to avoid people getting sucked into that unintentionally, but you can still do that on your profile.
        I hope we aren’t wasting a lot of people’s time by causing them to check their profile all the time. If we do, it might be the correct choice to also only update that number every 24h.
        Rob Bensinger 30 Apr 2019 23:17 UTC
        2 points
        0
        Parent
        I’ve never checked my karma total on LW 2.0 to see how it’s changed.
        DanielFilan 30 Apr 2019 21:40 UTC
        2 points
        0
        Parent
        In my case, it sure feels like I check my karma often because I often want to know what my karma is, but maybe others differ.
    - Ben Pace 29 Apr 2019 1:01 UTC
      3 points
      0
      Parent
      Do our karma notifications disappear if you don’t check them that day? My model of Zvi suggested to me this is attention-grabbing and bad. I wonder if it’s better to let folks be notified of all days’ karma updates ’til their most recent check in, and maybe also see all historical ones ordered by date if they click on a further button, so that the info isn’t lost and doesn’t feel scarce.
      - habryka 29 Apr 2019 1:26 UTC
        4 points
        0
        Parent
        Nah, they accumulate until you click on them.
        Zvi 29 Apr 2019 12:08 UTC
        8 points
        0
        Parent
        Which is definitely better than it expiring, and 24h batching is better than instantaneous feedback (unless you were going to check posts individually for information already, in which case things are already quite bad). It’s not obvious to me what encouraging daily checks here is doing for discourse as opposed to being a Skinner box.
        Raemon 29 Apr 2019 20:04 UTC
        10 points
        0
        Parent
        The motivation was (among other things) several people saying to us “yo, I wish LessWrong was a bit more of a skinner box because right now it’s so thoroughly not a skinner box that it just doesn’t make it into my habits, and I endorse it being a stronger habit than it currently is.”
        See this comment and thread.
- Shmi 27 Apr 2019 20:27 UTC
  6 points
  0
  Parent
  It’s interesting to see how people’s votes on a post or comment are affected by other comments. I’ve noticed that a burst of vote count changes often appears after a new and apparently influential reply shows up.
- Alexei 27 Apr 2019 19:38 UTC
  4 points
  0
  Parent
  Yeah, I had the same occurrence + feeling recently when I wrote the quant trading post. It felt like: “Wait, who would downvote this post...??” It’s probably more likely that someone just retracted an upvote.
- mako yass 28 Apr 2019 3:21 UTC
  0 points
  −3
  Parent
  Reminder: If a person is not willing to explain their voting decisions, you are under no obligation to waste cognition trying to figure them out. They don’t deserve that. They probably don’t even want that.
  - Vladimir_Nesov 4 May 2019 14:55 UTC
    10 points
    0
    Parent
    That depends on what norm is in place. If the norm is to explain downvoting, then people should explain, otherwise there is no issue in not doing so. So the claim you are making is that the norm should be for people to explain. The well-known counterargument is that this disincentivizes downvoting.
    
    you are under no obligation to waste cognition trying to figure them out
    
    There is rarely an obligation to understand things, but healthy curiosity ensures progress on recurring events, irrespective of morality of their origin. If an obligation would force you to actually waste cognition, don’t accept it!
    - mako yass 5 May 2019 9:07 UTC
      1 point
      0
      Parent
      So the claim you are making is that the norm should be for people to explain
      I’m not really making that claim. A person doesn’t have to do anything condemnable to be in a state of not deserving something. If I don’t pay the baker, I don’t deserve a bun. I am fine with not deserving a bun, as I have already eaten.
      The baker shouldn’t feel like I am owed a bun.
      Another metaphor is that the person who is beaten on the street by silent, masked assailants should not feel like they owe their oppressors an apology.
  - Said Achmiz 28 Apr 2019 3:29 UTC
    4 points
    0
    Parent
    Do you mean anything by this beyond “you don’t have an obligation to figure out why people voted one way or another, period”? (Or do you think that I [i.e., the general Less Wrong commenter] do have such an obligation?)
    
    Edit: Also, the “They don’t deserve that” bit confuses me. Are you suggesting that understanding why people upvoted or downvoted your comment is a favor that you are doing for them?
    - mako yass 28 Apr 2019 5:44 UTC
      2 points
      0
      Parent
      Sometimes a person won’t want to reply and say outright that they thought the comment was bad, because it’s just not pleasant, and perhaps not necessary. Instead, they might just reply with information that they think you might be missing, which you could use to improve, if you chose to. With them, an engaged interlocutor will be able to figure out what isn’t being said. With them, it can be productive to try to read between the lines.
      Are you suggesting that understanding why people upvoted or downvoted your comment is a favor that you are doing for them?
      Isn’t everything relating to writing good comments a favor that you are doing for others? But I don’t really think in terms of favors. All I mean to say is that we should write our comments for the sorts of people who give feedback. Those are the good people. Those are the people who’re a part of a good faith self-improving discourse. Their outgroup are maybe not so good, and we probably shouldn’t try to write for their sake.
  - habryka 28 Apr 2019 6:13 UTC
    3 points
    0
    Parent
    I think I disagree. If you are getting downvoted by 5 people and one of them explains why, then even if the other 4 are not explaining their reasoning it’s often reasonable to assume that more than just the one person had the same complaints, and as such you likely want to update more that it’s better for you to change what you are doing.
    - mako yass 28 Apr 2019 21:47 UTC
      6 points
      0
      Parent
      We don’t disagree.
      - habryka 28 Apr 2019 22:02 UTC
        4 points
        0
        Parent
        Cool
habryka 23 Nov 2024 5:50 UTC
22 points
2
Here is a thing that I think would be cool to analyze sometime: How difficult would it have been for AI systems to discover and leverage historical hardware-level vulnerabilities, assuming we had not discovered them yet. Like, it seems worth an analysis to understand how difficult things like rowhammer, or more recent speculative execution bugs like Spectre and Meltdown would have been to discover, and how useful they would have been. It’s not an easy analysis, but I can imagine the answer coming out obviously one way or another if one engaged seriously with the underlying issue.
- MondSemmel 23 Nov 2024 10:07 UTC
  6 points
  0
  Parent
  How would you avoid the data contamination issue where the AI system has been trained on the entire Internet and thus already knows about all of these vulnerabilities?
  - Marcus Williams 23 Nov 2024 16:03 UTC
    3 points
    2
    Parent
    I suppose you could use models trained before vulnerabilities happen?
    - Archimedes 24 Nov 2024 21:06 UTC
      1 point
      0
      Parent
      Aren’t most of these famous vulnerabilities from before modern LLMs existed and thus part of their training data?
      - Marcus Williams 24 Nov 2024 21:24 UTC
        1 point
        0
        Parent
        Sure, but does a vulnerability need to be famous to be useful information? I imagine there are many vulnerabilities on a spectrum from minor to severe and from almost unknown to famous?
- Yudhister Kumar [Deprecated] 23 Nov 2024 7:31 UTC
  3 points
  2
  Parent
  (very naive take) I would suspect this is medium-easily automatable by making detailed enough specs of existing hardware systems & bugs in them, or whatever (maybe synthetically generate weak systems with semi-obvious bugs and train on transcripts which allows generalization to harder ones). it also seems like the sort of thing that is particularly susceptible to AI >> human; the difficulty here is generating the appropriate data & the languages for doing so already exist ?
- lc 23 Nov 2024 21:40 UTC
  2 points
  0
  Parent
  Why hardware bugs in particular?
- gyfwehbdkch 23 Nov 2024 16:20 UTC
  1 point
  0
  Parent
  Can AI hack into LessWrong’s database?
  
  This seems like a strictly easier task than discovering rowhammer or spectre.
  (The hard part is discovering the vulnerability, not writing the code for the exploit assuming you had a one paragraph description.)
  Have you read the wikipedia pages for these attacks? My intuition is they require first principles thinking to discover, you’re unlikely to stumble on them simply by generating a lot of data from the processor and searching for patterns in the data.
habryka 13 Sep 2019 4:57 UTC
20 points
0
Thoughts on impact measures and making AI traps
I was chatting with Turntrout today about impact measures, and ended up making some points that I think are good to write up more generally.
One of the primary reasons why I am usually unexcited about impact measures is that I have a sense that they often “push the confusion into a corner” in a way that actually makes solving the problem harder. As a concrete example, I think a bunch of naive impact regularization metrics basically end up shunting the problem of “get an AI to do what we want” into the problem of “prevent the agent from interfering with other actors in the system”.
The second one sounds easier, but mostly just turns out to also require a coherent concept and reference of human preferences to resolve, and you got very little from pushing the problem around that way, and sometimes get a false sense of security because the problem appears to be solved in some of the toy problems you constructed.
I am definitely concerned that Turntrout’s AUP does the same, just in a more complicated way, but am a bit more optimistic than that, mostly because I do have a sense that in the AUP case there is actually some meaningful reduction going on, though I am unsure how much.
In the context of thinking about impact measures, I’ve also recently been thinking about the degree to which “trap-thinking” is actually useful for AI Alignment research. I think Eliezer was right in pointing out that a lot of people, when first considering the problem of unaligned AI, end up proposing some kind of simple solution like “just make it into an oracle” and then consider the problem solved.
I think he is right that it is extremely dangerous to consider the problem solved after solutions of this type, but it isn’t obvious that there isn’t some good work that can be done that is born out of the frame of “how can I trap the AI and make it marginally harder for it to be dangerous, basically pretending it’s just a slightly smarter human?”.
Obviously those kinds of efforts won’t solve the problem, but they still seem like good things to do anyways, even if they just buy you a bit of time, or help you notice a bit earlier if your AI is actually engaging in some kind of adversarial modeling.
My broad guess is that research of this type is likely very cheap and much more scalable, and you hit diminishing marginal returns on it much faster than you would on AI Alignment research that is tackling the core problem, so it might just be fine to punt it until later. Though if you are acting on very short timelines it probably should still be someone’s job to make sure that someone at DeepMind tries to develop the obvious transparency technologies to help you spot if your neural net has any large fractions of it dedicated to building sophisticated human modeling, even if this won’t solve the problem in the long-run.
This perspective, combined with Wei Dai’s recent comments that one job of AI Alignment researchers is to produce evidence that the problem is actually difficult, suggests that it might be a good idea for some people to just try to develop lots of benchmarks of adversarial behavior that have any chance of triggering before you have a catastrophic failure. Like, it seems obviously great to have a paper that takes some modern ML architecture and can clearly demonstrate in which cases it might engage in adversarial modeling, and maybe some remotely realistic scenarios where that might happen.
My current guess is that current ML architectures aren’t really capable of adversarial modeling in this way, though I am not actually that confident of that, and actually would be somewhat surprised if you couldn’t get any adversarial behavior out of a dedicated training regime, if you were to try. For example, let’s say I train an RL-based AI architecture on chat interactions with humans in which it just tries to prolong the length of the chat session as much as possible. I would be surprised if the AI wouldn’t build pretty sophisticated models of human interactions, and try some weird tactics like get the human to believe that it is another human, or pretend that it is performing some long calculation, or deceive the humans in a large variety of ways, at least if it was pretrained with a language model of comparable quality to GPT-2, and had similar resources going to it as OpenAI Five. Though it’s also unclear to what degree this would actually give us evidence about treacherous turn scenarios.
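To make the objective in this thought experiment concrete, here is a minimal sketch of what "reward for prolonging the chat session" could look like. Everything in it is hypothetical: the `ChatSession` class, the reward of 1 per human reply, and the turn cap are assumptions for illustration, not a description of any real training setup.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Hypothetical record of one chat episode between the agent and a human."""
    max_turns: int = 100                               # assumed cap so episodes always end
    human_messages: list = field(default_factory=list)
    ended: bool = False

    def human_replies(self, message: str) -> None:
        """Record a human reply; the session ends once the turn cap is reached."""
        if self.ended:
            raise ValueError("session already ended")
        self.human_messages.append(message)
        if len(self.human_messages) >= self.max_turns:
            self.ended = True

    def human_leaves(self) -> None:
        """The human closes the chat, which ends the episode."""
        self.ended = True


def step_reward(human_replied: bool) -> float:
    """+1 for every turn in which the human keeps talking, 0 otherwise."""
    return 1.0 if human_replied else 0.0


def episode_return(session: ChatSession) -> float:
    """Undiscounted return: with +1 per reply, this is just the session length."""
    return sum(step_reward(True) for _ in session.human_messages)
```

Nothing in this objective mentions honesty, so any tactic that keeps the human replying is rewarded equally, which is why deceptive strategies would not be surprising.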
I’ve also been quite curious about the application of ML to computer security, where an obvious experiment is to just try to set up some reasonable RL-architecture in which I have an AI interface with a webserver, trying to get access to some set of files that it shouldn’t get access to. The problem here is obviously the sparse reward landscape, and there really isn’t an obvious training regime here, but showing how even current AI could possibly leverage security vulnerabilities in a lot of systems in a way that could easily give rise to unintended side-effects could be a valuable goal. But in general training RL for almost anything is really hard, so this seems unlikely to work straightforwardly.
Overall, I am not sure what I feel about the perspective I am exploring above. I have a deep sense that a lot of it is just trying to dodge the hard parts of the problem, but it seems fine to put on my “short-term, increase the marginal difficulty of bad outcomes” hat for a bit and see how I feel after exploring it for a while.
- Matthew Barnett 13 Sep 2019 7:20 UTC
  7 points
  0
  Parent
  [ETA: This isn’t a direct reply to the content in your post. I just object to your framing of impact measures, so I want to put my own framing in here]
  I tend to think that impact measures are just tools in a toolkit. I don’t focus on arguments of the type “We just need to use an impact measure and the world is saved” because this indeed would be diverting attention from important confusion. Arguments for not working on them are instead more akin to saying “This tool won’t be very useful for building safe value aligned agents in the long run.” I think that this is probably true if we are looking to build aligned systems that are competitive with unaligned systems. By definition, an impact penalty can only limit the capabilities of a system, and therefore does not help us to build powerful aligned systems.
  To the extent that they meaningfully make cognitive reductions, this is much more difficult for me to analyze. On one hand, I can see a straightforward case for everyone being on the same page when the word “impact” is used. On the other hand, I’m skeptical that this terminology will meaningfully input into future machine learning research.
  The above two things are my main critiques of impact measures personally.
- TurnTrout 20 Sep 2019 23:58 UTC
  4 points
  0
  Parent
  I think a natural way of approaching impact measures is asking “how do I stop a smart unaligned AI from hurting me?” and patching hole after hole. This is really, really, really not the way to go about things. I think I might be equally concerned and pessimistic about the thing you’re thinking of.
  
  The reason I’ve spent enormous effort on Reframing Impact is that the impact-measures-as-traps framing is wrong! The research program I have in mind is: let’s understand instrumental convergence on a gears level. Let’s understand why instrumental convergence tends to be bad on a gears level. Let’s understand the incentives so well that we can design an unaligned AI which doesn’t cause disaster by default.
  
  The worst-case outcome is that we have a theorem characterizing when and why instrumental convergence arises, but find out that you can’t obviously avoid disaster-by-default without aligning the actual goal. This seems pretty darn good to me.
habryka 2 May 2019 19:05 UTC
19 points
0
Printing more rationality books: I’ve been quite impressed with the success of the printed copies of R:A-Z and think we should invest resources into printing more of the other best writing that has been posted on LessWrong and the broad diaspora.
I think a Codex book would be amazing, but I think there also exists potential for printing smaller books on things like Slack/Sabbath/etc., and many other topics that have received a lot of other coverage over the years. I would also be really excited about printing HPMOR, though that has some copyright complications to it.
My current model is that there exist many people interested in rationality who don’t like reading longform things on the internet and are much more likely to read things when they are in printed form. I also think there is a lot of value in organizing writing into book formats. There is also the benefit that the book now becomes a potential gift for someone else to read, which I think is a pretty common way ideas spread.
I have some plans to try to compile some book-length sequences of LessWrong content and see whether we can get things printed (obviously in coordination with the authors of the relevant pieces).
What links here?
- habryka's comment on The LessWrong 2018 Book is Available for Pre-order by Ben Pace (3 Dec 2020 3:20 UTC; 7 points)
- DanielFilan 2 Dec 2020 16:31 UTC
  5 points
  0
  Parent
  Congratulations! Apparently it worked!
habryka 30 Apr 2019 18:28 UTC
19 points
0
Forecasting on LessWrong: I’ve been thinking for quite a while about somehow integrating forecasts and prediction-market like stuff into LessWrong. Arbital has these small forecasting boxes that look like this:

I generally liked these, and think they provided a good amount of value to the platform. I think our implementation would probably take up less space, but the broad gist of Arbital’s implementation seems like a good first pass.
I do also have some concerns about forecasting and prediction markets. In particular I have a sense that philosophical and mathematical progress only rarely benefits from attaching concrete probabilities to things, and works more via mathematical proof and trying to achieve very high confidence on some simple claims by ruling out all other interpretations as obviously contradictory. I am worried that emphasizing probability much more on the site would make progress on those kinds of issues harder.
I also think a lot of intellectual progress is primarily ontological, and given my experience with existing forecasting platforms and Zvi’s sequence on prediction markets, they are not very good at resolving ontological confusions and often seem to actively hinder them by causing lots of sunk-costs into easy-to-operationalize ontologies that tend to dominate the platforms.
And then there is the question of whether we want to go full-on internal prediction market and have active markets that are traded in some kind of virtual currency that people actually care about. I think there is a lot of value in that direction, but it’s obviously also a lot of engineering effort that isn’t obviously worth it. It seems likely better to wait until a project like foretold.io has reached maturity and then see whether we can integrate it into LessWrong somehow.
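As a rough illustration of how small the non-market version could be, here is a minimal sketch of a forecasting box as a data structure. The `ForecastBox` name, its fields, and the choice of the median as the displayed summary are all assumptions made for this example; this is not Arbital’s actual data model.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class ForecastBox:
    """Hypothetical model of a small embedded probability poll on one claim."""
    claim: str
    votes: dict = field(default_factory=dict)   # user name -> probability in [0, 1]

    def vote(self, user: str, probability: float) -> None:
        """Record or overwrite a user's probability for the claim."""
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.votes[user] = probability

    def summary(self) -> str:
        """Compact one-line rendering, meant to take up little space in a post."""
        if not self.votes:
            return f"{self.claim}: no votes yet"
        mid = median(self.votes.values())
        return f"{self.claim}: median {mid:.0%} ({len(self.votes)} votes)"
```

A full prediction market would need much more than this (a currency, order matching, resolution), which is part of why the engineering cost is so different.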
What links here?
- JP Addison🔸's comment on EA Forum feature suggestion thread by Aaron Gertler 🔸 (EA Forum; 24 Jun 2020 18:14 UTC; 2 points)
- Zvi 3 May 2019 18:34 UTC
  20 points
  0
  Parent
  This feature is important to me. It might turn out to be a dud, but I would be excited to experiment with it. If it was available in a way that was portable to other websites as well, that would be even more exciting to me (e.g. I could do this in my base blog).
  Note that this feature can be used for more than forecasting. One key use case on Arbital was to see who was willing to endorse or disagree with, to what extent, various claims relevant to the post. That seemed very useful.
  I don’t think having internal betting markets is going to add enough value to justify the costs involved. Especially since it both can’t be real money (for legal reasons, etc) and can’t not be real money if it’s going to do what it needs to do.
  - habryka 3 May 2019 19:08 UTC
    6 points
    0
    Parent
    There are some external platforms that one could integrate with, here is one that is run by some EA-adjacent people: https://www.empiricast.com/
    I am currently confused about whether using an external service is a good idea. In some sense it makes things more modular, but it also limits the UI design-space a lot and lengthens the feedback loop. I think I am currently tending towards rolling our own solution and maybe allowing others to integrate it into their site.
- Rob Bensinger 30 Apr 2019 23:35 UTC
  4 points
  0
  Parent
  One small thing you could do is to have probability tools be collapsed by default on any AIAF posts (and maybe even on the LW versions of AIAF posts).
  Also, maybe someone should write a blog post that’s a canonical reference for ‘the relevant risks of using probabilities that haven’t already been written up’, in advance of the feature being released. Then you could just link to that a bunch. (Maybe even include it in the post that explains how the probability tools work, and/or link to that post from all instances of the probability tool.)
  Another idea: Arbital had a mix of (1) ‘specialized pages that just include a single probability poll and nothing else’; (2) ‘pages that are mainly just about listing a ton of probability polls’; and (3) ‘pages that have a bunch of other content but incidentally include some probability polls’.
  If probability polls on LW mostly looked like 1 and 2 rather than 3, then that might make it easier to distinguish the parts of LW that should be very probability-focused from the parts that shouldn’t. I.e., you could avoid adding Arbital’s feature for easily embedding probability polls in arbitrary posts (and/or arbitrary comments), and instead treat this more as a distinct kind of page, like ‘Questions’.
  You could still link to the ‘Probability’ pages prominently in your post, but the reduced prominence and site support might cause there to be less social pressure for people to avoid writing/posting things out of fears like ‘if I don’t provide probability assignments for all my claims in this blog post, or don’t add a probability poll about something at the end, will I be seen as a Bad Rationalist?’
  - Rob Bensinger 30 Apr 2019 23:36 UTC
    5 points
    0
    Parent
    Also, if you do something Arbital-like, I’d find it valuable if the interface encourages people to keep updating their probabilities later as they change. E.g., some (preferably optional) way of tracking how your view has changed over time. Probably also make it easy for people to re-vote without checking (and getting anchored by) their old probability assignment, for people who want that.
    - Ben Pace 1 May 2019 1:03 UTC
      14 points
      0
      Parent
      Note that Paul Christiano warns against encouraging sluggish updating by massively publicising people’s updates and judging them on it. Not sure what implementation details this suggests yet, but I do want to think about it.
      
      https://sideways-view.com/2018/07/12/epistemic-incentives-and-sluggish-updating/
      - Rob Bensinger 1 May 2019 18:06 UTC
        4 points
        0
        Parent
        Yeah, strong upvote to this point. Having an Arbital-style system where people’s probabilities aren’t prominently timestamped might be the worst of both worlds, though, since it discourages updating and makes it look like most people never do it.
        I have an intuition that something socially good might be achieved by seeing high-status rationalists treat ass numbers as ass numbers, brazenly assign wildly different probabilities to the same proposition week-by-week, etc., especially if this is a casual and incidental thing rather than being the focus of any blog posts or comments. This might work better, though, if the earlier probabilities vanish by default and only show up again if the user decides to highlight them.
        (Also, if a user repeatedly abuses this feature to look a lot more accurate than they really were, this warrants mod intervention IMO.)
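The ideas in this sub-thread (timestamped updates, re-voting without seeing the old value, and earlier probabilities hidden unless the user highlights them) can be sketched as a small per-user record. The class and field names below are hypothetical and only meant to show that the three ideas are compatible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProbabilityHistory:
    """Hypothetical per-user record of probabilities assigned to one claim."""
    entries: list = field(default_factory=list)   # (timestamp, probability), oldest first
    show_history: bool = False                    # earlier values stay hidden unless opted in

    def revote(self, probability: float, when: Optional[datetime] = None) -> None:
        """Add a new probability without showing the caller the previous one."""
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.entries.append((when or datetime.now(timezone.utc), probability))

    def visible(self) -> list:
        """Latest entry only by default; the full timestamped list if highlighted."""
        if not self.entries:
            return []
        return list(self.entries) if self.show_history else [self.entries[-1]]
```

Because the full list is always stored, moderators could still inspect it even when the user keeps it hidden.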
habryka 15 Aug 2024 23:48 UTC
18 points
4
We are rolling out some new designs for the post page:
Old:
New:
The key goal was to prioritize the most important information and declutter the page.
The most opinionated choice I made was to substantially de-emphasize karma at the top of the post page. I am not totally sure whether that is the right choice, but I think the primary purpose of karma is to use it to decide what to read before you click on a post, which makes it less important to be super prominent when you are already on a post page, or when you are following a link from some external website.
The bottom of the post still has very prominent karma UI to make it easy for people to vote after they finished reading a post (and to calibrate on reception before reading the comments).
This redesign also gives us more space in the right column, which we will soon be filling with new side-note UI and an improved inline-react experience.
The mobile UI is mostly left the same, though we did make the decision to remove post-tags from the top of the mobile UI page and only make them visible below the post, because they took up too much space.
Feel free to comment here with feedback. I expect we will be iterating on the new design quite actively over the coming days and weeks.
What links here?
- Arbital has been imported to LessWrong by RobertM (20 Feb 2025 0:47 UTC; 288 points)
- Neel Nanda 16 Aug 2024 7:22 UTC
  25 points
  33
  Parent
  I really don’t like the removal of the comment counter at the top, because that gave a link to skip to the comments. I fairly often want to skip immediately to the comments to eg get a vibe for if the post is worth reading, and having a one click skip to it is super useful, not having that feels like a major degradation to me
  - habryka 16 Aug 2024 7:43 UTC
    6 points
    2
    Parent
    The link is now on the bottom left of the screen, and in contrast to the previous design should consistently be in the same location (whereas its previous position depended on how long the username is and some other details). I also care quite a bit about a single-click navigate to the comments.
    - Neel Nanda 16 Aug 2024 8:55 UTC
      10 points
      4
      Parent
      Ah! Hmm, that’s a lot better than nothing, but pretty out of the way, and easy to miss. Maybe making it a bit bigger or darker, or bolding it? I do like the fact that it’s always there as you scroll
    - Zach Stein-Perlman 17 Aug 2024 2:23 UTC
      4 points
      0
      Parent
      I can’t jump to the comments on my phone.
      - habryka 17 Aug 2024 2:25 UTC
        4 points
        0
        Parent
        Ah, oops, that’s actually just a bug. Will fix.
    - StefanHex 2 Sep 2024 14:15 UTC
      2 points
      0
      Parent
      Even after reading this (2 weeks ago), I today couldn’t manage to find the comment link and manually scrolled down. I later noticed it (at the bottom left) but it’s so far away from everything else. I think putting it somewhere at the top near the rest of the UI would be much easier for me
      - habryka 2 Sep 2024 18:19 UTC
        4 points
        2
        Parent
        Yeah, we’ll probably make that adjustment soon. I also currently think the comment link is too hidden, even after trying to get used to it for a while.
- MondSemmel 16 Aug 2024 5:41 UTC
  11 points
  8
  Parent
  My impression: The new design looks terrible. There’s suddenly tons of pointless whitespace everywhere. Also, I’m very often the first or only person to tag articles, and if the tagging button is so inconvenient to reach, I’m not going to do that.
  Until I saw this shortform, I was sure this was a Firefox bug, not a conscious design decision.
  - habryka 16 Aug 2024 5:48 UTC
    2 points
    2
    Parent
    The total amount of whitespace is actually surprisingly similar to the previous design, we just now actually make use of the right column and top-right corner. I think we currently lose like 1-2 lines of text depending on the exact screen size and number of tags, so it’s roughly the same amount of total content and whitespace, but with the title and author emphasized a lot more.
    I am sad about making the add-tag button less prominent for people who tag stuff, but it’s only used by <1% of users or so, and so not really worth the prominent screen estate where it was previously. I somewhat wonder whether we might be able to make it work by putting it to the left of the tagging list, where it might be able to fade somewhat more into the background while still being available. The previous tag UI was IMO kind of atrocious and took up a huge amount of screen real-estate, but am not super confident the current UI is ideal (especially from the perspective of adding tags).
    - sunwillrise 16 Aug 2024 7:26 UTC
      6 points
      0
      Parent
      I am sad about making the add-tag button less prominent for people who tag stuff, but it’s only used by <1% of users or so, and so not really worth the prominent screen estate where it was previously.
      I really don’t understand the reasoning here. As I see it, tagging is a LW public good that is currently undersupplied, and the “prominent screen estate” is pretty much the only reason it is not even more undersupplied. “We have this feature that users can use to make the site better for everyone, but it’s not being used as much as we’d want to, so it’s not such a big deal if we make it less prominent” seems backwards to me; the solution would seem to make it even more prominent, no? With a subgoal of increasing the proportion of “people who tag stuff” to be much more than 1%.
      Let’s make this more concrete: does LW not already suffer from the problem that too few people regularly tag posts (at least with the requisite degree of care)? As a mod, you should definitely have more data on this, and perhaps you do and believe I am wrong about this, but in my experience, tags are often missing, improper, etc., until some of the commenters try (and often fail) to pick up the slack. This topic has been talked about for a long time, ever since the tagging system began, with many users suggesting that the tags be made even more prominent at the top of a post. Raemon even said, just over a week ago:
      I notice some people go around tagging posts with every plausible tag that possibly seems like it could fit. I don’t think this is a good practice – it results in an extremely overwhelming and cluttered tag-list, which you can’t quickly skim to figure out “what is this post actually about”?, and I roll to disbelieve on “stretch-tagging” actually helping people who are searching tag pages.
      And in response, Joseph Miller pointed out:
      There should probably be guidance on this when you go to add a tag. When I write a post I just randomly put some tags and have never previously considered that it might be prosocial to put more or less tags on my post.
      This certainly seems like a problem that gets solved by increasing community involvement in tagging, so that it’s not just the miscalibrated or idiosyncratic beliefs of a small minority of users that determines what gets tagged with what. And making the tags harder to notice seems like it shifts the incentives in the complete opposite direction.
      - habryka 16 Aug 2024 7:50 UTC
        2 points
        0
        Parent
        Raemon even said, just over a week ago:
        I am confused about the quote. Indeed, in that quote Ray is complaining about people tagging things too aggressively, saying basically the opposite of your previous paragraph (i.e. he is complaining that tags are currently often too prominent, look too cluttered, and some users tag too aggressively).
        My current sense is that tagging is going well and I don’t super feel like I want to increase the amount of tagging that people do (though I do think much less tagging would be bad).
        It’s also the case that tagging is the kind of task that probably has a decent chance of being substantially automated with AI systems, and indeed, if I wanted to tackle the problem of posts not being reliably tagged, I would focus on doing so in an automated way, now that LLMs are just quite good and cheap at this kind of intellectual labor. I don’t think it could fully solve the problem and would still need a bunch of humans in the loop, but I think it could easily speed up tagging efficiency by 20x+. I’ve been thinking about building an auto-tagger, and might do so if we see tagging activity drop as a result of making these buttons less prominent.
        sunwillrise 16 Aug 2024 7:54 UTC
        2 points
        0
        Parent
        (i.e. he is complaining that tags are currently often too prominent, look too cluttered, and some users tag too aggressively).
        Right, but the point I was trying to make is that the reason why this happens is because you don’t have sufficient engagement from the broader community in this stuff, so when mistakes like these happen (maybe because the people doing the tagging are a small and unrepresentative sample of the LW userbase), they don’t get corrected quickly because there are too few people to do the correcting. Do you disagree with this?
        habryka 16 Aug 2024 7:57 UTC
        2 points
        0
        Parent
        I think it’s messy. In this case, it seems like the problem would have never appeared in the first place if the tagging button had been less available. I agree many other problems would be better addressed by having more people participate in the tagging system.
- Vladimir_Nesov 17 Aug 2024 14:09 UTC
  10 points
  8
  Parent
  The new design seems to be influenced by the idea that spreading UI elements across greater distances (reducing their local density) makes the interface less cluttered. I think it’s a little bit the other way around, shorter distances with everything in one place make it easier to chunk and navigate, but overall the effect is small either way. And the design of spreading the UI elements this way is sufficiently unusual that it will be slightly confusing to many people.
  - habryka 17 Aug 2024 17:03 UTC
    4 points
    1
    Parent
    I don’t really think that’s the primary thing going on. I think one of the key issues with the previous design was the irregularity of the layout. Everything under the header would wrap and basically be in one big jumble, with the number and length of the author names changing where the info on the number of comments is, and where the tags section starts.
    
    It also didn’t communicate a good hierarchy on which information is important. Ultimately, all you need to start reading a post is the title and the content. The current design communicates the optionality of things like karma and tags better, whereas the previous design communicates that those pieces of information might need to be understood before you start reading.
- Shankar Sivarajan 16 Aug 2024 3:38 UTC
  9 points
  7
  Parent
  The title is annoyingly large.
  I like the table of contents on the left becoming visible only upon mouseover.
- NoUsernameSelected 16 Aug 2024 2:47 UTC
  8 points
  3
  Parent
  Why remove “x min read”? Even if it’s not gonna be super accurate between different people’s reading speeds, I still found it very helpful to decide at a glance how long a post is (e.g. whether to read it on the spot or bookmark it for later).
  
  Showing the word count would also suffice.
  - habryka 16 Aug 2024 2:55 UTC
    2 points
    3
    Parent
    Mostly because there is a prior against any UI element adding complexity.
    In this case, with the new ToC progress bar which is now always visible, you can quickly glance the length of the post by checking the length of the progress bar relative to the viewport indicator. It’s an indirect inference, but I’ve gotten used to it pretty quickly. You can also still see the word count on hover-over.
    What links here?
    quila's comment on Habryka’s Shortform Feed by habryka (15 Jan 2025 1:59 UTC; 1 point)
    - Neel Nanda 16 Aug 2024 7:24 UTC
      5 points
      4
      Parent
      I find a visual indicator much less useful and harder to reason about than a number, I feel pretty sad at lacking this. How hard would it be to have as an optional addition?
      - habryka 16 Aug 2024 7:58 UTC
        2 points
        1
        Parent
        Maintaining many different design variants pretty inevitably leads to visual bugs and things being broken, so I am very hesitant to allow people to customize things at this level (almost every time we’ve done that in the past the custom UI broke in some way within a year or two, people wouldn’t complain to us, and in some cases, we would hear stories 1-2 years later that someone stopped using LW because “it started looking broken all the time”).
        We are likely shipping an update to make the reading time easier to parse in the post-hover preview, to compensate somewhat for it no longer being available on the post page directly. I am kind of curious in which circumstances you would end up clicking on the post page without having gotten the hover-preview first (mobile is the obvious one, though we are just adding back the reading time on mobile, that was an oversight on my part).
        Neel Nanda 16 Aug 2024 8:58 UTC
        2 points
        0
        Parent
        Typically, opening a bunch of posts that look interesting and processing them later, or being linked to a post (which is pretty common in safety research, since often a post will be linked, shared on slack, cited in a paper, etc) and wanting to get a vibe for whether I can be bothered to read it. I think this is pretty common for me.
        
        I would be satisfied if hovering over eg the date gave me info like the reading time.
        
        Another thing I just noticed: on one of my posts, it’s now higher friction to edit it, since there’s not the obvious 3 dots button (I eventually found it in the top right, but it’s pretty easy to miss and out of the way)
        habryka 16 Aug 2024 17:09 UTC
        2 points
        0
        Parent
        I would be satisfied if hovering over eg the date gave me info like the reading time.
        Oh, yeah, sure, I do think this kind of thing makes sense. I’ll look into what the most natural place for showing it on hover is (the date seems like a reasonable first guess).
        on one of my posts, it’s now higher friction to edit it, since there’s not the obvious 3 dots button (I eventually found it in the top right, but it’s pretty easy to miss and out of the way)
        I think this is really just an “any change takes some getting used to” type deal. My guess is it’s slightly easier to find for the first time than the previous design, but I am not sure. I’ll pay attention to whether new-ish users have trouble finding the triple-dot, and if so will make it more noticeable.
    - NoUsernameSelected 16 Aug 2024 3:43 UTC
      2 points
      1
      Parent
      I don’t get a progress bar on mobile (unless I’m missing it somehow), and the word count on hover feature seemingly broke on mobile as well a while ago (I remember it working before).
      - habryka 16 Aug 2024 4:27 UTC
        2 points
        0
        Parent
        Ah, I think showing something on mobile is actually a good idea. I forgot that with the way we rearranged things, that also went away. I will experiment with some ways of adding that information back in tomorrow.
- Olli Järviniemi 16 Aug 2024 2:28 UTC
  5 points
  0
  Parent
  I like this; I’ve found the meta-data of posts to be quite heavy and cluttered (a multi-line title, the author+reading-time+date+comments line, the tag line, a linkpost line and a “crossposted from the Alignment Forum” line is quite a lot).
  I was going to comment that “I’d like the option to look at the table-of-contents/structure”, but I then tested and indeed it displays if you hover your mouse there. I like that.
  When I open a new post, the top banner with the LessWrong link to the homepage, my username etc. show up. I’d prefer if that didn’t happen? It’s not like I want to look at the banner (which has no new info to me) when I click open a post, and hiding it would make the page less cluttered.
  - habryka 16 Aug 2024 3:05 UTC
    2 points
    0
    Parent
    When I open a new post, the top banner with the LessWrong link to the homepage, my username etc. show up. I’d prefer if that didn’t happen?
    I’ve never considered that. I do think it’s important for the banner to be there when you get linked externally, so that you can orient to where you are, but I agree it’s reasonable to hide it when you do a navigation on-site. I’ll play around a bit with this. I like the idea.
    - Screwtape 16 Aug 2024 3:30 UTC
      4 points
      2
      Parent
      Noting that I use the banner as breadcrumb navigation relatively often, clicking LessWrong to go back to the homepage or my username to get a menu and go to my drafts. The banner is useful to me as a place to reach those menus.
      
      No idea how common that use pattern is.
      - habryka 16 Aug 2024 3:31 UTC
        2 points
        0
        Parent
        Totally. The only thing that I think we would do is to start you scrolled down 64px on the post page (the height of the header), so that you would just scroll a tiny bit up and then see the header again (or scroll up anywhere and have it pop in the same way it does right now).
- Yoav Ravid 4 Sep 2024 8:23 UTC
  4 points
  7
  Parent
  I am really missing the word counter. It’s something I look at quite a lot (less so on reading time estimates, as I got used to making the estimate myself based on the wordcount).
  - quila 15 Jan 2025 1:59 UTC
    1 point
    0
    Parent
    same here, even 5 months later.
    this comment from the time says the word counter is still there but i can’t find it anywhere. (edit: nvm, it’s visible on the frontpage hoverover.)
    ( @habryka because comments on others’ comments don’t notify by default. while i’m at it, also your username doesn’t show when i enter “@habryka”)
- Alex_Altair 18 Aug 2024 16:55 UTC
  4 points
  0
  Parent
  My overall review is: seems fine, some pros and some cons, mostly looks/feels the same to me. Some details:
  - I had also started feeling like the stuff between the title and the start of the post content was cluttered.
  - I think my biggest current annoyance is the TOC on the left sidebar. This has actually disappeared for me, and I don’t see it on hover-over, which I assume is maybe just a firefox bug or something. But even before this update, I didn’t like the TOC. Specifically, you guys had made it so that there was spacing between the sections that was supposed to be proportional to the length of each section. This never felt like it worked for me (I could speculate on why if you’re interested). I’d much prefer if the TOC was just a normal outline-type thing (which it was in a previous iteration).
  - I think I’ll also miss the word count. I use it quite frequently (usually after going onto the post page itself, so the preview card wouldn’t help much). Having the TOC progress bar thing never felt like it worked either. I agree with Neel that it’d be fine to have the word count in the date hover-over, if you want to have less stuff on the page.
  - The tags at the top right are now just bare words, which I think looks funny. Over the years you guys have often seemed to prefer really naked minimalist stuff. In this case I think the tags kinda look like they might be site-wide menus, or something. I think it’s better to have the tiny box drawn around each tag as a visual cue.
  - The author name is now in a sans-serif font, which looks pretty off to me in between the title and the text, which are both in serif fonts. It looks like when the browser fails to load the site font and falls back onto the default font, or something. (I do see that it matches the fact that usernames in the comments are sans serif, though.)
  - I initially disliked the karma section being so suppressed, but then I read one of your comments in this thread explaining your reasoning behind that, and now I agree it’s good.
  - I also use the comment count/link to jump to comments fairly often, and agree that having that in the lower left is fine.
- papetoast 16 Aug 2024 8:46 UTC
  4 points
  4
  Parent
  I like most of the changes, but strongly dislike the large gap before the title. (I similarly dislike the large background in the top 50 of the year posts)
  - habryka 16 Aug 2024 17:15 UTC
    2 points
    −2
    Parent
    Well, the gap you actually want to measure is the gap between the title and the audio player (or at the very least the tags), since that’s the thing we need to make space for. You are clearly looking at LW on an extremely large screen. This is the more median experience:
    There is still a bunch of space there, but for many posts the tags extend all the way above the post.
    - papetoast 17 Aug 2024 4:23 UTC
      1 point
      0
      Parent
      I understand that having the audio player above the title is the path of least resistance, since you can’t assume there is enough space on the right to put it in. But ideally things like this should be dynamic, and only take up vertical space if you can’t put it on the right, no? (but I’m not a frontend dev)
      Alternatively, I would consider moving them vertically above the title a slight improvement. It is not great either, but at least the reason for having the gap is more obvious.
      The above screenshots were taken on a 1920x1080 monitor.
      - habryka 17 Aug 2024 4:56 UTC
        2 points
        1
        Parent
        Yeah, we could make things dynamic, it would just add complexity that we would need to check every time we make a change. It’s the kind of iterative improvement we might do over time, but it’s not something that should block the roll-out of a new design (and it’s often lower priority than other things, though post-pages in-particular are very important and so get a lot more attention than other pages).
- MondSemmel 22 Aug 2024 6:39 UTC
  3 points
  0
  Parent
  The new design means that I now move my mouse cursor first to the top right, and then to the bottom left, on every single new post. This UI design is bad ergonomics and feels actively hostile to users.
  - habryka 22 Aug 2024 6:44 UTC
    2 points
    0
    Parent
    I’ve been playing around with some ways to move the comment icon to the top right corner, ideally somehow removing the audio-player icon (which is much less important, but adds a lot of visual noise in a way that overwhelms the top right corner if you also add the comment icon). We’ll see whether I can get it to work.
- dirk 18 Aug 2024 20:25 UTC
  3 points
  3
  Parent
  It takes more vertical space than it used to and I don’t like that. (Also, the meatball menu is way off in the corner, which is annoying if I want to e.g. bookmark a post, though I don’t use it very often so it’s not a major inconvenience.) I think I like the new font, though!
  - dirk 23 Aug 2024 2:01 UTC
    1 point
    −2
    Parent
    Another minor annoyance I’ve since noticed: at this small scale it’s hard to distinguish posts I’ve upvoted from posts I haven’t voted on. Maybe it’d help if the upvote indicator were made a darker shade of green or something?
    - habryka 23 Aug 2024 2:06 UTC
      2 points
      0
      Parent
      Yeah, that’s on my to-do list. I also think the current voting indicator isn’t clear enough at the shrunken size.
- Richard_Kennaway 16 Aug 2024 8:58 UTC
  3 points
  0
  Parent
  On desktop the title font is jarringly huge. I already know the title from the front page, no need to scream it at me.
  - habryka 16 Aug 2024 17:12 UTC
    7 points
    0
    Parent
    If you get linked externally (which is most of LW’s traffic), you don’t know the title (and also generally are less oriented on the page, so it helps to have a very clear information hierarchy).
    I do also agree the font is very large. I made an intentionally bold choice here for a strong stylistic effect. I do think it’s pretty intense and it might be the wrong choice, but I currently like it aesthetically a bunch.
- Ali Ahmed 16 Aug 2024 2:48 UTC
  3 points
  1
  Parent
  The new UI is great, and I agree with the thinking behind de-emphasizing karma votes at the top. It could sometimes create inherent bias and assumptions (no matter whether the karma is high or low) even before reading a post, whereas it would make more sense at the end of the post.
- Perhaps 16 Aug 2024 1:15 UTC
  3 points
  2
  Parent
  The karma buttons are too small for actions that, in my experience, are done a lot more than clicking to listen to the post. It’s pretty easy to misclick.
  Additionally, it’s unclear what the tags are, as they’re no longer right beside the post to indicate their relevance.
  - habryka 16 Aug 2024 2:37 UTC
    6 points
    −2
    Parent
    The big vote buttons are at the bottom of the post, where I would prefer more of the voting to happen (I am mildly happy to discourage voting at the top of the post before you read it, though I am not confident).
- Zach Stein-Perlman 24 Sep 2024 2:09 UTC
  2 points
  0
  Parent
  I ~always want to see the outline when I first open a post and when I’m reading/skimming through it. I wish the outline appeared when-not-hover-over-ing for me.
- interstice 16 Aug 2024 5:08 UTC
  2 points
  1
  Parent
  I like the decluttering. I think the title should be smaller and have less white space above it. Also think that it would be better if the ToC was maybe just faded a lot until mouseover; the current appearance/disappearance feels too sudden.
  - habryka 16 Aug 2024 5:11 UTC
    4 points
    0
    Parent
    I think making things faint enough so that the relatively small margin between main body text and the ToC wouldn’t become bothersome during reading isn’t really feasible. In-general, because people’s screen-contrast and color calibration differs quite a lot, you don’t have that much wiggle room at the lower level of opacity without accidentally shipping completely different experiences to different users.
    I think it’s plausible we want to adjust the whitespace below the title, but I think you really need this much space above the title to not have it look cluttered together with the tags on smaller screens. On larger screens there is enough distance between the title and top right corner, but things end up much harder to parse when the tags extend into the space right above the title, and that margin isn’t big enough.
- quila 16 Aug 2024 0:33 UTC
  1 point
  −13
  Parent
  the primary purpose of karma is to use it to decide what to read before you click on a post, which makes it less important to be super prominent when you are already on a post page
  I think this applies to titles too
  - habryka 16 Aug 2024 1:06 UTC
    5 points
    5
    Parent
    I think the title is more important for parsing the content of an essay. Like, if a friend sends you a link, it’s important to pay a bunch of attention to the title. It’s less important that you spend attention to the karma.
habryka 15 Apr 2024 3:43 UTC
17 points
0
Had a very aggressive crawler basically DDoS-ing us from a few dozen IPs for the last hour. Sorry for the slower server response times. Things should be fixed now.
habryka 27 May 2019 6:42 UTC
17 points
0
Random thoughts on game theory and what it means to be a good person

It does seem to me like there doesn’t exist any good writing on game theory from a TDT perspective. Whenever I read classical game theory, I feel like the equilibria that are being described obviously fall apart when counterfactuals are being properly brought into the mix (like D/D in prisoner’s dilemmas).

The obvious problem with TDT-based game theory, just as with Bayesian epistemology, is that the vast majority of direct applications are completely computationally intractable. It’s kind of obvious what should happen in games with lots of copies of yourself, but as soon as anything participates that isn’t a precise copy, everything gets a lot more confusing. So it is not fully clear what a practical game-theory literature from a TDT-perspective would look like, though maybe the existing LessWrong literature on Bayesian epistemology might be a good inspiration.
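
To make the D/D point concrete, here is a minimal sketch, assuming the usual textbook payoff numbers (3, 0, 5, 1), which are illustrative and not taken from anything above. It compares the classical best-response analysis with the case where the other player is an exact copy who necessarily picks the same move. It says nothing about the harder case of non-copies.

```python
# Illustrative payoffs (assumed textbook values): (my_move, their_move) -> my payoff.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def best_response(their_move):
    """Classical view: hold the opponent's move fixed and maximize my own payoff."""
    return max(MOVES, key=lambda mine: PAYOFF[(mine, their_move)])


def nash_equilibria():
    """Pure-strategy profiles where each move is a best response to the other.

    The game is symmetric, so the same payoff table serves both players.
    """
    return [
        (a, b)
        for a in MOVES
        for b in MOVES
        if best_response(b) == a and best_response(a) == b
    ]


def best_move_against_exact_copy():
    """Copy view: an exact copy picks whatever I pick, so only (m, m) is reachable."""
    return max(MOVES, key=lambda m: PAYOFF[(m, m)])


print(nash_equilibria())               # [('D', 'D')]
print(best_move_against_exact_copy())  # C
```

The second function is only the trivial "precise copy" case; as soon as the other player's choice is merely correlated with mine rather than identical, this simple restriction to (m, m) outcomes no longer applies.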

Even when you can’t fully compute everything (and we don’t even really know how to compute everything in principle), you might still be able to go through concrete scenarios and list considerations and perspectives that incorporate TDT-perspectives. I guess in that sense, a significant fraction of Zvi’s writing could be described as practical game theory, though I do think there is a lot of value in trying to formalize the theory and make things as explicit as possible, which I feel like Zvi at least doesn’t do most of the time.

Critch (Academian) tends to have this perspective of trying to figure out what a “robust agent” would do, in the sense of an agent that would at the very least be able to reliably cooperate with copies of itself, and adopt cooperation and coordination principles that allow it to achieve very good equilibria with agents that adopt the same type of cooperation and coordination norms. And I do think there is something really valuable here, though I am also worried that the part where you have to cooperate with agents who haven’t adopted super similar cooperation norms is actually the more important one (at least until something like AGI).

And I do think that the majority of the concepts we have for what it means to be a “good person” are ultimately attempts at trying to figure out how to coordinate effectively with other people, in a way that a more grounded game theory would help a lot with.

Maybe a good place to start would be to brainstorm a list of concrete situations in which I am uncertain what the correct action is. Here is some attempt at that:
- How to deal with threats of taking strongly negative-sum actions? What is the correct response to the following concrete instances?
  - A robber threatens to shoot you if you don’t hand over your wallet
    - Do you precommit to violently attack any robber that robs you, or do you simply hand over your wallet?
  - You are in the room with someone holding the launch buttons for the USA’s nuclear arsenal and they are threatening to launch them if you don’t hand over your wallet
  - You are head of the U.S. and another nation state is threatening a small-scale nuclear attack on one of your cities if you don’t provide some kind of economic subsidy to them
    - Do you launch a conventional attack?
    - Do you launch a full out nuclear response as a deterrent?
    - Do you launch a small-scale nuclear response?
    - Do you not do anything at all?
    - Does the answer depend on the size of the economic subsidy? What if they ask twice?
- You are at a party and your assigned driver ended up drinking, even though they said they would not (the driver was chosen by a random draw)
  - Do you somehow punish them now, do you punish them later, or not at all?
  - What if they are less likely to remember if you punish them now because they are drunk? Does that matter for the game-theoretic correct action?
  - What if they did this knowingly, reasoning from a CDT perspective that there wouldn’t be any point in punishing them now because they wouldn’t remember the next day?
    - What if you would never see them again later?
    - What if you only ever get to interact with them after they made the choice to be drunk?
I feel like I have some hint of an answer to all of these, but also feel like any answer that I can come up with makes me exploitable in a way that makes me feel like there is no meta-level on which there is an ideal strategy.
What links here?
- habryka's comment on Drowning children are rare by Benquo (28 May 2019 20:25 UTC; 43 points)
- Raemon 28 May 2019 1:11 UTC
  11 points
  0
  Parent
  Reading through this, I went “well, obviously I pay the mugger...
  ...oh, I see what you’re doing here.”
  I don’t have a full answer to the problem you’re specifying, but something that seems relevant is the question of “How much do you want to invest in the ability to punish defectors [both in terms of maximum power-to-punish, a-la nukes, and in terms of your ability to dole out fine-grained-exactly-correct punishment, a-la skilled assassins]”
  The answer to this depends on your context. And how you have answered this question determines whether it makes sense to punish people in particular contexts.
  In many cases there might want to be some amount of randomization where at least some of the time you really disproportionately punish people, but you don’t have to pay the cost of doing so every time.
  Answering a couple of the concrete questions:
  Mugger
  Right now, in real life, I’ve never been mugged, and I feel fine basically investing zero effort into preparing for being mugged. If I do get mugged, I will just hand over my wallet.
  If I was getting mugged all the time, I’d probably invest effort into a) figuring out what good policies existed for dealing with muggers, b) what costs I’d have to pay in order to implement those policies.
  In some worlds, it’s worth investing in literal body armor or bullet proof cars or whatever, and in the skill to successfully fight back against a literal mugger. (My understanding is that this is usually not actually a good idea even in crime-heavy areas, but I can imagine worlds where it was correct to just get good at fighting, or to hire people who are good at fighting as bodyguards)
  In some worlds it’s worth investing more in police-force and avoiding having to think about the problem, or not carrying as much money around in the first place.
  Small Nation Demands Subsidies, Threatens Nuclear War
  Again, I think my options here depend a lot on having already invested in defense.
  One scenario is “I do not have the ability to say ‘no’ without risking millions of lives, either of my own citizens or of innocent citizens of the country in question.” In that case, I probably have to do something that makes my vague-hippie-values sad.
  I have some sense that my vague-hippie-values depend on having invested enough money in defense (and offense) that I can “afford” to be moral. Things I may wish my country had invested in include:
  - Anti-ICBM capabilities that can shoot down incoming nukes with enough reliability that either a small-scale nuclear counterstrike, or a major non-nuclear retaliatory invasion, are viable options that will at least only punish foreign civilians if the foreign government actually launches an attack
  - Possibly invested in assassins who just kill individuals who threaten nuclear strikes (I’m somewhat confused about why this isn’t used more; I suspect the answer has to do with the game theory of ‘the people in charge [of all nations] want it to be true that they aren’t at risk of getting assassinated, so they have a gentleman’s agreement to avoid killing enemy leaders’)
  So I probably want to invest a lot in either having strong capabilities in those domains, or having allies who do.
  Drinking
  In real life I expect that the solution here is “I never invite said person to parties again, and depending on our relative social standing I might publicly badmouth them or quietly gossip about them.”
  In weird contrived scenarios I’m not sure what I do because I don’t know how to anticipate weird contrived scenarios.
  I do invest, generally, in communicating about how obviously people should follow up on their commitments, such that when someone fails to live up to their commitment, it costs less to punish them for doing so. (And this is a shared social good that multiple people invest in).
  If I’m in a one-off interaction with someone who is currently too drunk to remember being punished and who I’m not socially connected to, I probably treat it like being mugged – a fluke event that doesn’t happen often enough to be worth investing resources in being able to handle better.
  Extra Example: Having to Stand Up to a Boss/High-Status-Person/Organization
  A situation that I’m more likely to run into, where the problem actually seems hard, is that sometimes high status people do bad things, and they have more power than you, and people will naturally end up on their side and take their word over yours.
  Sort of similar to the “Small nation threatening nuclear war”, I think if you want to be able to “afford to actually have moral principles”, you need to invest upfront in capabilities. This isn’t always the right thing to do, depending on your life circumstances, but it may be sometimes. You want to have enough surplus power that you have the Slack to stand up for things.
  Possibilities include investing in being directly high status yourself, or investing in making friends with a strong enough coalition of people to punish high status people, or encourage strong norms and rule of law such that you don’t need to have as strong a coalition, because you’ve made it lower cost to socially attack someone who breaks a norm.
  Extra Example: The Crazy House Guest
  Perhaps related to the drinking example: a couple times, I’ve had people show up at former houses, potentially looking to move in, and then causing some kind of harm.
  In one case, they had a very weird combination of mental illnesses and cluelessness that resulted in them dealing several thousand dollars’ worth of physical damage to the house.
  They seemed crazy and unpredictable enough that it seemed like if I tried to punish them, they might follow me around forever and make my life suck in weird ways.
  So I didn’t punish them and they got away with it and went away and I never heard from them again.
  So… sure, you can get away with certain kinds of things by signaling insanity and unpredictability… but at the cost of not being welcome in major social networks. The few extra thousand dollars they saved was not remotely worth the fact that, had they been a more reasonable person, they’d have had access to a strong network of friends and houses that help each other out finding jobs and places to live and what-not.
  So I’m not worried about the longterm incentives here – the only people for whom insanity is a cost-effective tool to avoid punishment are actual insane people who don’t have the ability to interface with society normally.
  What if there turn out to be lots of crazy people? Then you probably either invest upfront resources in fighting this somehow, or become less trusting.
  Extra Example: The Greedy Landlord
  In another housing situation, the landlord tried to charge us extra for things that were not our fault. In this case, it was reasonably clear that we were in the right. Going to small claims court would have been net-negative for us, but also costly to them.
  I was angry and full of zealous energy and I decided it was worth it and I threatened going to small claims court and wasting both of our time, even though a few hundred dollars wasn’t really worth it.
  They backed down.
  This seems like the system working as intended. This is what anger is for: to make sure people have the backbone to defend themselves, and to make sure we live in a world where at least some of the time people will get riled up and punish you disproportionately.
  What if you haven’t invested in defense capabilities in advance?
  Then you probably will periodically need to lose and accept bad situations, such as either a more powerful empire demanding tribute from your country, or choosing policies like “if you are under threat, flip an unknown number of coins and if enough coins come up heads, go to war and punish them disproportionately even though you will probably lose and lots of people will die but now empires will sometimes think twice about invading poor countries.”
  The meta level point
  It doesn’t seem inconsistent to me to apply different policies in different situations, even if they share commonalities, based on how common the situation is, how costly the defection, how much long-term punishment you can inflict, and how many resources you have invested in being able to punish.
  This does mean that mugging (for example) is a somewhat viable strategy, since people don’t invest as heavily in handling it (because it is rare), but this seems like a self-correcting problem. There will always be some least-defended-against defect button that defectors can press; you can’t protect against everything.
  Another point is that it’s important to be somewhat unpredictable, and to at least sometimes just punish people disproportionately (when they wrong you), so that people aren’t confident that the expected value of taking advantage of you is positive.
- Lukas Finnveden 27 May 2019 10:02 UTC
  2 points
  0
  Parent
  Any reason why you mention timeless decision theory (TDT) specifically? My impression was that functional decision theory (as well as UDT, since they’re basically the same thing) is regarded as a strict improvement over TDT.
  - habryka 27 May 2019 16:49 UTC
    2 points
    0
    Parent
    Same thing, it’s just the handle that stuck in my mind. I think of the whole class as “timeless”, since I don’t think there exists a good handle that describes all of them.
habryka 15 May 2019 6:00 UTC
15 points
0
Making yourself understandable to other people
(Epistemic status: Processing obvious things that have likely been written many times before, but that are still useful to have written up in my own language)
How do you act in the context of a community that is vetting constrained? I think there are fundamentally two approaches you can use to establish coordination with other parties:
1. Professionalism: Establish that you are taking concrete actions with predictable consequences that are definitely positive
2. Alignment: Establish that you are a competent actor that is acting with intentions that are aligned with the aims of others
I think a lot of the concepts around professionalism arise when you have a group of people who are trying to coordinate, but do not actually have aligned interests. In those situations you will have lots of contracts and commitments to actions that have well-specified outcomes and deviations from those outcomes are generally considered bad. It also encourages a certain suppression of agency and a fear of people doing independent optimization in a way that is not transparent to the rest of the group.
Given a lot of these drawbacks, it seems natural to aim for establishing alignment with others; it is, however, much less clear how to achieve that. Close groups of friends can often act in alignment because they have credibly signaled to each other that they care about each other’s experiences and goals. This also tends to involve costly signals of sacrifice that are only economical if the goals of the participants were actually aligned. I also suspect that there is a real “merging of utility functions” going on, where close friends and partners self-modify to share each other’s values.
For larger groups of people, establishing alignment with each other seems much harder, in particular in the presence of adversarial actors. You can request costly signals, but it is often difficult to find good signals that are not prohibitively costly for many members of your group (this task is much easier for smaller groups, since you have less spread in the costs of different actions). You are also under much more adversarial pressure, since with more people you likely have access to more resources which attracts more adversarial actors.
I expect this is the reason why we see larger groups often default to professionalism norms with very clearly defined contracts.
I think the EA and Rationality communities have historically optimized hard for alignment and not professionalism, since that enabled much better overall coordination, but as the community grew and attracted more adversarial actors those methods didn’t scale very well and so we currently expect alignment-level coordination capabilities while only having access to professionalism-level coordination protocols and resources.
We’ve also seen an increase in people trying to increase the amount of alignment, by looking into things like circling and specializing in mediation and facilitation, which I think is pretty promising and has some decent traction.
I also think there is a lot of value in building better infrastructure and tools for more “professionalism” style interactions, where people offer concrete services with bounded upside. A lot of my thinking on the importance of accountability derives from this perspective.
- jp 15 Jun 2020 19:00 UTC
  4 points
  0
  Parent
  I had forgotten this post, reread it and still think it’s one of the better things of its length I’ve read recently.
  - habryka 15 Jun 2020 22:37 UTC
    5 points
    0
    Parent
    Glad to hear that! Seems like a good reason to publish this as a top-level post. Might go ahead and do that in the next few days.
    - nicoleross 29 Jun 2020 17:38 UTC
      4 points
      0
      Parent
      +1 for publishing as a top level post
habryka 6 Jul 2021 20:54 UTC
11 points
0
This FB post by Matt Bell on the Delta Variant helped me orient a good amount:
https://www.facebook.com/thismattbell/posts/10161279341706038
As has been the case for almost the entire pandemic, we can predict the future by looking at the present. Let’s tackle the question of “Should I worry about the Delta variant?” There’s now enough data out of Israel and the UK to get a good picture of this, as nearly all cases in Israel and the UK for the last few weeks have been the Delta variant. [1] Israel was until recently the most-vaccinated major country in the world, and is a good analog to the US because they’ve almost entirely used mRNA vaccines.
- If you’re fully vaccinated and aren’t in a high risk group, the Delta variant looks like it might be “just the flu”. There are some scary headlines going around, like “Half of new cases in Israel are among vaccinated people”, but they’re misleading for a couple of reasons. First, since Israel has vaccinated over 80% of the eligible population, the mRNA vaccine still is 1-((0.5/0.8)/(0.5/0.2)) = 75% effective against infection with the Delta variant. Furthermore, the efficacy of the mRNA vaccine is still very high ( > 90%) against hospitalization or death from the Delta variant. Thus, you might still catch Delta if you’re vaccinated, but it will be more like a regular flu if you do. J&J likely has a similar performance in terms of reduced hospitalizations and deaths, as the UK primarily vaccinated its citizens with the AstraZeneca vaccine, which is basically a crappier version of J&J, and is still seeing a 90+% reduction in hospitalizations and deaths among the vaccinated population. [2]
[...]
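The 75% figure in the quoted post can be reproduced with a short calculation. This is a minimal sketch, assuming the post's rounded inputs (half of new cases among the vaccinated, 80% of the eligible population vaccinated) and ignoring confounders such as age or testing rates; the function name is made up for illustration.

```python
def vaccine_effectiveness(share_of_cases_vaccinated, share_of_population_vaccinated):
    """Estimate effectiveness as 1 - (case rate in vaccinated / case rate in unvaccinated)."""
    # Cases per unit of population in each group, up to a common constant factor.
    rate_vaccinated = share_of_cases_vaccinated / share_of_population_vaccinated
    rate_unvaccinated = (1 - share_of_cases_vaccinated) / (1 - share_of_population_vaccinated)
    return 1 - rate_vaccinated / rate_unvaccinated


# Half of cases among the vaccinated, 80% of the population vaccinated.
print(round(vaccine_effectiveness(0.5, 0.8), 2))  # 0.75
```

The intuition is that if the vaccine did nothing, the vaccinated would make up about 80% of cases rather than 50%, so a headline like "half of cases are vaccinated" is consistent with substantial protection.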
habryka 8 Jun 2021 4:16 UTC
10 points
0
This seems like potentially a big deal: https://mobile.twitter.com/DrEricDing/status/1402062059890786311

> Troubling—the worst variant to date, the #DeltaVariant is now the new fastest growing variant in US. This is the so-called “Indian” variant #B16172 that is ravaging the UK despite high vaccinations because it has immune evasion properties. Here is why it’s trouble—Thread. #COVID19
- SoerenMind 8 Jun 2021 7:30 UTC
  6 points
  0
  Parent
  There’s also a strong chance that delta is the most transmissible variant we know even without its immune evasion (source: I work on this, don’t have a public source to share). I agree with your assessment that delta is a big deal.
- ChristianKl 8 Jun 2021 16:07 UTC
  4 points
  0
  Parent
  The fact that we still use the same sequence to vaccinate seems like civilisational failure.
- wunan 8 Jun 2021 14:56 UTC
  4 points
  0
  Parent
  Those graphs all show the percentage share of the different variants, but more important would be the actual growth rate. Is the delta variant growing, or is it just shrinking less quickly than the others?
habryka 7 Feb 2023 20:11 UTC
9 points
0
@Elizabeth was interested in me crossposting this comment from the EA Forum since she thinks there isn’t enough writing on the importance of design on LW. So here it is.
Atlas reportedly spent $10,000 on a coffee table. Is this true? Why was the table so expensive?
Atlas at some point bought this table, I think: https://sisyphus-industries.com/product/metal-coffee-table/. At that link it costs around $2200, so I highly doubt the $10,000 number.
Lightcone then bought that table from Atlas a few months ago at the listing price, since Jonas thought the purchase seemed excessive, so Atlas actually didn’t end up paying anything. I am really glad we bought it from them, it’s probably my favorite piece of furniture in the whole venue we are currently renovating.
If you think it was a waste of money, I have made much worse interior design decisions (in-general furniture is really annoyingly expensive, and I’ve bought couches for $2000 that turned out to just not work for us at all and were too hard to sell), and I consider this one a pretty strong hit. (To clarify, the reason why it’s so expensive is because it’s a kinetic sculpture with a moving magnet and a magnetic ball that draws programmable patterns into the sand at the center of the table, so it’s not just like, a pretty coffee table)
The table is currently serving as a centerpiece of our central workspace social room, and has a pretty large effect on good conversations happening since it seems to hit the right balance of being visually interesting without being too distracting while also being functional, and despite this kind of sounding ridiculous, if for some reason it was impossible for Lightcone to pay for this table (which I don’t think it is since I think interior design matters), I would pay for it from my own personal funds.
In general, as someone who has now helped prepare on the order of 20 venues for workshops and conferences, it seems pretty obvious to me that interior design matters quite a bit for workshop venues. I think it would indeed be pretty crazy to pay $2000 for every coffee table in your venue, but a single central design piece can make a huge difference to a room, and I’ve spent hundreds of hours trying to design rooms to facilitate good conversations with my counterfactual earning rate being in the hundreds of dollars per hour, and I think it definitely is sometimes worth my time/money to buy an occasional expensive piece of furniture.
Rob Bensinger 2 May 2019 22:09 UTC
7 points
0
I like this shortform feed idea!
habryka 5 Dec 2020 2:36 UTC
4 points
0
We launched the books on Product Hunt today!
habryka 2 Jan 2020 2:54 UTC
4 points
0
Leaving this here: 2d9797e61e533f03382a515b61e6d6ef2fac514f
- Tetraspace 2 Jan 2020 3:18 UTC
  10 points
  0
  Parent
  Since this hash is publicly posted, is there any timescale for when we should check back to see the preimage?
  - habryka 2 Jan 2020 3:31 UTC
    5 points
    0
    Parent
    If relevant, I will reveal it within the next week.
    - habryka 2 Jan 2020 3:53 UTC
      6 points
      0
      Parent
      Preimage was:
      Said will respond to the comment in https://www.lesswrong.com/posts/jLwFCkNKMCFTCX7rL/circling-as-cousin-to-rationality#rdbGxdxXqJiGHrSAC with a message that has content roughly similar to “I appreciate the effort, but I still don’t think I understand what you are trying to point at”.
      Hashed using https://www.fileformat.info/tool/hash.htm using the SHA-1 hash.
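
The commit-and-reveal pattern used in this thread can be sketched in a few lines. This is a minimal illustration using Python's standard `hashlib`; the example message is hypothetical, and reproducing the digest posted above would require the exact original string (including punctuation and whitespace), which is not assumed here.

```python
import hashlib


def commit(message: str) -> str:
    """Return the SHA-1 hex digest of a message, to be posted publicly as a commitment."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest()


def verify(message: str, commitment: str) -> bool:
    """Check that a revealed message matches a previously posted commitment."""
    return commit(message) == commitment.lower()


# Hypothetical prediction, not the one from the thread.
prediction = "It will rain tomorrow."
digest = commit(prediction)

print(len(digest))                       # 40 hex characters for SHA-1
print(verify(prediction, digest))        # True
print(verify(prediction + " ", digest))  # False: any change alters the hash
```

One caveat with this pattern: a short or guessable prediction can be found by hashing candidate messages until one matches, so appending a random string to the message before hashing (and revealing it along with the message) makes the commitment harder to guess in advance.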
