Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
Inviting someone to an event seems somewhat closer, though.
Yeah, in this case we are talking about “attending an event where someone you think is evil is invited to attend”, which is narrower, but also strikes me as an untenable position (e.g. in the lab case, this would prevent me from attending almost any conference I can think of wanting to attend in the Bay Area, almost all of which routinely invite frontier lab employees as speakers or featured guests).
To be clear, I think it’s reasonable to be frustrated with Lightcone if you think we legitimize people who you think will misuse that legitimacy, but refusing to attend any event where an organizer makes that kind of choice seems very intense to me (though of course, if someone was already considering an event to be of marginal value, such a thing could push them over the edge, though I think that would produce a different top-level comment).
I’m also not really sure what you’re hinting at with “I hope you also advocate for it when it’s harder to defend.” I assume something about what I think about working at AI labs? I feel like my position on that was fairly clear in my previous comment.
It’s mostly an expression of hope. For example, I hope it’s a genuine commitment that would result in you saying so even if you end up in the unfortunate position of updating negatively on Anthropic, or of being friends and allies with lots of people at other organizations you’ve updated negatively on.
As a reason for this being hope instead of confidence: I do not remember you (or almost anyone else in your position) calling for people to leave their positions at OpenAI when it became clearer that the organization was likely harming the world, though maybe I just missed it. I am not intending this as some confident “gotcha”, just hinting that people often like to do moral grandstanding in this domain without deep commitments actually backing it.
To be clear, this wasn’t an attempt to drag the whole topic into this conversation, but rather a low-key and indirect expression of my viewing some of the things you say here with some skepticism. I don’t super want to put you on the spot to justify your whole position here, but it also would have felt amiss not to give any hint of how I relate to them. So feel free to not respond, as I am sure we will find better contexts in which to discuss these things.
I think John’s comment, in the context of this thread, was describing a level of “working with” that was in the reference class of “attending an event with”, rather than “working for an organization” with the usual commitments and relationships that entails, so extending it to that case feels a bit like a non-sequitur. He explicitly mentioned attending an event as the example of the kind of “working with” he was talking about, so responding to only a non-central case of it feels weird.
It is also the case that, in our social circle, the position of “work for organizations that you think are very bad for the world in order to make it better” is a relatively common take (though on that one the two of us appear to be in rough agreement that it’s rarely worth it), and I hope you also advocate for it when it’s harder to defend.
Given common beliefs about AI companies in our extended social circle, I think it illustrates pretty nicely why an attitude of association-policing that extends all the way to “mutual event attendance” would void a huge number of potential trades, opportunities for compromise, and surface area for changing one’s mind, and is a bad idea.
Attending an event with someone else is not “collaborating with evil”!
I think people working at frontier AI companies are causing vastly more harm and are much stronger candidates for being moral monsters than Cremieux is (even given his recent, IMO quite dickish, behavior). I think it would be quite dumb of me to ban all frontier lab employees from Lightcone events, and my guess is you would agree with this even if you shared my beliefs about frontier AI labs.
Many events exist to negotiate and translate between different worldviews and perspectives, LessOnline more so than most. Yes, think about when you are supporting evil or giving it legitimacy, and yes, it’s messy, but especially given your position at a leading frontier lab, I don’t think you would consider tenable a blanket position of “don’t collaborate with evil” that extends as far as “attending an event with someone else”.
I was using double.finance and their backtesting tool.
Promoted to curated: Concrete examples are great. This post is a list of specific examples. Therefore this post is great.
Just kidding, but I do quite like this post. I feel like it does a good job introducing a specific useful concept handle, and explains it with a good mixture of specific examples and general definitions. It doesn’t end up too opinionated about how the concept should be used, or tied to some political agenda, and that makes it a good reference post that I expect to link to in a relatively wide range of scenarios.
Thank you!
I have reservations about ControlAI in particular, but also endorse this as a general policy. I think there are organizations that would be more robustly trustworthy and more fine to link to, though I think that’s actually very hard and rare, and I would still avoid it in general (the same way LW has a general policy of not frontpaging advertisements or job postings for specific organizations, independent of the organization)[1].
Also, I want to make sure I understand what you mean by “betraying people’s trust.” Is it something like, “If in the future ControlAI does something bad, then, from the POV of our viewers, that means that they can’t trust what they watch on the channel anymore?”
Yeah, something like that. I don’t think “does something bad” is really the category; it’s more something like “viewers will end up engaging with other media by ControlAI that riles them up about deepfakes in a bad-faith manner (i.e. not actually thinking deepfakes are worth banning, but thinking a deepfake ban would be helpful for slowing down AI progress, without being transparent about that), and then they will have been taken advantage of, and that will make a lot of coordination around AI x-risk stuff harder”.
We made an exception with our big fundraising post because Lightcone disappearing does seem of general interest to everyone on the site, but it made me sad and I wish we could have avoided it.
I think it’s a great video! I do also wish it had tied itself less to one specific organization; I feel like it would stand the test of time better (and be less likely to end up betraying people’s trust) if it had given a general overview of what we can do about AI risk, instead of ending with a call to action to support/join ControlAI in particular.
I would not characterize Dustin as straightforwardly “pushing back” in the relevant comment thread, more as “expressing frustration with specific misinterpretations but confirming the broad strokes”. I do think he would likely take offense at some of this framing, but a lot of it is really quite close to what Dustin said himself (and my model is more that Dustin is uncomfortable owning all the implications of the things he said, though this kind of thing is hard).
I don’t currently believe this, and don’t think I said so. I do think the GV constraints are big, but my overall assessment is that the net effect of Open Phil’s actions is bad even if you control for GV, though the calculus gets a lot messier and I am much less confident. Some of that is because of the evidential update from how they handled the GV situation, but also IMO Open Phil has made many other quite grievous mistakes.
My guess is that an Open Phil that had continued to be run by Holden would probably be good for the world. I have many disagreements with Holden, and it’s definitely still a high-variance situation, but I’ve historically been impressed with his judgement on many issues that I’ve seen OP mess up in recent years.
Yeah, we gotta fix something about handling the Substack formatted content. It really looks ugly sometimes, though I haven’t yet chased down when.
No, I ended up getting sick that week and other deadlines then pushed the work later. It will still happen, but maybe only closer to LessOnline (i.e. in about a month).
I agree that there is some ontological mismatch here, but I think your position is still in pretty clear conflict with what Neel said, which is what I was objecting to:
My understanding is that eg Jaime is sincerely motivated by reducing x risk (though not 100% motivated by it), just disagrees with me (and presumably you) about various empirical questions about how to go about it, what risks are most likely, what timelines are, etc.
“Not 100% motivated by it” sounds to me like it implies something like “being motivated by reducing x-risk makes up 30%-70% of the motivation”. I don’t think that’s true, and I think various things that Jaime has said make that relatively clear.
This is a great thread and I appreciate you both having it, and posting it here!
I am not saying Jaime could not in principle be motivated by existential risk from AI, but I do think the evidence strongly suggests to me that concerns about existential risk from AI are not among the primary motivations for his work on Epoch (which is what I understood Neel to be saying).
Maybe that is because he sees the risk as irreducible, maybe it is because the only ways of improving things would cause collateral damage to other things he cares about. I also think our dominant prior should be that someone is not motivated by reducing x-risk unless they directly claim to be.
(This aligns with what I intended. I feel like my comment is making a fine point, despite having missed the specific section.)
My understanding is that eg Jaime is sincerely motivated by reducing x risk (though not 100% motivated by it), just disagrees with me (and presumably you) about various empirical questions about how to go about it, what risks are most likely
I don’t think this is true. My sense is he views his current work as largely being good on non-x-risk grounds, and that even if it might slightly increase x-risk, he wouldn’t think it worth it for him to stop working on it, since he thinks it’s unfair to force the current generation to accept a slightly higher risk of not achieving longevity escape velocity and more material wealth in exchange for a small decrease in existential risk.
He says it so plainly that it’s about as straightforward a rejection of AI x-risk concerns as I’ve heard:
I selfishly care about me, my friends and family benefitting from AI. For some of my older relatives, it might make a big difference to their health and wellbeing whether AI-fueled explosive growth happens in 10 vs 20 years.
[...]
I wont endanger the life of my family, myself and the current generation for a small decrease of the chances of AI going extremely badly in the long term. And I don’t think it’s fair of anyone to ask me to do that. Not that it should be my place to unilaterally make such a decision anyway.
It seems very clear that Jaime thinks AI x-risk is unimportant relative to almost any other issue, given his lack of interest in trading off x-risk against those other issues.
It is true that Jaime might think AI x-risk could hypothetically be motivating to him, but my best interpretation of what is going on suggests that he de facto does not consider it an important input into his current strategic choices, or the choices of Epoch.
A lot of new-user submissions to LW these days are clearly from some poor person who was sycophantically encouraged by an AI, after it told them their ideas are great, to post their crazy theory of cognition or consciousness or recursion or social coordination on LessWrong. When we send them moderation messages we frequently get LLM-co-written responses, and sometimes they send us quotes from an AI that has evaluated their research as promising and high-quality, as proof that they are not a crackpot.
Ah, indeed! I think the “consistent” threw me off a bit there and so I misread it on first reading, but that’s good.
Sorry for missing it on first read. I do think that is approximately the kind of clause I was imagining (of course I would phrase things differently and would put an explicit emphasis on coordinating with other actors in ways beyond “articulation”, but your phrasing here is within the bounds where my objections feel more like nitpicking).
Each time we go through the core loop of catching a warning sign for misalignment, adjusting our training strategy to try to avoid it, and training again, we are applying a bit of selection pressure against our bumpers. If we go through many such loops and only then, finally, see a model that can make it through without hitting our bumpers, we should worry that it’s still dangerously misaligned and that we have inadvertently selected for a model that can evade the bumpers.
How severe of a problem this is depends on the quality and diversity of the bumpers. (It also depends, unfortunately, on your prior beliefs about how likely misalignment is, which renders quantitative estimates here pretty uncertain.) If you’ve built excellent implementations of all of the bumpers listed above, it’s plausible that you can run this loop thousands of times without meaningfully undermining their effectiveness.[8] If you’ve only implemented two or three, and you’re unlucky, even a handful of iterations could lead to failure.
This seems like the central problem of this whole approach, and indeed it seems very unlikely to me that we would end up with a system we feel comfortable scaling to superintelligence after only 2-3 iterations on our training protocols. This plan really desperately needs a step that is something like “if the problem appears persistent, or we are seeing signs that the AI systems are modeling our training process in a way that suggests that upon further scaling they would end up looking aligned independently of their underlying alignment, we halt and advocate for much larger shifts in our training process, which likely requires some kind of coordinated pause or stop with other actors”.
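As a toy sketch of the selection effect being described (my own illustration with made-up probabilities, not anything from the post): if each candidate model is either aligned or misaligned, and misaligned models vary in how well they evade any single bumper, then repeating the train/check/retrain loop until something passes amounts to conditioning on “passed every bumper”, which disproportionately selects for the most evasive misaligned models when the bumpers are few or weak.

```python
import random

# Toy Monte Carlo model (made-up numbers): candidate systems are aligned or
# misaligned, and misaligned ones have a random per-bumper "evasion skill".
# Repeating the train / check-bumpers / retrain loop until something passes is
# equivalent to conditioning on "passed every bumper".

def sample_model(p_misaligned=0.5):
    return {
        "misaligned": random.random() < p_misaligned,
        "evasion_skill": random.random(),  # chance of slipping past one bumper
    }

def passes_bumpers(model, n_bumpers):
    if not model["misaligned"]:
        return True  # aligned models trigger no warning signs in this toy setup
    return all(random.random() < model["evasion_skill"] for _ in range(n_bumpers))

def deployed_stats(n_bumpers, trials=50_000):
    """Among models that eventually pass all bumpers, how many are misaligned,
    and how evasive are the misaligned ones?"""
    bad, evasion_of_bad = 0, []
    for _ in range(trials):
        model = sample_model()
        while not passes_bumpers(model, n_bumpers):
            model = sample_model()  # "adjust the training strategy and try again"
        if model["misaligned"]:
            bad += 1
            evasion_of_bad.append(model["evasion_skill"])
    mean_evasion = sum(evasion_of_bad) / len(evasion_of_bad) if evasion_of_bad else 0.0
    return bad / trials, mean_evasion

for n in (1, 3, 10):
    rate, evasion = deployed_stats(n)
    print(f"{n:>2} independent bumpers: {rate:.3f} of deployed models misaligned, "
          f"mean evasion skill of those {evasion:.2f} (population mean 0.5)")
```

Under these made-up numbers, adding independent bumpers both lowers the fraction of misaligned models that make it through and raises the evasion skill of the ones that do, which is the worry in a nutshell; real bumpers are of course correlated and the priors unknown, so this is only directional.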
Nope, I primarily use o3 these days, and have made tweaks to my system prompt, but because it’s a thinking model my use of it is a lot less conversational and so the system prompt here isn’t that helpful, and I haven’t tried to make it work with o3.