Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com. I have signed no contracts or agreements whose existence I cannot mention.
habryka (Oliver Habryka)
Promoted to curated: I’ve really appreciated a lot of the sequence you’ve been writing about various epistemic issues around the EA (and to some degree the rationality) community. This post feels like an appropriate capstone to that work and I quite like it as a positive pointer to a culture that I wish had more adherents.
One reason I am interested in liability is that it opens up a way to do legal investigations. The legal system grants a huge number of privileges that you get to use if you have reasonable suspicion that someone has committed a crime or been negligent. I think it's quite likely that without direct liability, even if Microsoft or OpenAI caused some huge catastrophe, we would never get a proper postmortem or analysis of the facts, and would never reach high confidence about the actual root causes.
So while I agree that OpenAI and Microsoft of course already want to avoid being seen as responsible for a large catastrophe, legal liability makes it much more likely there will be an actual investigation, where e.g. the legal system gets to confiscate servers and messages to analyze what happened, which in turn makes it more likely that if OpenAI and Microsoft are responsible, they will be found out to be responsible.
Not sure what you mean by “underrated”. The fact that they have $300MM from Vitalik but haven’t really done much anyway was a downgrade in my book.
I am not that confident about this. Or like, I don’t know, I do notice my psychological relationship to “all the stars explode” and “earth explodes” is very different, and I am not good enough at morality to be confident about dismissing that difference.
I disagree; I think it matters a good amount, if the risk scenario is indeed “humans will probably get a solar system or two because it’s cheap from the perspective of the AI”. I also think there is a risk of the AI torturing the uploads it has, and I agree that if that were the reason why humans are still alive then I would feel comfortable bracketing it, but I think Ryan is arguing for something more like “humans will get a solar system or two and basically get to have decent lives”.
(I missed “this was in the works for a while” on my first read of your comment.)
No, I just gaslit you. I edited it as a clarification when I saw your reaction. Sorry about that, I should have left a note that I edited it.
The timing makes me think it didn’t happen on schedule and they are announcing this now in response to this post, to save face and pre-empt bad PR (though I am only like 75% confident that something like that is going on, and my guess is the appointment itself has been in the works for a while). It seems IMO like a bad sign to do that without being clear about the timing and the degree to which a past commitment was violated.
(Also importantly, they said they would appoint a fifth board member, but instead it seems like this board member replaced Luke, so they actually stayed at four.)
FWIW I still stand behind the arguments that I made in that old thread with Paul. I do think the game-theoretical considerations for AI maybe allowing some humans to survive are stronger, but they also feel loopy and like they depend on how good of a job we do on alignment, so I usually like to bracket them in conversations like this (though I agree it’s relevant for the prediction of whether AI will kill literally everyone).
I also added one to my profile!
Welcome! I hope you have a good time here!
Maybe I am being dumb, but why not do things on the basis of “actual FLOPs” instead of “effective FLOPs”? Seems like there is a relatively simple fact-of-the-matter about how many actual FLOPs were performed in the training of a model, and that seems like a reasonable basis for regulation and evals.
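To illustrate the “relatively simple fact-of-the-matter” point, here is a rough back-of-the-envelope sketch (not from the original comment, and with purely hypothetical numbers) of how actual training FLOPs could be tallied from quantities a lab already tracks:

```python
# Hedged sketch: estimate actual FLOPs executed during a training run,
# assuming you know accelerator count, training duration, peak per-device
# throughput, and realized utilization. All numbers below are hypothetical.

def actual_training_flops(num_gpus: int,
                          training_days: float,
                          peak_flops_per_gpu: float,
                          utilization: float) -> float:
    """Total floating-point operations actually executed during training."""
    seconds = training_days * 24 * 3600
    return num_gpus * seconds * peak_flops_per_gpu * utilization

# Hypothetical run: 10,000 GPUs, 90 days, 1e15 peak FLOP/s each, 40% utilization.
print(f"{actual_training_flops(10_000, 90, 1e15, 0.4):.2e} FLOPs")
```

The point of the sketch is just that every term is a measurable quantity, whereas “effective FLOPs” additionally requires a judgment call about algorithmic efficiency multipliers.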
But 5 years ago I got sick with a viral illness and never recovered. For the last couple of years I’ve been spending most of my now-limited brainpower trying to figure out how I can get better.
Man, I didn’t know about this and am really sorry to hear about that. I have had two partners who have also been dealing with kind of messy chronic-illness-adjacent things and so have a lot of sympathy for this (and went through a few months of probably-mold-poisoning last year which shared a lot of structure with the cases of mysterious chronic illness that I’ve seen).
Promoted to curated: I thought this sequence/report was a really valuable read, and I have brought it up in a bunch of circumstances since it came out.
Detailed and public historical case studies seem very undersupplied when thinking about the strategic considerations around AI alignment and AI safety (as well as the general study of memetics and social coordination).
A top-level post feels a bit weird for this. I would create a shortform post or a comment in the Open Thread for this.
Ah, sure. I didn’t mean to imply a large positive update here, and mostly intended to say “are you saying it’s not to any kind of substantial discredit here?”.
Sure, but Austin answered the fully general question of “how [have you] updated regarding Sam’s trustworthiness over the past few days[?]” with “I haven’t updated majorly negatively”, in a generic tone.
When I say “the evidence is about the filtering” I am saying that the obvious update seems to be about the filtering itself, not about what the filtering was hiding.
I agree that one can keep a separate ledger, but to not make a major negative update on Sam in the aggregate based on the information that was released requires IMO either that one already knew about such things and had already integrated that information (which would presumably result in a low opinion of Sam’s conduct), or a distinct lack of moral compass (or, third, a positive update that happened to mostly cancel out the negative update, though I think it would be confusing to communicate that via saying “I [updated] only small amounts”).
Sure, but the evidence is about the filtering that has occurred and how the filtering was conducted, not about what the filters were hiding. Threatening someone with violence to not insult you is bad independently of whether they had anything to insult you about.
Lots of people on the LW team have gotten pretty good at prompting, editing, and art generation. My guess is I have had the biggest effect on our art choices, followed by Raemon, followed by Ben, followed by kave (who wrote our art pipeline for the new Best of LessWrong page).
Wait, to be clear, are you saying that you think it would be to Sam’s credit to learn that he forced employees to sign NDAs by straightforwardly lying to them about their legal obligations, using extremely adversarial time pressure tactics and making very intense but vague threats?
This behavior seems really obviously indefensible.
I don’t have a strong take on the ScarJo thing. I don’t really see how it would be to his credit, my guess is he straightforwardly lied about his intention to make the voice sound like ScarJo, but that’s of course very hard to verify, and it wouldn’t be a big deal either way IMO.
I don’t think any of this is moot, since the thing that is IMO most concerning is people signing these contracts, then going into policy or leadership positions and not disclosing that they signed those contracts. Those things happened in the past and are real breaches of trust.