I suggest a fourth default question for these reading groups:
How did this post age?
Soon the two are lost in a maze of words defined in other words, the problem that Stevan Harnad once described as trying to learn Chinese from a Chinese/Chinese dictionary.
Of course, it turned out that LLMs do this just fine, thank you.
intensional terms
Should probably link to Extensions and Intensions; not everyone reads these posts in order.
Mati described himself as a TPM since September 2023 (after being PM support since April 2022), and Andrei described himself as a Research Engineer from April 2023 to March 2024. Why do you believe either was not a FTE at the time?
And while failure to sign isn’t proof of lack of desire to sign, the two are heavily correlated—otherwise it would be incredibly unlikely for the small Superalignment team to have so many members who signed late or not at all.
With the sudden simultaneous exits of Mira Murati, Barret Zoph, and Bob McGrew, I thought I’d update my tally of the departures from OpenAI, collated with how quickly the ex-employee had signed the loyalty letter to Sam Altman last November.
The letter was leaked at 505 signatures, 667 signatures, and finally 702 signatures; in the end, it was reported that 737 of 770 employees signed. Since then, I’ve been able to verify 56 departures of people who were full-time employees (as far as I can tell, contractors were not allowed to sign, but all FTEs were).
I still think I’m missing some, so these are lower bounds (modulo any mistakes I’ve made).
Headline numbers:
Attrition for the 505 OpenAI employees who signed before the letter was first leaked: at least 24⁄505 = 4.8%
Attrition for the next 197 to sign (it was leaked again at 667 signatures, and one last time at 702): at least 13⁄197 = 6.6%
Attrition for the (reported) 68 who had not signed by the last leak: at least 19⁄68 = 27.9%.
Reportedly, 737 out of the 770 signed in the end, and many of the Superalignment team chose not to sign at all.
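For anyone who wants to check the arithmetic, here's a minimal sketch that recomputes the percentages above from the leak counts and the verified-departure tallies in this post (nothing here beyond the numbers already quoted):

```python
# Recompute the attrition percentages from the figures quoted above.
# Cohort sizes: 505 early signers, 702 - 505 = 197 later signers,
# and 770 - 702 = 68 who had not signed by the last leak.
# Departure counts (24, 13, 19) are the verified departures tallied in this post.
cohorts = {
    "signed by 505":        (24, 505),
    "signed 506-702":       (13, 702 - 505),
    "not signed as of 702": (19, 770 - 702),
}

for label, (departed, size) in cohorts.items():
    print(f"{label}: {departed}/{size} = {departed / size:.1%}")

# Should total the 56 verified departures mentioned above.
print("total verified departures:", sum(d for d, _ in cohorts.values()))
```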
Below are my current tallies of some notable subsets. Please comment with any corrections!
People from the Superalignment team who never signed as of the 702 leak (including some policy/governance people who seem to have been closely connected) and are now gone:
Carroll Wainwright
Collin Burns
Cullen O’Keefe
Daniel Kokotajlo
Jan Leike (though he did separately Tweet that the board should resign)
Jeffrey Wu
Jonathan Uesato
Leopold Aschenbrenner
Mati Roy
William Saunders
Yuri Burda
People from the Superalignment team (and close collaborators) who did sign before the final leak but are now gone:
Jan Hendrik Kirchner (signed between 668 and 702)
Steven Bills (signed between 668 and 702)
John Schulman (signed between 506 and 667)
Sherry Lachman (signed between 506 and 667)
Ilya Sutskever (signed by 505)
Pavel Izmailov (signed by 505)
Ryan Lowe (signed by 505)
Todor Markov (signed by 505)
Others who didn’t sign as of the 702 leak (some of whom may have just been AFK for the wrong weekend, though I doubt that was true of Karpathy) and are now gone:
Andrei Alexandru (Research Engineer)
Andrej Karpathy (Co-Founder)
Austin Wiseman (Finance/Accounting)
Girish Sastry (Policy)
Jay Joshi (Recruiting)
Katarina Slama (Member of Technical Staff)
Lucas Negritto (Member of Technical Staff, then Developer Community Ambassador)
Zarina Stanik (Marketing)
Notable other ex-employees:
Barret Zoph (VP of Research, Post-Training; signed by 505)
Bob McGrew (Chief Research Officer; signed by 505)
Chris Clark (Head of Nonprofit and Strategic Initiatives; signed by 505)
Diane Yoon (VP of People; signed by 505)
Gretchen Krueger (Policy; signed by 505; posted a significant Twitter thread at the time she left)
Mira Murati (CTO; signed by 505)
EDIT: On reflection, I made this a full Shortform post.
CDT agents respond well to threats
Might want to rephrase this as “CDT agents give in to threats”
This is weirdly meta.
If families are worried about the cost of groceries, they should welcome this price discrimination. The AI will realize you are worried about costs. It will offer you prime discounts to win your business. It will know you are willing to switch brands to get discounts, and use this to balance inventory.
Then it will go out and charge other people more, because they can afford to pay. Indeed, this is highly progressive policy. The wealthier you are, the more you will pay for groceries. What’s not to love?
A problem is that this is not only a tax on indifference, but also a tax on innumeracy and on lack of leisure time. Those who don’t know how to properly comparison shop are likely to be less wealthy, not more; same with those who don’t have the spare time to go to more than one store.
Re: experience machine, Past Me would have refused it and Present Me would take it. The difference is due to a major (and seemingly irreversible) deterioration in my wellbeing several years ago, but not only because that makes the real world less enjoyable.
Agency is another big reason to refuse the experience machine; if I think I can make a difference in the base-level world, I feel a moral responsibility towards it. But I experience significantly less agency now (and project less agency in the future), so that factor is diminished for me.
The main factor that’s still operative is epistemics: I would much rather my beliefs be accurate than be deceived about the world. But it’s hard for that to outweigh the unhappiness at this point.
So if a lot of people would choose the Experience Machine, that suggests they are some combination of unhappy, not confident in their agency, and not obsessed with their epistemics. (Which does, I think, operationalize your “something is very wrong”.)
Thanks—I didn’t recall the content of Yglesias’ tweet, and I’d noped out of sorting through his long feed. I suspect Yglesias didn’t understand why the numbers were weird, though, and people who read his tweet were even less likely to get it. And most significantly, he tries to draw a conclusion from a spurious fact!
Allowing explicitly conditional markets with a different fee structure (ideally, all fees refunded on the counterfactual markets) could be an interesting public service on Manifold’s part.
The only part of my tone that worries me in retrospect is that I should have done more to indicate that you personally were trying to do a good thing, and I’m criticizing the deference to conditional markets rather than criticizing your actions. I’ll see if I can edit the post to improve on that axis.
I think we still differ on that. Even though the numbers for the main contenders were just a few points apart, there was massive jockeying to put certain candidates at the top end of that range, because relative position is what viewers noticed.
I’m really impressed with your grace in writing this comment (as well as the one you wrote on the market itself), and it makes me feel better about Manifold’s public epistemics.
Yes, and I gained some easy mana from such markets; but the market that got the most attention by far was the intrinsically flawed conditional market.
Real-money markets do have stronger incentives for sharps to scour for arbitrage, so the 1/1/26 market would have been more likely to be noticed before months had gone by.
However (depending on the fee structure for resolving N/A markets), real-money markets have even stronger incentives for sharps to stay away entirely from spurious conditional markets, since they’d be throwing away cash and not just Internet points. Never ever ever cite out-of-the-money conditional markets.
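To put a number on that incentive, here's a minimal sketch with made-up figures (the 90% N/A probability, the flat fee, and the simple fee model are all illustrative assumptions, not Manifold's or any platform's actual fee schedule). Even a trader with a real edge loses in expectation when the market usually resolves N/A and fees aren't refunded, and only comes out ahead once they are.

```python
# Illustrative expected value of trading a conditional market that may resolve N/A.
# Every number and the fee model itself are assumptions for this sketch.
def expected_profit(edge, p_na, stake, fee, fee_refunded_on_na):
    """edge: expected profit as a fraction of stake *if* the market resolves;
    p_na: probability the condition never triggers and the market resolves N/A;
    fee: flat fee to trade, possibly refunded on an N/A resolution."""
    value_if_resolved = (1 - p_na) * (stake * edge)
    fee_if_resolved = (1 - p_na) * fee
    fee_if_na = 0.0 if fee_refunded_on_na else p_na * fee
    return value_if_resolved - fee_if_resolved - fee_if_na

# A sharp with a genuine 10% edge, in a market that resolves N/A 90% of the time:
print(expected_profit(0.10, 0.90, stake=100, fee=5, fee_refunded_on_na=False))  # -4.0
print(expected_profit(0.10, 0.90, stake=100, fee=5, fee_refunded_on_na=True))   #  0.5
```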
Broke: Prediction markets are an aggregation of opinions, weighted towards informed opinions by smart people, and are therefore a trustworthy forecasting tool on any question.
Woke: Prediction markets are MMOs set in a fantasy world where, if someone is Wrong On The Internet, you can take their lunch money.
Can you share any strong evidence that you’re an unusually trustworthy person in regard to confidential conversations? People would in fact be risking a lot by talking to you.
(This is sincere btw; I think this service should absolutely exist, but the best version of it is probably done by someone with a longstanding public reputation of circumspection.)
Good question! I picked it up from a friend at a LW meetup a decade ago, so it didn’t come with all the extra baggage that vipassana meditation seems to usually carry. So this is just going to be the echo of it that works for me.
Step 1 is to stare at your index finger (a very sensitive part of your body) and gently, patiently try to notice that it’s still producing a background level of sensory stimulus even when it’s not touching anything. That attention to the background signal, focused on a small patch of your body, is what the body scan is based on.
Step 2 is learning how to “move” that awareness of the background signal slowly. Try to smoothly shift that awareness down your finger, knuckle by knuckle, keeping the area of awareness small by ceasing to focus on the original spot as you focus on a new spot. Then try moving that spot of awareness gradually to the base of your thumb, and noticing the muscle beneath the skin.
Use Case α is harnessing that kind of awareness to relax physical tension and even pain. The next time you have a paper cut or a small burn, once you’ve dealt with it in the obvious objective ways and now just have to handle the pain, focus your awareness right on that spot. The sensation will still be loud, but it won’t be overwhelming when you’re focusing on it rather than fleeing from it. Or the next time you notice a particularly tense muscle, focus your awareness there; for me, that usually loosens it at least a little.
Step 3 is the body scan itself: creating awareness for each part of your skin and muscles, gradually, bit by bit, starting from the crown of your head and slowly tracing out a path that covers everything. This is where a guided meditation could really help. I don’t have one to recommend (after having the guided meditation at the meetup, I got as much of the idea as I needed), but hopefully some of the hundreds out there are as good as Random Meditating Rationalist #37 was.
And Use Case β, when you have a migraine, is to imagine moving that awareness inside your skull, to the place where the migraine pain feels like it’s concentrated. (I recommend starting from a place where the migraine seems to “surface”—for me, the upper orbit of my left eye—if you have such a spot.)
There’s something quite odd about how this works: your brain doesn’t have pain receptors, so the pain from the migraine ends up in some phantom location on your body map, and it’s (conveniently?) interpreted as being inside your head. By tracing your awareness inside your skull, you walk along that body map to the same phantom location as that pain, so it works out basically the same as if you were in Use Case α.
Hope this helps!
I have to further compliment my past self: this section aged extremely well, prefiguring the Shoggoth-with-a-smiley-face analogies several years in advance.
GPT-3 is trained simply to predict continuations of text. So what would it actually optimize for, if it had a pretty good model of the world including itself and the ability to make plans in that world?
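(For concreteness, "predict continuations of text" is the standard autoregressive objective: maximize the log-probability of each token given the tokens before it. The formula below is the generic textbook form, not anything specific to this post.)

$$\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \ldots, x_{t-1}\right)$$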
One might hope that because it's learning to imitate humans in an unsupervised way, it would end up fairly human, or at least act that way. I very much doubt this, for the following reason:
Two humans are fairly similar to each other, because they have very similar architectures and are learning to succeed in the same environment.
Two convergently evolved species will be similar in some ways but not others, because they have different architectures but the same environmental pressures.
A mimic species will be similar in some ways but not others to the species it mimics, because even if they share recent ancestry, the environmental pressures on the species being mimicked (say, the poisonous one) are different from the environmental pressures on the mimic.
What we have with the GPTs is the first deep learning architecture we’ve found that scales this well in the domain (so, probably not that much like our particular architecture), learning to mimic humans rather than growing in an environment with similar pressures. Why should we expect it to be anything but very alien under the hood, or to continue acting human once its actions take us outside of the training distribution?
Moreover, there may be much more going on under the hood than we realize; it may take much more general cognitive power to learn and imitate the patterns of humans than it takes us to execute those patterns.
The fault does not lie with Jacob, but wow, this post aged like an open bag of bread.