Based on my own retrospective views of how Lightcone’s office went less-than-optimally, I recently gave some recommendations to someone who may be setting up another alignment research space. (Background: I’ve been working in the Lightcone office since shortly after it opened.) They might be of interest to people mining this post for insights on how to run similar spaces in the future. Here they are, lightly edited:
I recommend selecting for people who want to understand agents, rather than people who want to reduce AI X-risk. When I think about people who bring an attitude of curiosity/exploration to alignment work, the main unifying pattern I notice is that they want to understand agents, as opposed to just avoiding doom.
I recommend selecting for people who are self-improving a lot, and/or want to help others improve a lot. Alex Turner or Nate Soares are good examples of people who score highly on this axis.
For each of the above, I recommend aiming for a clear majority (at least 60-70%) of people in the office to score highly on the relevant metric. That is, aim for at least 60-70% of people to be trying to understand agents, and, separately, at least 60-70% of people to be trying to self-improve a lot and/or help others self-improve a lot.
(The reasoning here is about steering the general vibe, the kinds of conversations people have by default, that sort of thing. Which characteristics to select for obviously depends on what kind of vibe you want.)
(One key load-bearing point about both of the above characteristics is that they’re not synonymous with general competence or status. Regardless of what vibe you’re going for, the characteristics on which you select should be such that there are competent and/or high-status people who you’d say “no” to. Otherwise, you’ll probably slide much harder into status dynamics.)
I recommend maintaining a very low fraction of people who are primarily doing meta-stuff: field-building, strategizing, forecasting, etc.
(This is also about steering vibe and default conversation topics. You don’t want so many meta-people that they reach critical concentration for conversational purposes; and critical concentration tends to be lower for more-accessible topics. This is one of the major places where I think Lightcone went wrong: conversations defaulted to meta-stuff far too much as a result.)
It might be psychologically helpful to have people pay for their office space, even if it’s heavily subsidized by grant money. If you give something away for free, there will always be people lined up to take it, which forces you to gatekeep a lot. Gatekeeping in turn amplifies the sort of unpleasant status dynamics which burned out the Lightcone team pretty hard; that’s what happens when allocation of scarce resources is by status rather than by money. If e.g. the standard guest policy is “sure, you can bring guests/new people, but you need to pay for them” (maybe past some minimum allowance), then there will be a lot fewer people who you need to say “no” to and who feel like shit as a result.
Strong disagree. I think locking in particular paradigms of how to do AI safety research would be quite bad.
That seems right to me, but I interpreted the above as advice for one office, potentially a somewhat smaller one. It seems fine to me to have one hub for people who think more through the lens of agency.
I mostly endorse having one office concentrate on one research agenda and be able to have high-quality conversations about it, and the stated numbers of maybe 10-20% of people working on strategy/meta sound fine in that context. Still, I want to emphasize how crucial those people are: if you have no one to figure out the path between your technical work and overall risk reduction, you’re probably missing better paths and approaches (and maybe not realizing your work is useless).
Overall I’d say we don’t have enough strategy work being done, and I believe it’s warranted to have spaces with 70% of people working on strategy/meta. I don’t think it was bad that the Lightcone office had a lot of strategy work. (We probably also don’t have enough technical alignment work; having more of both is probably good, if we coordinate properly.)