Epistemic status: very uncertain, could be very wrong, have been privately thinking in this direction for a while (triggered by FTX and DeepMind/OpenAI/Anthropic), writing down my thoughts for feedback
How much sense does encouraging individual agency actually make, given that humans aren’t safe and x-risk is almost entirely of anthropogenic origin?
Maybe certain kinds of lack of agency are good from a human/AI safety perspective: mild optimization, limited impact, corrigibility, non-power-seeking. Maybe we should encourage adoption of these ideas for humans as well as AIs?
With less individual agency generally, we’d be in a more “long reflection” like situation by default. Humanity as a whole would still need agency to protect itself against x-risk, eventually reach the stars, etc., but from this perspective we need to think about and spread things like interest in reflection/philosophy/discourse, good epistemic/deliberation norms/institutions, and ways to discourage or otherwise protect civilization against individual “rogue agents”, with the aim of collectively, eventually, figuring out what we want to do and how to do it.
It could be that that ship has already sailed (we already have too much individual agency and can’t get to a “long reflection” like world quickly enough by discouraging it), and we just have to “fight fire with fire” now. But 1) that’s not totally clear (what is the actual mechanistic theory behind “fight fire with fire”? how do we make sure we’re not just setting more fires and causing the world to burn down faster or with higher probability?) and 2) we should at least acknowledge the unfortunate situation instead of treating individual agency as an unmitigated good.
My opinion is that the ship has already sailed; AI timelines are too short, and the path to AGI requires too little individual agency, for even a significant decrease in agency worldwide to buy us much time at all, if any.
The mechanistic theory behind “fight fire with fire” is all the usual stories for how we can avoid AGI doom by e.g. doing alignment research, governance/coordination, etc.
A long reflection requires new institutions, and creating new institutions requires individual agency. Right? I have trouble imagining a long reflection actually happening in a world with the individual agency level dialed down.
A separate point that’s perhaps in line with your thinking: I feel better about cultivating agency in people who are intelligent and wise rather than people who are not. When I was working on agency-cultivating projects, we targeted those kinds of people.
A long reflection requires new institutions, and creating new institutions requires individual agency. Right?
Seems like there are already enough people in the world with naturally high individual agency that (if time weren’t an issue) you could build the necessary institutions by convincing them to work in that direction.
I have trouble imagining a long reflection actually happening in a world with the individual agency level dialed down.
Yeah, if the dial-down happened before good institutions were built for the long reflection, that could be bad, but it also seems relatively easy to avoid (don’t start trying to dial down individual agency before building the institutions).
I feel better about cultivating agency in people who are intelligent and wise rather than people who are not.
Yeah that seems better to me too. But this made me think of another consideration:
if individual agency (in general, or the specific kind you’re trying to cultivate) is in fact dangerous, there may also be a selection effect in the opposite direction, where the wise intuitively shy away from being cultivated (without necessarily being able to articulate why).