Do you think AI-assisted humanity is in a worse situation than humanity is today? If we are metaphilosophically competent enough to make progress now, why won’t we remain metaphilosophically competent enough once we have powerful AI assistants?
Depends on who “we” is. If the first team that builds an AGI achieves a singleton, then I think the outcome is good if and only if the people on that team are metaphilosophically competent enough, and don’t have that competence corrupted by AIs.
In your hypothetical in particular, why do the people in the future—who have had radically more subjective time to consider this problem than we have, have apparently augmented their intelligence, and have exchanged massive amounts of knowledge with each other—make decisions so much worse than those that you or I would make today?
If the team in the hypothetical is less metaphilosophically competent than we are, or has their metaphilosophical competence corrupted by the AI, then their decisions would turn out worse.
I would say so. Another fairly significant component is my model that humanity makes updates by having enough powerful people pay enough attention to reasonable people, enough other powerful people pay attention to those powerful people, and everyone else roughly copy the beliefs of the powerful people. So: good memes --> reasonable people --> some powerful people --> other powerful people --> everyone else.
AI would make some group of people far more powerful than the rest, which screws up the chain if that group doesn’t pay much attention to reasonable people. In that case, they (and the world) might just never become reasonable. I think this would happen if ISIS took control, for example.
Other than disagreeing, my main complaint is that this doesn’t seem to have much to do with AI. Couldn’t you tell exactly the same story about human civilization proceeding along its normal development trajectory, never building an AI, but gradually uncovering new technologies and becoming smarter?
I would indeed expect this by default, particularly if one group with one ideology attains decisive control over the world. But if we somehow manage to avoid that (which seems unlikely to me, given the nature of technological progress), I feel much more optimistic about metaphilosophy continuing to progress and propagate throughout humanity relatively quickly.
When I talk about alignment I’m definitely talking about a narrower thing than you. In particular, any difficulties that would exist with or without AI *aren’t* part of what I mean by AI alignment.
Do you think AI-assisted humanity is in a worse situation than humanity is today?
Lots of people involved in thinking about AI seem to be in a zero-sum, winner-take-all mode. Macron, for example.
I think there will be significant founder effects from the strategies of the people who create AGI. The development of AGI will be used as an example of what kinds of strategies win during technological development. Deliberation may tell people that there are better equilibria, but empiricism may tell them that those equilibria are too hard to reach.
Currently the positive-sum norm of freely exchanging scientific knowledge is being tested. For good reasons, perhaps? But I worry for the world if a lack of knowledge-sharing gets cemented as the new norm. That would lead to more arms races and make coordination on the important problems harder. So if the creation of AI leads to the destruction of science as we know it, I think we might be in a worse position.
I, perhaps naively, don’t think it has to be that way.
I’m reminded of the lengthy discussion you had with Wei Dai back in the day. I share his picture of which scenarios will get us something close to optimal, his belief that philosophical ignorance might persist indefinitely, his skepticism about the robustness of human reflection, and his skepticism that human values will robustly converge upon reflection.