[Cross-posting my comment from the EA Forum]
This post felt vague and confusing to me. What is meant by a “game board”? Are you referring to the world’s geopolitical situation, the governance structure of the United States, the social dynamics of elites like politicians and researchers, some kind of Ethereum-esque crypto protocol, internal company policies at Google and Microsoft, US AI regulations, or something else?
How do we get a “new board”? No matter what kind of change you want, you will have to get there starting from the situation the world is in right now.
Based on your linked post about consensus mechanisms, let’s say that you want to create some crypto-esque software that makes it easier to implement “futarchy” (prediction-market-based government), and then get everyone to start using that new type of government to make wiser decisions, which will then help them govern the development of AI in a better, wiser way. Well, what would this crypto system look like? How would it avoid the pitfalls that have so far prevented the wide adoption of prediction markets in similar contexts? How would we get everyone to adopt this new protocol for important decisions? Wouldn’t existing governments, companies, etc., be hesitant to relinquish their power?
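For concreteness, here is a deliberately toy sketch (in Python, with all names hypothetical) of the core futarchy decision rule, “vote on values, bet on beliefs”: run one conditional prediction market per policy option, each forecasting an agreed-upon welfare metric if that option is adopted, then adopt the option whose market forecasts the highest welfare. A real system would also need a proper market maker (e.g. Hanson’s LMSR), settlement against the measured metric, and refunds for bets in markets whose condition never obtains; none of that is shown here.

```python
from dataclasses import dataclass, field

@dataclass
class ConditionalMarket:
    """Market forecasting a welfare metric, conditional on one policy being adopted."""
    policy: str
    bets: list = field(default_factory=list)  # (trader, forecast, stake) tuples

    def place_bet(self, trader: str, forecast: float, stake: float) -> None:
        self.bets.append((trader, forecast, stake))

    def price(self) -> float:
        """Stake-weighted consensus forecast of the welfare metric."""
        total_stake = sum(stake for _, _, stake in self.bets)
        if total_stake == 0:
            return 0.0
        return sum(forecast * stake for _, forecast, stake in self.bets) / total_stake

def decide(markets: list[ConditionalMarket]) -> str:
    """Futarchy rule: adopt the policy whose conditional market forecasts highest welfare."""
    return max(markets, key=lambda m: m.price()).policy

# Hypothetical usage: two competing AI-governance policies.
a = ConditionalMarket("fund interpretability research")
b = ConditionalMarket("fund capability evaluations")
a.place_bet("alice", forecast=0.62, stake=100.0)
b.place_bet("bob", forecast=0.55, stake=150.0)
print(decide([a, b]))  # -> "fund interpretability research"
```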
Since there has to be some realistic path from “here” to “there”, it seems foolish to totally write off all existing AI companies and political actors (like the United States, United Nations, etc). It might well be worth creating a totally new system (like a futarchy governance platform, or whatever you are envisioning) from scratch, rather than trying to influence existing systems that can be hard to change. But for your idea to work, at some point the new system will have to influence, absorb, or get adopted by the existing big players. I think you should try to think about how, concretely, that might happen. (Maybe when the USA sees how amazing the new system is, the states will request a constitutional convention and switch to the new system? Maybe some other countries will adopt the system first, and this will help demonstrate its merits to the USA? Or maybe prediction markets will start out getting legalized in the USA for commercial purposes, then slowly take on more and more governance functions? Or maybe the plan doesn’t rely on getting the new system enmeshed in existing governance structures; rather, people will just start using it on their own, and eventually it will overtake existing governments, like how some Bitcoiners dream of the day when Bitcoin becomes the new reserve currency simply via lots of individual people switching?)
Some other thoughts on your post above:
In your linked post about “Consensus Mechanisms”, one of the things you suggest is that we should have a prediction market to help evaluate which approaches to alignment are most likely to succeed. But wouldn’t the market be distorted by the fact that if everyone ends up dead, there is nobody left alive to collect their prediction-market winnings? (And thus no incentive to bet against sudden, unexpected failures of promising-seeming alignment strategies?) For more thinking about how one might set up a market to anticipate or mitigate X-risks, see “Chaining Retroactive Funders to Borrow Against Unlikely Utopias” and “X-Risk, Anthropics, & Peter Thiel’s Investment Thesis”.
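To make that distortion concrete, here is a toy expected-value calculation (all numbers hypothetical). A trader who genuinely assigns 10% to a catastrophic failure still has no incentive to bet on it, because winnings in the worlds where that bet pays out can never be collected or spent:

```python
# Hypothetical bet: "alignment strategy X fails catastrophically".
p_doom = 0.10           # trader's true credence in catastrophic failure
payout_multiple = 10.0  # market payout per unit staked if the doom side wins
stake = 1.0

# Naive expected value, ignoring whether anyone survives to collect:
ev_naive = p_doom * payout_multiple * stake - (1 - p_doom) * stake
# = 0.10 * 10.0 - 0.90 = +0.10, so the bet looks worth making.

# Expected *collectable* value: in doom worlds the winnings are worthless.
ev_collectable = p_doom * 0.0 - (1 - p_doom) * stake
# = -0.90, so nobody takes the doom side and the market underprices the risk.

print(ev_naive, ev_collectable)
```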
It seems to me that having a prediction market for different alignment approaches would be helpful, but would be VERY far from actually having a good plan to solve alignment. Consider the stock market: it does a wonderful job identifying valuable companies, much better than Soviet-style central planning systems have done historically. But stock markets are still frequently wrong and frequently have to change their minds (otherwise they wouldn’t swing around so much, and there would never be market “bubbles”). And even if stock markets were perfect, they only answer a small number of big questions, like “did this company’s recent announcement make it more or less valuable”; you can’t run a successful company on the existence of the stock market alone, you also need employees, managers, executives, etc., doing work and making decisions in the normal way.
I feel like we share many of the same sentiments—the idea that we could improve the general level of societal / governmental decision-making using innovative ideas like better forms of voting, quadratic voting & funding, prediction markets, etc. Personally, I tried to sketch out one optimistic scenario (where these ideas get refined and widely adopted across the world, and then humanity is better able to deal with the alignment problem because we have better decision-making capability) with my entry in the Future of Life Institute’s “AI Worldbuilding” challenge. It imagines an admittedly utopian future history (from 2023 to 2045) that tries to show the following (a toy sketch of the quadratic voting and funding math follows the list):
How we might make big improvements to decision-making via mechanisms like futarchy and liquid democracy, enhanced by Elicit-like research/analysis tools.
How changes could spread to many countries via competition to achieve faster growth than rivals, and via snowball effects of reform.
How the resulting, more “adequate” civilization could recognize the threat posed by the alignment problem and coordinate to solve it.
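As promised above, here is a toy sketch of the standard math behind two of those reform ideas (numbers hypothetical): under quadratic voting, casting n votes on one issue costs n² voice credits, and under quadratic funding, a project’s subsidized total is (sum of the square roots of the individual contributions)², which rewards broad support over a few large donors.

```python
import math

def qv_cost(votes: int) -> int:
    """Quadratic voting: casting n votes on a single issue costs n^2 credits."""
    return votes ** 2

def qf_match(contributions: list[float]) -> float:
    """Quadratic funding: subsidize a project up to (sum of sqrt(c_i))^2."""
    return sum(math.sqrt(c) for c in contributions) ** 2

print(qv_cost(5))             # 25 credits for 5 votes
print(qf_match([100.0]))      # 100.0   -- one big donor attracts no extra match
print(qf_match([1.0] * 100))  # 10000.0 -- broad support is matched 100x
```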
I think the challenge is to try to get more and more concrete, in multiple different ways. My worldbuilding story, for instance, is still incredibly hand-wavy about how these various innovative ideas would be implemented IRL (i.e., I don’t explain in detail why prediction markets, land value taxes, improved voting systems, and other reform ideas should suddenly become very popular after decades of failing to catch on), about what exact pathways lead to the adoption of these new systems among major governments, and about how exactly the improved laws and projects around AI alignment would be implemented.
The goal of this kind of thinking, IMO, should be to eventually come up with some ideas that, even if they don’t solve the entire problem, are at least “shovel-ready” for implementation in the real world. Like, “Okay, maybe we could pass a ballot measure in California creating a state-run oversight commission that coordinates with the heads of top AI companies and includes the use of [some kind of innovative democratic inputs as gestured at here by OpenAI] to inform the value systems of all new large AI systems trained by California companies. Most AI companies are already in California, so if it’s successful, this would hopefully set a standard that could eventually expand to a national or global level...”
[Cross-posting my reply]
Thank you for taking the time to read and critique this idea. I think this is very important, and I appreciate your thoughtful response.
Regarding how to get current systems to implement or agree to it: I don’t think that will be relevant long-term. I don’t think the mechanisms current institutions use for control can keep up with AI proliferation. I imagine most existing institutions will still exist, but won’t have the capacity to do much once AI really takes off. My guess is that if AI kills us, it will happen after a slow-motion coup: not any intentional coup by AIs, but humans effectively couping themselves, because AIs will simply be more useful. My idea wouldn’t remove or replace any institutions, or directly interfere with what they are doing; they just wouldn’t be especially relevant to it. Some governments might try to actively ban its use, but such bans would probably be fleeting if the network actually were superior in collective intelligence to any individual AI: if it made their work more economically productive, they would want to use it. Think of it this way: recommendation algorithms on social media have an enormous influence on society, institutions, etc. Some try to ban or control them, but most people can still access them if they want to, and no entity really controls them. Yet no one incorporates the “will of Twitter” into their constitution.
The game board isn’t any of the things you mention; I don’t think any of them have the capacity to do much to change the board. The current board is fundamentally adversarial: interacting with it increases the power of other players. We’ve seen this with OpenAI, Anthropic, etc. The new board would be cooperative, at least at a higher level. How do we make the new board more useful than the current one? My best guess is the economic advantage of decentralized compute. We’ve seen how fast the open-source community has been able to make progress, and we’ve seen how a huge amount of compute gets used on things like mining Bitcoin, even though that compute is wasted on solving arbitrary math puzzles. Contributing decentralized compute to a collective network could have actual economic value, and I imagine this will happen one way or another; my concern is that it’ll end up being for the worse if people aren’t actively trying to create a better system. A decentralized network with no safeguards would probably be much worse than anything a major AI company could create.
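For reference, this is the kind of “wasted” computation Bitcoin-style proof-of-work spends compute on. A minimal illustrative sketch (not Bitcoin’s actual consensus code): brute-force a nonce whose hash meets an arbitrary difficulty target, work that proves compute was burned but produces nothing useful in itself.

```python
import hashlib

def proof_of_work(block_data: bytes, difficulty: int) -> int:
    """Find a nonce so that sha256(block_data + nonce) starts with
    `difficulty` zero hex digits. The answer has no use beyond proving
    that the compute was spent."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

print(proof_of_work(b"example block", difficulty=4))  # ~65,000 hashes on average
```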
“But wouldn’t the market be distorted by the fact that if everyone ends up dead, there is nobody left alive to collect their prediction-market winnings?”
This seems to be going back to the “one critical shot” approach, which I think is a terrible idea that won’t work in the real world under any circumstances. This would be a progression over time, not a case where an AI goes supernova overnight. It might require slower takeoffs, or at least no foom scenarios; making a new board that isn’t adversarial might itself mitigate the potential for foom. What I proposed was my first naive approach, and I’ve since come to think that maybe it’s the collective intelligence of the system that should be increasing, not a singleton AI being trained at the center. Most members of that collective intelligence would initially be humans, and AIs would gradually become a larger and more powerful part of the system. I’m not sure here, though. Maybe there’s some third option where there’s a foundation model at the lowest layer of the network, but it isn’t a singular AI in the normal sense. I imagine a singular AI at the center could give rise to agency and probably break the whole thing.
“It seems to me that having a prediction market for different alignment approaches would be helpful, but would be VERY far from actually having a good plan to solve alignment.”
I agree here. They’d probably only be good at predicting the next iteration of progress, not at identifying a fully scalable solution.
“I feel like we share many of the same sentiments—the idea that we could improve the general level of societal / governmental decision-making using innovative ideas like better forms of voting, quadratic voting & funding, prediction markets, etc”
This would be great, but my guess is these would progress too slowly to be useful. I don’t think mechanism design that has to work through currently existing institutions will happen quickly enough. Technically enforced design might.
I love the idea of shovel-ready strategies, and I think we need to be prepared in the event of a crisis. My issue is that even most good strategies seem to deal only with large companies, and don’t address the likelihood that such power will fall into the hands of more and more actors.