Satisf-AI: A Route to Reducing Risks From AI
(Ok, so I admit the word “Satisf-AI” is weird. I intended it as a play on “Satisfy” and “AI”, pronounced like “Satisfy”.)
I think Holden Karnofsky’s PASTA is a good way to frame the discussion of transformative AI. It resonates with something I have been turning over for a while now. The AI safety community typically focuses on the question “how do we build AI that is safe?” But I think a better question is “how do we get what we want from AI in a safe way?”
In other words, rather than focus on building a superintelligence which is safe, can we find a way to get the benefits of a superintelligence without actually building one?
To make this more concrete, let’s consider the major benefits we expect from AI:
Innovation: We want AIs to generate new inventions and insights that we can use.
Complete Tasks: We want AIs to perform, at scale, jobs that are too menial, dangerous, or complex for people to do.
Make Decisions: We want AIs to advise us on our decisions and to make decisions on our behalf.
Right now there is a strong push to create AI that can meet all of these goals. People worry that the competitive pressures associated with this effort might create a harmful race-to-the-bottom on AI safety.
But what if we could remove the draw of superintelligence? What if we could find a way to get everything on that list without having to build one?
This would significantly reduce the competitive pressures that might lead to unsafe AI, and it would give AI safety researchers a crucial resource: time.
But how do we get the benefits of AI, without AI?
I think many of the hopes we have for AI are achievable with much safer approaches. To be clear, these approaches might still involve modern-day machine learning, but they would not constitute a general intelligence. These more narrowly focused artificial intelligence systems could satisfy our desire for a superintelligence, hence the name “Satisf-AI”.
This “narrow AI” idea isn’t new. Drexler outlines something similar in his CAIS model, while Gwern warns against the false sense of security that these Tool AIs might bring. I will return to this concern in a later section.
Satisfactory AI in Each Domain
Let’s consider what it would take to satisfy our needs in each domain without resorting to superintelligence.
Innovation: One could imagine automating much of science, performing experiments on a massive scale. Simple, safe algorithms could be designed to decide which subsequent experiments would be most valuable, and human researchers could be augmented with better tools for thought to oversee and direct this search.
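As a rough illustration of what such a “simple, safe” selection rule could look like, here is a minimal sketch of greedy uncertainty sampling: rerun whichever experiment we currently understand least. The candidate conditions and their recorded outcomes are invented for illustration; this is one possible simple rule, not a claim about how automated science would actually be run.

```python
import statistics

# Hypothetical outcomes recorded so far for each candidate experimental condition.
results = {
    "catalyst_A": [0.42, 0.45, 0.40],   # consistent: we already understand it fairly well
    "catalyst_B": [0.10, 0.80, 0.55],   # noisy: we understand it least
    "catalyst_C": [0.61, 0.60],
}

def pick_next_experiment(results):
    # Greedy uncertainty sampling: choose the condition whose outcomes vary the most.
    # The rule is fully transparent; a human can predict exactly what it will pick.
    return max(results, key=lambda cond: statistics.pvariance(results[cond]))

print(pick_next_experiment(results))  # -> "catalyst_B"
```

A real system would wrap something like this in a loop with actual experiments and better uncertainty estimates, but the point is that the selection logic can stay small enough for human researchers to audit and direct.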
Complete Tasks: Rather than automate the specific tasks that humans currently do, we could redesign many jobs to be more amenable to automation. The restaurant Cala has had success with this approach. Robots are already quite sophisticated, and simple algorithms designed for simple tasks are much more likely to be safe than algorithms designed to learn arbitrary tasks. In addition, policies that reduce the cost of human labor and technologies that increase people’s productivity can reduce the pressure to automate tasks using sophisticated, potentially dangerous agents. In this sense, simple automation and advanced AI are substitutes: the more we can automate with simple systems, the less demand there is for general agents.
Make Decisions: This is perhaps the most difficult domain to advance without using dangerous AI. However, tools like decision markets can displace the use of AI for sufficiently “large” decisions. Adapting an “internet points” approach similar to Metaculus to decision markets might be a good way to apply automated decision-making to smaller, more personal decisions. We may also find ways to design transparent heuristics that achieve satisfactory performance. Many companies already employ recommendation algorithms, and there is research focused on aligning these systems. Fortunately, automating many tasks means that people can spend more time auditing AI-based decision-making. However, there are still domains where competitive pressures may lead to the creation of highly sophisticated, potentially dangerous agents. This is the focus of the next section.
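To make the “transparent heuristic” idea concrete, here is a minimal sketch of a hand-weighted scoring rule for recommendations. The features and weights are invented for illustration and are not drawn from any real recommender; the point is that every factor in the decision is visible and auditable.

```python
# A deliberately simple, auditable scoring rule: every weight is visible,
# so a reviewer can see exactly why one item outranks another.
WEIGHTS = {
    "matches_stated_interest": 3.0,   # the user explicitly asked for this topic
    "from_followed_source": 2.0,
    "predicted_watch_time": 1.0,      # raw engagement gets the smallest say
}

def score(item):
    # Missing features simply contribute zero.
    return sum(weight * item.get(feature, 0.0) for feature, weight in WEIGHTS.items())

candidates = [
    {"title": "Documentary", "matches_stated_interest": 1, "predicted_watch_time": 0.4},
    {"title": "Clickbait", "predicted_watch_time": 0.9},
]

for item in sorted(candidates, key=score, reverse=True):
    print(item["title"], score(item))   # Documentary (3.4) ranks above Clickbait (0.9)
```

A rule like this will underperform a large learned model on raw engagement, but the SatisfAI question is whether its performance is satisfactory given how much easier it is to inspect.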
Won’t SatisfAI Still Be Dangerous?
For the innovation and task domains, it seems plausible that we can achieve high performance using simple, safe approaches. Work on automating these domains safely can improve AI safety and reduce the pressure to create a superintelligence.
But this doesn’t eliminate the risk. In particular, there doesn’t seem to be a good way to have AI make good decisions for us without leveraging powerful artificial agents. Safer tools like heuristics and decision markets can reduce, but not eliminate, the need for sophisticated decision-making.
In Why Tool AIs Want to Be Agent AIs, Gwern makes the case that any intelligence we design as a “tool” for human users will inevitably become an “agent” acting with little supervision. Essentially, economic incentives exist to remove people from the loop and let AIs act on their own. This means that regardless of our intent to use AI as a tool, we still have to solve the hard problem of alignment.
This bears directly on the possibility of building satisfactory AI: if tools inevitably become agents, then we cannot feel safe just because the systems we create are simple and intended for narrow uses.
But I think this framing is off. We don’t get to decide whether or not AI gets created; our choice is over which AIs we create. I don’t see a good way to prevent the creation of AI entirely without much worse side effects.
But it is possible to create agents that are much safer simply by limiting their complexity. For example, there are plenty of animals, insects, and computer programs possessing something similar to agency, yet none of them currently poses a threat to society. This suggests that there is some level of intelligence which is safe. In fact, an AI with precisely human-level intelligence (and no way to copy, coordinate, or modify itself) would arguably be safe as well.
How can we encourage the creation of simple, narrow AI? Taboos against creating sophisticated agents, combined with institutional policies that direct research toward narrow AI, might be able to counteract the competitive pressure to create dangerous agents. In addition, making it easy for people to develop high-performance, safe AIs could eliminate the need to build more sophisticated ones.
Focusing on creating simpler AIs has benefits for the AI safety community. For one, these AIs can serve as testing grounds for creating safe, general-purpose intelligence. Narrow AIs can also reduce the economic incentive to create broadly useful agents. This can redirect researchers towards safer areas of inquiry and reduce race-to-the-bottom dynamics.
These narrow AIs can still be dangerous. If they are made more complex in pursuit of better performance, it is possible that they grow into the superintelligences we were warned about. However, in many cases it may not be valuable for narrow AIs to become more complex. This complexity comes at significant cost, and it may simply not be worth it for an AI to use vastly more computation to generate better movie recommendations.
Of course, alignment research will still be necessary in fields that are both highly valuable and likely to benefit from strong AI.
Conclusion
In short, SatisfAI is about squeezing as much value as we can out of a given level of machine intelligence before developing smarter systems. Finding ways to counteract the incentives to create dangerous AI is pivotal to achieving this.
SatisfAI should constitute a major goal for the artificial intelligence field. Though it does not eliminate the dangers of AI, it can help to reduce and delay these dangers. In addition to continued work on alignment, building satisfactory AI can alleviate the competitive pressures which undermine safety efforts and provide practical problems for researchers to solve. Though it may be tempting to raise the performance of these systems by increasing their complexity, a strong cultural and institutional bias should exist to limit this behavior.
If successful, SatisfAI might make creating AGI a purely academic exercise, one that we have the time to prepare for.