Do you think tool AI is likely to help with AI safety in any way?

It sidesteps most of the issues by leaving humans in the loop for the ethically concerning aspects of AGI applications. So yes, it “helps” with AI safety by making AI safety partially redundant.
Please explain more, especially regarding Katja’s point 5, namely that one obvious problem with tools is that they maintain humans as a component in all goal-directed behavior: if humans are some combination of slow and rare compared to artificial intelligence, there may be strong pressure to automate all aspects of decision-making, i.e. to use agents. Is there any solution better than attempting to make and enforce a treaty against moving from tool to agent?
First of all, those two assumptions, that humans are slow and that they are rare compared to artificial intelligence, are dubious. Humans are slow at some things but fast at others. If the architecture of the AGI differs substantially from the way humans think, it is quite likely that the AGI would not be fast at some things humans find easy. And early human-level AGIs are likely to consume vast supercomputing resources; they are not going to be cheap and plentiful.

But beyond that, the time frame for using tool AI may be very short, perhaps on the order of 10 years or so, so there isn’t a danger of long-term instability here.
Yes. Tool AIs built solely for AGI safeguarding will become essential for FAI: encapsulated tool AIs will be the building blocks of a safety framework around AGI. Regulations for aircraft safety require full redundancy, with independently developed control channels from different suppliers running on separate hardware. If an aircraft fails, a few hundred people die. If the safety controls of a highly capable AGI fail, humankind is in danger.
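To make the redundancy analogy concrete, here is a minimal Python sketch, not taken from the discussion above and with all names purely illustrative: several independently implemented safety channels must each approve a proposed action before a more capable system is allowed to execute it. In the aviation analogy, each channel would be developed by a different supplier and run on separate hardware; here they are simply independent functions.

```python
# Hypothetical sketch of redundant, independently developed safety channels
# gating an AGI's proposed action. All names and rules are illustrative.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProposedAction:
    description: str
    irreversible: bool
    affects_humans: bool


def channel_a(action: ProposedAction) -> bool:
    # Illustrative rule from "supplier A": block anything irreversible.
    return not action.irreversible


def channel_b(action: ProposedAction) -> bool:
    # Illustrative rule from "supplier B": block irreversible actions
    # that directly affect humans.
    return not (action.irreversible and action.affects_humans)


def redundant_gate(action: ProposedAction,
                   channels: List[Callable[[ProposedAction], bool]]) -> bool:
    """Approve an action only if every independent safety channel approves."""
    return all(channel(action) for channel in channels)


if __name__ == "__main__":
    action = ProposedAction("deploy self-modifying update",
                            irreversible=True, affects_humans=True)
    # Prints False: any single dissenting channel blocks the action.
    print(redundant_gate(action, [channel_a, channel_b]))
```

The design choice mirrored here is that approval is conjunctive: a single failing channel vetoes the action, so a fault or compromise in one channel cannot by itself let a dangerous action through.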