“Ten years ago, everyone was talking about superintelligence, the singularity, the robot apocalypse. What happened?”
What is this referencing? I was only 10 years old in 2009, but I have a strong impression that AI risk gets a lot more attention now than it did then.
Also, what are the most salient differences between CAIS and the cluster of concepts Karnofsky and others were calling “Tool AI”?
It might also be worth comparing CAIS and “tool AI” to Paul Christiano’s IDA and the desiderata MIRI tends to talk about (task-directed AGI [1,2,3], mild optimization, limited AGI).
At a high level, I tend to think of Christiano and Drexler as both approaching alignment from very much the right angle, in that they’re (a) trying to break apart the vague idea of “AGI reasoning” into smaller parts, and (b) shooting for a system that won’t optimize harder (or more domain-generally) than we need for a given task. From conversations with Nate, one way I’d summarize MIRI-cluster disagreements with Christiano’s and Drexler’s proposals is that MIRI people don’t tend to think these proposals decompose cognitive work enough. Without a lot more decomposition/understanding, either the system as a whole won’t be capable enough, or it will be capable by virtue of atomic parts that are smart enough to be dangerous, in which case safety becomes a matter of how well we can open those black boxes.
In my experience people use “tool AI” to mean a bunch of different things, including things MIRI considers very important and useful (like “only works on a limited task, rather than putting any cognitive work into more general topics or trying to open-endedly optimize the future”) as well as ideas that don’t seem relevant or that obscure where the hard parts of the problem probably are.