I agree with basically your whole comment. But it doesn’t seem like you’re engaging with the frame I’m using. I’m trying to figure out how agentic the first AI that can do task X is, for a range of X (with the hope that the first AI that can do X is not very agentic, for some X that is a pivotal task). The claim that a highly agentic highly intelligent AI will likely do undesirable things when presented with task X is very little evidence about this, because a highly agentic highly intelligent AI will likely do undesirable things when presented with almost any task.
Thank you, that is clarifying, together with your note to Scott on ACX about wanting it to ‘lack a motivational system.’ I want to see if I have this right before I give another shot at answering your actual question.
So as I understand your question now, what you’re asking is: will the first AI that can do (ideally pivotal) task X be of Type A (general, planning, motivational, agentic, models world, intelligent, etc.) or Type B (basic, pattern matching, narrow, dumb, domain specific, constrained, boxed, etc.)?
I almost accidentally labeled A/B as G/N there (i.e. general AI and narrow AI as usually understood). I’m not sure whether that’s a fair labeling system, and I want to see how close the mapping is. If it isn’t close, is there a key difference?
Instead of “dumb” or “narrow” I’d say “having a strong comparative advantage in X (versus humans)”. E.g. imagine watching evolution and asking “will the first animals that take over the world have already solved the Riemann hypothesis”, and the answer is no because human intelligence, while general, is still pointed more at civilisation-building-style tasks than mathematics.
Similarly, I don’t expect any AI which can do a bunch of groundbreaking science to be “narrow” by our current standards, but I do hope that it has a strong comparative disadvantage at taking-over-world-style tasks, compared with doing-science-style tasks.
And that’s related to agency, because what we mean by agency is not far off “having a comparative advantage in taking-over-world-style tasks”.
Now, I expect that at some point, this line of reasoning stops being useful, because your systems are general enough and agentic enough that, even if their comparative advantage isn’t taking over the world, they can pretty easily do that anyway. But the question is whether this line of reasoning is still useful for the first systems which can do pivotal task X. Eliezer thinks no, because he considers intelligence and agency to be very strongly linked. I’m less sure, because humans have been evolved really hard to be agentic, so I’d be surprised if you couldn’t beat us at a bunch of intellectual tasks while being much less agentic than us.
Side note: I meant “pattern-matching” as a gesture towards “the bit of general intelligence that doesn’t require agency” (although in hindsight I can see how this is confusing; I’ve just made an edit on the ACX comment).
“will the first animals that take over the world be able to solve the Riemann hypothesis”, and the answer is no because humans intelligence, while general, is still pointed more at civilisation-building-style tasks than mathematics.
Pardon the semantics, but I think the question you want to use here is “will the first animals that take over the world have already solved the Riemann hypothesis”. IMO humans do have the ability (“can”) to solve the Riemann hypothesis, and the point you’re making is just about the ordering in which we’ve done things.
I’m not sure this is the most useful way to think about it, either, because it includes the possibility that we didn’t solve the Riemann hypothesis first just because we weren’t really interested in it, not because of any kind of inherent difficulty to the problem or our suitability to solving it earlier. I think you’d want to consider:
1. alternative histories where solving the Riemann hypothesis was a (or the) main goal for humanity, and
2. alternative histories where world takeover was a (or the) main goal for humanity (our own actual history might be close enough)
and ask if we solve the Riemann hypothesis at earlier average times in worlds like 1 than we take over the world in worlds like 2.
We might also be able to imagine species that could take over the world but seem to have no hope of ever solving the Riemann hypothesis, and I think we want to distinguish that from just happening not to solve it first. Depending on what you mean by “taking over the world”, other animals may have done so before us, too, e.g. arthropods. Or even plants or other forms of life, more so than or before any group of animals, even all animals combined.
Pardon the semantics, but I think the question you want to use here is “will the first animals that take over the world have already solved the Riemann hypothesis”. IMO humans do have the ability (“can”) to solve the Riemann hypothesis, and the point you’re making is just about the ordering in which we’ve done things.
Yes, sorry, you’re right; edited.
I’m not sure this is the most useful way to think about it, either, because it includes the possibility that we didn’t solve the Riemann hypothesis first just because we weren’t really interested in it, not because of any kind of inherent difficulty to the problem or our suitability to solving it earlier. I think you’d want to consider:
1. alternative histories where solving the Riemann hypothesis was a (or the) main goal for humanity, and
2. alternative histories where world takeover was a (or the) main goal for humanity (our own actual history might be close enough)
and ask if we solve the Riemann hypothesis at earlier average times in worlds like 1 than we take over the world in worlds like 2.
We might also be able to imagine species that could take over the world but seem to have no hope of ever solving the Riemann hypothesis, and I think we want to distinguish that from just happening not to solve it first. Depending on what you mean by “taking over the world”, other animals may have done so before us, too, e.g. arthropods. Or even plants or other forms of life, more so than or before any group of animals, even all animals combined.