From the Task-directed AGI page on Arbital:

The obvious disadvantage of a Task AGI is moral hazard—it may tempt the users in ways that a Sovereign would not. A Sovereign has moral hazard chiefly during the development phase, when the programmers and users are perhaps not yet in a position of special relative power. A Task AGI has ongoing moral hazard as it is used.
(My understanding is that task AGI = genie = Pivotal Tool.)
Wei Dai gives some examples of what could go wrong in this post:
For example, such AIs could give humans so much power so quickly or put them in such novel situations that their moral development can’t keep up, and their value systems no longer apply or give essentially random answers. AIs could give us new options that are irresistible to some parts of our motivational systems, like more powerful versions of video game and social media addiction. In the course of trying to figure out what we most want or like, they could in effect be searching for adversarial examples on our value functions. At our own request or in a sincere attempt to help us, they could generate philosophical or moral arguments that are wrong but extremely persuasive.
The underlying problem seems to be that when humans are in control of long-term outcomes, we are relying more on those humans to have good judgment, and this reliance becomes more of a problem the more task-shaped the AI is.
I’m curious what your own thinking is (e.g. how would you fill out that row?).
OK, I think that makes some sense. I don’t know how I’d fill out the row, since I don’t understand what is covered by the phrase “human safety”, or what assumptions are being made about the proliferation of the technology, or, more specifically, the characteristics of the humans who do possess the tech.
I think I was imagining that the pivotal tool AI is developed by highly competent and safety-conscious humans who use it to perform a pivotal act (or series of pivotal acts) that effectively precludes the kind of issues mentioned in Wei’s quote there.
Even if you make this assumption, it seems like the reliance on human safety does not go down. I think you’re thinking about something more like “how likely it is that lack of human safety becomes a problem” rather than “reliance on human safety”.
I couldn’t say without knowing more what “human safety” means here.
But here’s what I imagine an example pivotal command looking like: “Give me the ability to shut down unsafe AI projects for the foreseeable future. Do this while minimizing disruption to the current world order / status quo. Interpret all of this in the way I intend.”