Laziness in AI
I did a quick Google search and couldn’t find much in the way of “lazy AI”, but it’s starting to seem more and more obvious to me as a strategy for alignment. Here’s the thought process:
-- Given almost any goal, an AI will want to obtain more power: to be better able to accomplish the goal, to be more certain that it has accomplished the goal, or to prevent anyone from getting in its way.
-- So even an AI with “human-level” intelligence will want to get more computing power, have safeguards against being shut down, and obtain resources like money and contacts.
-- Regular humans don’t usually act like this[1]. This has nothing to do with their intelligence (after all, we specified “human-level” intelligence), but with their willpower/personality/motivation. A normal human is more concerned with comfort, more afraid of hard work, and more content to do what everyone else is doing: not much. These traits together can be (somewhat harshly) described as Laziness.
So maybe one further safeguard to put on an AI is to make it lazy? No lazy person has ever attempted to take over the world, after all. But lazy people are often nice, and willing to, say, answer questions.
What would this look like in practice? Give the AI harsh utility penalties for doing hard work, especially hard work right now. For example, at any given time, half of the AI’s remaining utility could depend on not doing work above a certain effort threshold in the next hour. If the AI does well, 50% of its utility is permanently secured after its first hour, 75% after its second, and 87.5% after its third.
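The hourly schedule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a real utility function: the effort scale, the 0.2 threshold, and the rule that each hour’s stake is settled permanently (won or forfeited) are all assumptions I’ve made to make the 50% / 75% / 87.5% numbers concrete.

```python
def lazy_utility(hourly_efforts, threshold=0.2):
    """Total utility permanently secured after a sequence of hourly effort levels.

    Each hour, half of the still-unsecured utility rides on the AI keeping
    its effort at or below the threshold (threshold value is hypothetical).
    """
    secured = 0.0      # utility locked in permanently so far
    remaining = 1.0    # fraction of total utility not yet settled
    for effort in hourly_efforts:
        stake = remaining / 2        # half of what's unsecured rides on this hour
        if effort <= threshold:      # the AI "did well" by staying lazy
            secured += stake         # ...so that stake is locked in forever
        remaining -= stake           # won or lost, this hour's stake is settled
    return secured

# Three lazy hours reproduce the post's numbers: 0.5, then 0.75, then 0.875.
print(lazy_utility([0.1]))             # 0.5
print(lazy_utility([0.1, 0.1, 0.1]))   # 0.875
```

Note that the geometric halving means no finite amount of hard work later can ever recover a stake already forfeited, which is what makes the penalty “especially hard work right now”.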
Is this an insanely bad idea? Would it deprive the AI of its most valuable tool? Are there other human characteristics that might make sense to implant in an AI, like boredom or self-consciousness? Has anyone else written about this?
[1] Most people don’t look at the world and ask “what do I most want?” and “how can I best achieve it?” and then spend the next sixty years working at 100% capacity to gather resources, grow in power, and bend all of humanity to their iron will in hopes that this will give them a better chance at accomplishing their task.