Perhaps the right approach is to ask yourself “What is the smallest step I can take that has the lowest risk of not being a strict improvement over the current situation?”
You can get away with (in fact, strictly improve the algorithm by) using only the second of the two caution-optimisers there, so: “What is the step I can take that has the lowest risk of not being a strict improvement over the current situation?”
Naturally, when answering the question you will probably consider small steps, and in the unlikely event that a large step is safer, so much the better!
Assuming the person making the decision is perfect at estimating risk.
However, since it probably won’t be me creating the first-ever AI, but rather someone who is reading this advice, I’d prefer to stipulate that they should go for small steps even if, in their opinion, there is some larger step that’s less risky.
The temptation exists for them to ask, as their first step, “AI of the ring, boost me to god-like wisdom and powers of thought”, but that has a number of drawbacks they may not think of. I’d rather my advice contain redundant precautions, as a safety feature.
“Of the steps of the smallest size that still advances things, which of those steps has the lowest risk?”
Another way to think about it is to take the steps (or give the AI orders) that can be effectively accomplished with the AI boosting itself by the smallest amount. Initially, avoid requests that would require the AI to massively boost itself, as long as you can improve your decision-making position through requests that the AI can handle with its current capacity.
Assuming the person making the decision is perfect at estimating risk.
Or merely aware of the same potential weakness that you are. I’d be overwhelmingly uncomfortable with someone developing a super-intelligence without the awareness of their human limitations at risk assessment. (Incidentally ‘perfect’ risk assessment isn’t required. They make the most of whatever risk assessment ability they have either way.)
“Of the steps of the smallest size that still advances things, which of those steps has the lowest risk?”
I consider this a rather inferior solution, particularly inasmuch as it pretends to be minimizing two things. Since steps will almost inevitably be differentiated by size, the assessment of lowest risk barely comes into play. An algorithm that almost never considers risk rather defeats the point.
If you must artificially circumvent the risk assessment algorithm—presumably to counter known biases—then perhaps make the “small steps” a question of satisficing rather than minimization.
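To make the difference concrete, here is a minimal sketch of the three selection rules discussed in this thread. Everything in it is an assumption for illustration: the `Step` type, the idea that size and risk can each be summarised as a single number, the `max_size` threshold, and the example values are all hypothetical and not part of anyone's actual proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    size: float  # hypothetical measure of how big the step is
    risk: float  # estimated chance of NOT being a strict improvement


def lowest_risk(steps):
    """Minimise risk only; size is ignored entirely."""
    return min(steps, key=lambda s: s.risk)


def smallest_then_lowest_risk(steps):
    """Minimise size first; risk only breaks exact ties in size."""
    return min(steps, key=lambda s: (s.size, s.risk))


def satisficing(steps, max_size):
    """Treat 'small' as a threshold, then minimise risk among steps that pass."""
    small_enough = [s for s in steps if s.size <= max_size]
    if not small_enough:
        raise ValueError("no step is within the size threshold")
    return min(small_enough, key=lambda s: s.risk)


# Made-up candidates; the risk figures stand in for the decider's own estimates.
steps = [
    Step("tiny tweak", size=1.0, risk=0.30),
    Step("modest request", size=2.0, risk=0.05),
    Step("god-like boost", size=100.0, risk=0.02),
]

print(lowest_risk(steps).name)
print(smallest_then_lowest_risk(steps).name)
print(satisficing(steps, max_size=5.0).name)
```

With these made-up numbers, pure risk minimisation picks the huge step (because the decider's estimate says it is safest), size-first minimisation picks the tiny step without ever consulting risk, and the satisficing version rules out the huge step but still lets risk choose among the small ones. Where to put the threshold is left open here.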
Since steps will almost inevitably be differentiated by size, the assessment of lowest risk barely comes into play. An algorithm that almost never considers risk rather defeats the point.
If you must artificially circumvent the risk assessment algorithm—presumably to counter known biases—then perhaps make the “small steps” a question of satisficing rather than minimization.
Good point.
How would you word that?