If the AI is capable of creating N QALYs over the course of six weeks, then the relevant opportunity cost of delay is N QALYs? In which case it seems to follow that, before we can really decide whether waiting six weeks is worth it, we need to know what the EV of N is. Right?
Over three weeks, but yes: right.
If the AI makes dramatic changes to society on a very short time scale (such as uploading everyone’s brains to a virtual reality, then making 1000 copies of everyone), then N would be very, very large.
If the AI makes minimal immediate changes in the short term (such as eliminating all nuclear bombs and putting in place measures to prevent hostile AIs from being developed, i.e. acting as insurance against threats to the existence of the human species), then N might be zero.
What the expected value of N is depends on what you think the relative likelihood of those two sorts of scenarios is. But you can’t assume, in the absence of knowledge, that the chances are 50:50.
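To put that a bit more concretely (the symbols p, N_dramatic and N_minimal here are placeholders I’m introducing for illustration, not figures from the discussion): if p is whatever probability you assign to the dramatic-change scenario, the expectation is just the probability-weighted sum

E[N] = p · N_dramatic + (1 − p) · N_minimal

and with N_minimal roughly zero, as above, this reduces to p · N_dramatic. So the answer turns entirely on what you think p and N_dramatic are, and plugging in p = 0.5 is just as much a substantive claim as plugging in p = 0.01 or p = 0.99.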
And, like I said, you could use the first 10 minutes to find out what the AI predicts N would be. If you ask the AI, “Suppose I gave you the go-ahead to do what you think humanity would ask you to do, were it wiser but still human. Without taking any action external to your sandbox, give me the best answer you can fit into 10 minutes to these two questions: what would your plan of action over the next three weeks be, and what improvement in the number of QALYs experienced by humans would you expect to see in that time?”, and the AI answers, “My plans are X, Y and Z, and I’d expect N to be somewhere on the order of 10 to 100 QALYs,” then you are free to take the nice slow route with a clear conscience.
Sure, agreed that if I have high confidence that letting the AI out of its sandbox doesn’t have too much of an upside in the short term (for example, if I ask it and that’s what it tells me and I trust its answer), then the opportunity costs of leaving it in its sandbox are easy to ignore.
Also agreed that N can potentially be very, very large, in which case the opportunity costs of leaving it in its sandbox are hard to ignore.