Building the box reduces your ability to predict anything taking place outside the box. Even if the box can be sealed perfectly until the end of time without killing you (which would in itself be a surprise to anyone who knows thermodynamics), cutting off access to compilations of medical research reduces your ability to predict your own physiological reactions. Same goes for screwing with your brain functions.
I do not think you should be as confident as you are that your system is bulletproof. You have already had to elaborate, clarify, and correct numerous times to rule out various kinds of paperclipping failures. Attacking the problem this way, all it takes is one forgotten elaboration, clarification, or correction to allow for a new one.
How confident do you think I am that my plan is bulletproof?
Given that you asked me the question, I reckon you give it somewhere between 1:100 and 2:1 odds of succeeding. I reckon the odds are negligible.
That’s our problem right there: you’re trying to persuade me to abandon a position I don’t actually hold. I agree that an AI based strictly on a survey of all historical humans would have negligible chance of success, simply because a literal survey is infeasible and any straightforward approximation of it would introduce unacceptable errors.
...why are you defending it, then? I don’t even see that thinking along those lines is helpful.
For everyone else, it was a chance to identify flaws in a proposition. No such thing as too much practice there. For me, it was a chance to experience firsthand the thought processes involved in defending a flawed proposition, necessary practice for recognizing other such flawed beliefs I might be holding; I had no religious upbringing to escape, so that common reference point is missing.
Furthermore, I knew from the outset that such a survey wouldn’t be practical, but I’ve been suspicious of CEV for a while now. It seems like it would be too hard to formalize, and at the same time, even if successful, too far removed from what people spend most of their time caring about. I couldn’t be satisfied that there wasn’t a better way to do it until I’d tried to find such a way myself.
It’s polite to give some signal that you’re playing devil’s advocate if you know you’re making weak arguments.
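“I couldn’t be satisfied that there wasn’t a better way to do it until I’d tried to find such a way myself.”
This is not a sufficient condition for establishing the optimality of CEV. Indeed, I’m not sure there isn’t a better way (nor even that CEV is workable), just that I have at present no candidates for one.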
I apologize. I thought I had discharged the devil’s-advocacy-signaling obligation by ending my original post on the subject with a request to be proved wrong.
I agree that personal satisfaction with CEV isn’t a sufficient condition for it being safe. For that matter, having proposed and briefly defended this one alternative isn’t really sufficient for my personal satisfaction in either CEV’s adequacy or the lack of a better option. But we have to start somewhere, and if someone did come up with a better alternative to CEV, I’d want to make sure that it got fair consideration.