An exercise for those who are a little more advanced: *actually* save the world within the next 80 days. In the context of AI safety at the singularity level, that would mean completely figuring out the theory required to have a friendly singularity, and then making it happen for real, by the end of July.
I downvoted the comment. We want to have a culture where people who choose to take on good, ambitious projects get social approval, instead of being told: “What you want to do isn’t very advanced; advanced people should do X.”
The fact that I was basically serious, and in no way attempting to discourage @elriggs, and yet the comment is (after 12 hours) at −17, suggests that LW now has a problem with people who *do* want to do advanced things.
I don’t think it should be at −17, but I don’t think what its low score indicates is that LW has a problem with people who want to do advanced things.
Your suggestion, taken as a serious one, is obviously absurdly overambitious. Its premise is that the experts in the relevant fields have nearly enough knowledge to (1) create a superhuman AI and (2) arrange for it to behave in ways that are good rather than bad for us. But so far as I can tell, those same experts are pretty much universally agreed that those are super-hard problems. (There are some people who are arguably experts on #1 and think #2 might be easy. There are some people who are arguably experts on #2 and think #1 might happen much sooner than we’d guess. But I don’t think I know of anyone who’s an expert on #1 and thinks #1 is feasible within, say, a year, or of anyone who’s an expert on #2 and thinks #2 is feasible on that timescale.)
So you are simultaneously arguing that the state of the art in these things is really far advanced and that the people whose work makes it so are hopelessly incompetent to evaluate how close we are.
For sure, it could turn out that you’re right. But it seems staggeringly unlikely, and in any case anyone actually in a position to solve those problems within 80 days is surely already working on them.
Also: If I try to think about the possible worlds most similar to the one I think we’re actually in in which at least one of those problems does get solved within 80 days, it seems to me that a substantial fraction are ones in which just one of them does, and the most likely way for that to happen is some sort of rapidly recursively self-improving AI (“FOOM”), and if that happens without the other problem getting solved there’s a substantial danger that we’re all screwed. In that possible world, advising people to rush to solve those problems seems like rather a bad idea.
(I don’t think FOOM+doom is a terribly likely outcome. But I think it’s quite likely conditional on any part of your proposal turning out to be feasible.)
Just a reminder that karma works slightly differently on LW 2.0, so karma −17 today means less than karma −17 would have meant on LW 1.0.
The problem isn’t conscious intent but the social effect of a statement. Being bad at social skills, and thus unaware that you’re making status moves, doesn’t make those moves proper.
If you seriously meant to propose that challenge, there was no good reason to do so in this thread; you could have written your own post for it.
What’s the point of the comment above? I mean, presumably you don’t actually think that’s a sensible goal for anyone to have, because no one could think that, so I guess the purpose is mockery or something of the kind—but if so, presumably there’s some point beyond mere mean-spiritedness, some actual implicit argument that you’re making, and I’m not seeing it.
(Perhaps I’m completely misunderstanding and you actually do think it would make sense to take “completely solve the problem of creating superhuman AI, along with the problem of having it not destroy everything we care about, in 80 days” as a goal. In that case, please accept my apologies and feel free to explain why you think such an extraordinary thing.)
I do mean it rather seriously. There are theoretical frameworks already in play, directed at the two major subproblems you identify: creating raw superintelligence and (let’s say) identifying a friendly value system. I actually find it conceivable that the required breakthroughs are not far away, in the same way that imminent solutions to the Millennium Prize Problems in mathematics are conceivable: someone just needs to have the right insights.
Disclaimer: it’s hard to see too many levels above your own. This may still be underestimating the difficulty of friendliness.
I’d like to note that I’m skeptical of this for a few reasons:
1) This doesn’t even come close to making sense unless you’re already about as knowledgeable as one of the MIRI staff, in which case an 80-day resolve cycle to this effect might actually produce something. I think it is very unlikely that the entire problem can be solved without an insane amount of intuition (both mathematical and everyday).
2) Even if a promising solution were found in that time frame, there’s insufficient time for verification. Proving that your intuitively appealing idea will be safe is different from making strong arguments that it will be.
someone just needs to have the right insights.
Saying “there may be a short inferential distance to the full solution” (as if step count were the main determinant of difficulty or time required) misses how hard some of those steps may be. Yes, the challenge may be different from the one posed to a rationalist in 1300 who realizes the horror of death and wants to live. Even if he had complete knowledge of what steps to take, it would be incredibly difficult (and probably impossible) for him to single-handedly build the machinery required to advance science beyond today’s frontier so as to ensure he kept living. That holds even in the case where he spends every waking moment taking an optimal action (with respect to a scientific gradient even more advanced than today’s); in that situation, there are simply too many insights and actions required for him to survive.
In that sense, yes: the problem could perhaps literally be solved by one person’s efforts, but I still don’t think it’s a reasonable challenge. Maybe if you’re already on or near Eliezer’s level, this extended resolve cycle would be useful for generating new ideas.
It’s probably just a lame joke that didn’t land. I wouldn’t downvote it below “a little bit negative” which is where it currently stands.
For what it’s worth, I didn’t downvote it. I wasn’t sure enough of what it’s trying to achieve to be able to tell whether it succeeded or failed, or whether I approve or disapprove :-).
Agreed on the “not downvoting it any further than it is right now (−2)”. Though I would still like to discourage comments not directly related to the content of the post!
I initially interpreted Mitchell’s comment as mocking as well, but on a second... third read I interpreted it as:
A reference to the common textbook phrase “an exercise for the reader”, combined with the title of this post. Meant as a funny-joke-but-would-also-be-really-cool-if-someone-actually-did-it. This is just speculation though!! (Is this 100% correct, Mitchell?)
I greatly appreciate you standing up for me though!!
If my speculation is correct, then I think the reason both you and I originally interpreted it as mockery would be the “those who are a little more advanced” part (meant as hyperbole) and the “*actually*” part.