Out of curiosity, is there anywhere you’ve written about your object-level view on this? The EY post thoroughly fleshes out what I would call the strong consensus on LW; is there some equivalent of it that you’ve put together?
nicholashalden
I disagree with the idea that true things necessarily have explanations that are both convincing and short.
I don’t think it’s necessary for something to be true (there’s no short, convincing explanation of, e.g., quantum mechanics), but I think accurate forecasts tend to have such explanations (Tetlock’s work strongly argues for this).
I agree there is a balance to be struck between losing your audience and being exhaustive; it’s just that the vast majority of material I’ve read falls on one side of it.
On that point, have you seen any of my videos, and do you have thoughts on them? You can search “AI Safety” on YouTube.
I don’t generally prefer video as a format for learning, but I will take a look!
Similarly, do you have thoughts on AISafety.info?
I hadn’t seen this. I think it’s a good resource as a sort of FAQ, but it isn’t zeroed in on “here is the problem we are trying to solve, and here’s why you should care about it” in layman’s terms. I guess the best example of what I’m looking for is Benjamin Hilton’s article for 80,000 Hours, which I wish were shared more widely.
This seems to violate common sense. Why would you think about this in log space? 99% and 1% are identical in if(>0) space, but they have massively different implications for how you think about a risk (just like 20% and 70% do!).
It’s very strange to me that there isn’t a central, accessible “101” version of the argument given how much has been written.
I don’t think anyone should make false claims, and this is an uncharitable mischaracterization of what I wrote. I am telling you that, from the outside view, what LW/rationalism gets attention for is the “I am sure we are all going to die” claim, which I don’t think most of its members actually hold, and which repels the average person because it violates common sense.
The object-level responses you gave are so minimal and dismissive that I think they highlight the problem. “You’re missing the point, no one thinks that anymore.” Responses like this turn discussion into an inside-view-only affair. Your status as an LW admin sharpens this point.
Thanks for your reply. I welcome an object-level discussion, and appreciate people reading my thoughts and showing me where they think I went wrong.
The hidden-complexity-of-wishes argument is not persuasive to me in the context of a claim that AI will literally kill everyone. If we wish for it not to, there might be some problems with the outcome, but it won’t kill everyone. As for Bay Area Lab 9324 doing something stupid: by the time thousands of labs are doing this, if we have been able to wish for things successfully without triggering catastrophe, it will be relatively easy to wish for universal controls on the wishing technology.
“Infinite number of possible mesa-optimizers”: this feels to me like invoking an unknown unknown and then asserting that we’re all going to die; the argument seems to be missing some steps.
You’re wrong about Eliezer’s assertions about hacking: he 100% does believe a human brain can be hacked by dint of a VR headset. I quote:

“—Hack a human brain—in the sense of getting the human to carry out any desired course of action, say—given a full neural wiring diagram of that human brain, and full A/V I/O with the human (eg high-resolution VR headset), unsupervised and unimpeded, over the course of a day: DEFINITE YES
—Hack a human, given a week of video footage of the human in its natural environment; plus an hour of A/V exposure with the human, unsupervised and unimpeded: YES”
I get the analogy of all roads leading to doom, but it’s just very obviously not like that, because it depends on complex systems that are very hard to understand, and AI x-risk proponents are some of the biggest advocates of the view that these systems are opaque.
Thank you for the reply. I agree we should try and avoid AI taking over the world.
On “doom through normal means”: I just think there are very plausibly limits to what superintelligence can do. “Persuasion, hacking, and warfare” (I appreciate this is not a full version of the argument) don’t seem like doom to me. I don’t believe something can persuade generals to go to war in a short period of time just because it’s very intelligent. Reminds me of this.
On values: I think there’s a conflation between us having ambitious goals and whatever is actually being optimized by the AI. I am curious to hear what the “galaxy-brained reasons” are; my impression was that they are what was outlined (and addressed) in the original post.
It strikes me that you’re wearing a lot of risk beyond the face value of the bet. Even if we assume everyone is acting in good faith, there’s likely credit risk across 10 different people promising a $100k+ payout (most people don’t have that much cash, and even among those who do, there’s some likelihood of falling below that level of liquidity over a 5-year period). On your side, it looks like you’re just sending people your side of the bet before resolution, so they wear zero credit risk, even though the credit risk on your end was smaller to begin with, because coming up with $5k is much more plausible (again, assuming good-faith actors).

There’s also a time-value-of-money issue, where you paying now makes your side more valuable. To take the bet you made with Eliezer: you gave him $1k today, and he promised $150k if the bet pays out; with 5-year risk-free interest rates around 5%, that promise is actually worth only about $117k in current dollars. It might be better to bet in net-present-value terms.
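For concreteness, a minimal sketch of the discounting arithmetic behind that ~$117k figure (the flat 5% rate, 5-year horizon, and the `present_value` helper are illustrative assumptions, not terms of the actual bet):

```python
# Present value of a promised future payout, discounted at a flat annual rate.
def present_value(future_amount: float, annual_rate: float, years: float) -> float:
    return future_amount / (1 + annual_rate) ** years

# Promised side of the bet: $150k at resolution in ~5 years, discounted at ~5%/yr.
pv = present_value(150_000, 0.05, 5)  # ≈ 117,529

# The $1k paid up front is already in today's dollars, so the effective odds
# are roughly 1,000 : 117,529 rather than 1,000 : 150,000.
print(f"Present value of $150k in 5 years at 5%: ${pv:,.0f}")
```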