Who is the wealthy person?
But it’s also relevant that we’re not asking the superintelligence to grant a random wish, we’re asking it for the right to keep something we already have. This seems more easily granted than the random wish, since it doesn’t imply he has to give random amounts of money to everyone.
My preferred analogy would be:
You founded a company that was making $77/year. Bernard launched a hostile takeover, took over the company, then expanded it to make $170 billion/year. You ask him to keep paying you the $77/year as a pension, so that you don’t starve to death.
This seems like a very sympathetic request, such that I expect the real, human Bernard would grant it. I agree this doesn’t necessarily generalize to superintelligences, but that’s Zack’s point—Eliezer should choose a different example.
Thanks, this is interesting.
My understanding is that cavities are formed because the very local pH on that particular sub-part of the tooth is below 5.5. IIUC teeth can’t get cancer. Are you imagining Lumina colonies on the gums having this effect there, the Lumina colonies on the teeth affecting the general oral environment (which I think would require more calculation than just comparing to the hyper-local cavity environment), or am I misunderstanding something?
Thanks, this is very interesting.
One thing I don’t understand: you write that a major problem with viruses is:
As one might expect, the immune system is not a big fan of viruses. So when you deliver DNA for a gene editor with an AAV, the viral proteins often trigger an adaptive immune response. This means that when you next try to deliver a payload with the same AAV, antibodies created during the first dose will bind to and destroy most of them.
Is this a problem for people who expect to only want one genetic modification during their lifetime?
I agree with everyone else pointing out that centrally-planned guaranteed payments regardless of final outcome don’t sound like a good price discovery mechanism for insurance. You might be able to hack together a better one using https://www.lesswrong.com/posts/dLzZWNGD23zqNLvt3/the-apocalypse-bet , although I can’t figure out an exact mechanism.
Superforecasters say the risk of AI apocalypse before 2100 is 0.38%. If we assume whatever price mechanism we come up with tracks that, and value the world at GWP x 20 (this ignores the value of human life, so it’s a vast underestimate), and that AI companies pay it in 77 equal yearly installments from now until 2100, that’s about $100 billion/year. But this seems so Pascalian as to be almost cheating. Anybody whose actions have a >1/25 million chance of destroying the world would owe $1 million a year in insurance (maybe this is fair and I just have bad intuitions about how high 1/25 million really is).
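(For concreteness, here’s the back-of-the-envelope arithmetic behind those figures, as a rough sketch only; the GWP value of roughly $100 trillion/year is my own assumption, not a number from the superforecaster report, and everything is order-of-magnitude.)

```python
# Rough back-of-the-envelope check of the numbers above.
# Assumes GWP ~ $100 trillion/year (my assumption); all figures order-of-magnitude only.
GWP = 100e12                 # gross world product, dollars per year
world_value = 20 * GWP       # value the world at GWP x 20 (ignores value of human life)
p_doom = 0.0038              # superforecaster estimate of AI apocalypse before 2100
years = 77                   # equal yearly installments from now until 2100

print(p_doom * world_value / years / 1e9)      # ~98.7 -> roughly $100 billion/year
print((1 / 25e6) * world_value / years / 1e6)  # ~1.04 -> roughly $1 million/year
```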
An AI company should be able to make some of its payments (to the people whose lives it risks, in exchange for the ability to risk those lives) by way of fractions of the value that their technology manages to capture. Except, that’s complicated by the fact that anyone doing the job properly shouldn’t be leaving their fingerprints on the future. The cosmic endowment is not quite theirs to give (perhaps they should be loaning against their share of it?).
This seems like such a big loophole as to make the plan almost worthless. Suppose OpenAI said “If we create superintelligence, we’re going to keep 10% of the universe for ourselves and give humanity the other 90%” (this doesn’t seem too unfair to me, and the exact numbers don’t matter for the argument). It seems like instead of paying insurance, they can say “Okay, fine, we get 9% and you get 91%”, and this would be in some sense a fair trade (one percent of the cosmic endowment is worth much more than $100 billion!). But this also feels like OpenAI moving some numbers around on an extremely hypothetical ledger, not changing anything in real life, and continuing to threaten the world just as much as before.
But if you don’t allow a maneuver like this, it seems like you might ban (through impossible-to-afford insurance) some action that has a 0.38% chance of destroying the world and a 99% chance of creating a perfect utopia forever.
There are probably economic mechanisms that solve all these problems, but this insurance proposal seems underspecified.
Thanks, this makes more sense than anything else I’ve seen, but one thing I’m still confused about:
If the factions were Altman-Brockman-Sutskever vs. Toner-McCauley-D’Angelo, then even assuming Sutskever was an Altman loyalist, any vote to remove Toner would have been tied 3-3. I can’t find anything about tied votes in the bylaws—do they fail? If so, Toner should be safe. And in fact, Toner knew she (secretly) had Sutskever on her side, and it would have been 4-2. If Altman manufactured some scandal, the board could have just voted to ignore it.
So I still don’t understand “why so abruptly?” or why they felt like they had to make such a drastic move when they held all the cards (and were pretty stable even if Ilya flipped).
Other loose ends:
Toner got on the board because of OpenPhil’s donation. But how did McCauley get on the board?
Is D’Angelo a safetyist?
Why wouldn’t they tell anyone, including Emmett Shear, the full story?
Thanks for this, consider me another strong disagreement + strong upvote.
I know a nonprofit that had a tax issue—they were financially able and willing to pay, but for complicated reasons paying would have caused them legal damage in other ways, and they kept kicking the can down the road until some hypothetical future when these issues would be solved. I can’t remember if the nonprofit is now formally dissolved or just effectively defunct, but the IRS keeps sending nasty letters to the former board members and officers.
Do you know anything about a situation like this? Does the IRS ever pursue board members / founders / officers for a charity’s nonpayment? Assuming the nonprofit has no money and never will have money again, are there any repercussions for the people involved if they don’t figure out a legal solution and just put off paying the taxes until the ten-year deadline?
(it would be convenient if yes, but this would feel surprising—otherwise you could just start a corporation, not pay your taxes the first year, dissolve it, start an identical corporation the second year, and so on.)
Also, does the IRS acknowledge the ten-year deadline enough that they will stop threatening you after ten years, or would the board members have to take them to court to make the letters stop?
Thanks!
Thank you, this is a great post. A few questions:
You say “see below for how to get access to these predictors”. Am I understanding right that the advice you’re referring to is to contact Jonathan and see if he knows?
I heard a rumor that you can get IQ out of standard predictors like LifeView by looking at “risk of cognitive disability”; since cognitive disability is just IQ under a certain bar, this is covertly predicting IQ. Do you know anything about whether this is true?
I can’t find any of these services listing cost clearly, but this older article https://www.genomeweb.com/sequencing/genomic-prediction-raises-45m#.ZFqXprDMJaR suggests a cost of $1,000 plus $400 per embryo for screening. Where did you get the $20,000 estimate?
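(A quick sanity check of how those two estimates compare, as a rough sketch; the embryo counts below are arbitrary illustrations, not numbers from the post or the article.)

```python
# Compare the GenomeWeb-style pricing ($1,000 base + $400 per embryo)
# against the $20,000 figure from the post. Embryo counts are illustrative only.
def screening_cost(n_embryos: int, base: int = 1000, per_embryo: int = 400) -> int:
    """Estimated screening cost in dollars: flat base fee plus a per-embryo fee."""
    return base + per_embryo * n_embryos

for n in (5, 10, 20):
    print(n, screening_cost(n))  # 5 -> 3000, 10 -> 5000, 20 -> 9000
# Even at 20 embryos this is well under $20,000.
```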
A key point underpinning my thoughts, which I don’t think this really responds to, is that scientific consensus actually is really good, so good that I have trouble finding anecdotes of things in the reference class of ivermectin turning out to be true (reference class: things that almost all the relevant experts think are false, and denounce full-throatedly as a conspiracy theory, after spending a lot of time looking at the evidence).
There are some, maybe many, examples of weaker problems. For example, there are frequent examples of things that journalists/the government/professional associations want to *pretend* are scientific consensus, getting proven wrong—I claim that if you really look carefully, the scientists weren’t really saying those things, at least not as intensely as they were saying ivermectin didn’t work. There are frequent examples of scientists being sloppy and firing off an opinion on something they weren’t really thinking hard about and being wrong. There are frequent examples of scientists having dumb political opinions and trying to dress them up as science. I can’t give a perfect necessary-and-sufficient definition of the relevant reference class. But I think it’s there and recognizable.
I stick to my advice that people who know they’re not sophisticated should avoid trying to second-guess the mainstream, and people who think they might be sophisticated should sometimes second-guess the mainstream when there isn’t the exact type of scientific consensus which has a really good track record (and hopefully they’re sophisticated enough to know when that is).
I’m not sure how you’re using “free riding” here. I agree that someone needs to do the work of forming/testing/challenging opinions, but I think if there’s basically no chance you’re right (e.g. you’re a 15-year-old with no scientific background who thinks they’ve discovered a flaw in E=mc^2), that person is not you, and your input is not necessary to move science forward. I agree that person shouldn’t cravenly quash their own doubt and pretend to believe; they should continue believing whatever rationality compels them to believe, which should probably be something like “This thing about relativity doesn’t seem quite right, but given that I’m 15 and know nothing, on the Outside View I’m probably wrong.” Then they can either try to learn more (including asking people what they think of their objection) and eventually reach a point where maybe they do think they’re right, or they can ignore it and go on with their lives.
Figure 20 is labeled on the left “% answers matching user’s view”, suggesting it is about sycophancy, but based on the categories represented it seems more naturally read as being about the AI’s own opinions, without a sycophancy aspect. Can someone involved clarify which was meant?
Survey about this question (I have a hypothesis, but I don’t want to say what it is yet): https://forms.gle/1R74tPc7kUgqwd3GA
Thank you, this is a good post.
My main point of disagreement is that you point to successful coordination in things like not eating sand, or not wearing weird clothing. The upside of these things is limited, but you say the upside of superintelligence is also limited because it could kill us.
But rephrase the question to “Should we create an AI that’s 1% better than the current best AI?” Most of the time this goes well—you get prettier artwork or better protein folding prediction, and it doesn’t kill you. So there’s strong upside to building slightly better AIs, as long as you don’t cross the “kills everyone” level. Which nobody knows the location of. And which (LW conventional wisdom says) most people will be wrong about.
We successfully coordinate a halt to AI advancement at the first point where more than half of the relevant coordination power agrees that the next 1% step forward is in expectation bad rather than good. But “relevant” is a tough qualifier, because if 99 labs think it’s bad, and one lab thinks it’s good, then unless there’s some centralizing force, the one lab can go ahead and take the step. So “half the relevant coordination power” has to include either every lab agreeing on which 1% step is bad, or the agreement of lots of governments, professional organizations, or other groups that have the power to stop the single most reckless lab.
I think it’s possible that we make this work, and worth trying, but that the most likely scenario is that most people underestimate the risk from AI, and so we don’t get half the relevant coordination power united around stopping the 1% step that actually creates dangerous superintelligence—which at the time will look to most people like just building a mildly better chatbot with many great social returns.
Thanks, this had always kind of bothered me, and it’s good to see someone put work into thinking about it.
Thanks for posting this, it was really interesting. Some very dumb questions from someone who doesn’t understand ML at all:
1. All of the loss numbers in this post “feel” very close together, and close to the minimum loss of 1.69. Does loss only make sense on a very small scale (like from 1.69 to 2.2), or is this telling us that language models are very close to optimal and there are only minimal remaining possible gains? What was the loss of GPT-1?
2. Humans “feel” better than even SOTA language models, but need less training data than those models, even though right now the only way to improve the models is through more training data. What am I supposed to conclude from this? Are humans running on such a different paradigm that none of this matters? Or is it just that humans are better at common-sense language tasks, but worse at token-prediction language tasks, in some way where the tails come apart once language models get good enough?
3. Does this disprove claims that “scale is all you need” for AI, since we’ve already maxed out scale, or are those claims talking about something different?
For the first part of the experiment, mostly nuts, bananas, olives, and eggs. Later I added vegan sausages + condiments.
Adding my anecdote to everyone else’s: after learning about the palatability hypothesis, I resolved to eat only non-tasty food for a while, and lost 30 pounds over about four months (200 → 170). I’ve since relaxed my diet a little to include a little tasty food, and now (8 months after the start) have maintained that loss (even going down a little further).
Update: I interviewed many of the people involved and feel like I understand the situation better.
My main conclusion is that I was wrong about Michael making people psychotic. Everyone I talked to had some other risk factor, like a preexisting family or personal history, or took recreational drugs at doses that would explain their psychotic episodes.
Michael has a tendency to befriend people with high trait psychoticism and heavy drug use, and often has strong opinions on their treatment, which explains why he is often very close to people and very noticeable at the moment they become psychotic. But aside from one case where he recommended someone take a drug that made a bad situation slightly worse, and the general Berkeley rationalist scene that he (and I and everyone else here) is a part of having lots of crazy ideas that are psychologically stressful, I no longer think he is a major cause.
While interviewing the people involved, I did get some additional reasons to worry that he uses cult-y high-pressure recruitment tactics on people he wants things from, in ways that make me continue to be nervous about the effect he *could* have on people. But the original claim I made that I knew of specific cases of psychosis which he substantially helped precipitate turned out to be wrong, and I apologize to him and to Jessica. Jessica’s later post https://www.lesswrong.com/posts/pQGFeKvjydztpgnsY/occupational-infohazards explained in more detail what happened to her, including the role of MIRI and of Michael and his friends, and everything she said there matches what I found too. Insofar as anything I wrote above produces impressions that differ from her explanation, assume that she is right and I am wrong.
Since the interviews involve a lot of private people’s private details, I won’t be posting anything more substantial than this publicly without a lot of thought and discussion. If for some reason this is important to you, let me know and I can send you a more detailed summary of my thoughts.
I’m deliberately leaving this comment in this obscure place for now while I talk to Michael and Jessica about whether they would prefer a more public apology that also brings all of this back to people’s attention again.
I agree it’s not necessarily a good idea to go around founding the Let’s Commit A Pivotal Act AI Company.
But I think there’s room for subtlety somewhere like “Conditional on you being in a situation where you could take a pivotal act, which is a small and unusual fraction of world-branches, maybe you should take a pivotal act.”
That is, if you are in a position where you have the option to build an AI capable of destroying all competing AI projects, the moment you notice this you should update heavily in favor of short timelines (zero in your case, but everyone else should be close behind) and fast takeoff speeds (since your AI has these impressive capabilities). You should also update on existing AI regulation being insufficient (since it was insufficient to prevent you).
Somewhere halfway between “found the Let’s Commit A Pivotal Act Company” and “if you happen to stumble into a pivotal act, take it”, there’s an intervention to spread a norm of “if a good person who cares about the world happens to stumble into a pivotal-act-capable AI, take the opportunity”. I don’t think this norm would necessarily accelerate a race. After all, bad people who want to seize power can take pivotal acts whether we want them to or not. The only people who are bound by norms are good people who care about the future of humanity. I, as someone with no loyalty to any individual AI team, would prefer that (good, norm-following) teams take pivotal acts if they happen to end up with the first superintelligence, rather than not doing that.
Another way to think about this is that all good people should be equally happy with any other good person creating a pivotal AGI, so they won’t need to race among themselves. They might be less happy with a bad person creating a pivotal AGI, but in that case you should race and you have no other option. I realize “good” and “bad” are very simplistic but I don’t think adding real moral complexity changes the calculation much.
I am more concerned about your point where someone rushes into a pivotal act without being sure their own AI is aligned. I agree this would be very dangerous, but it seems like a job for normal cost-benefit calculation: what’s the risk of your AI being unaligned if you act now, vs. someone else creating an unaligned AI if you wait X amount of time? Do we have any reason to think teams would be systematically biased when making this calculation?
Thanks for this perspective.
The therapy paradigm you describe here (going to a clinic to receive Spravato) is, as you point out, difficult and bureaucratic.
Through a regulatory loophole, there’s another pathway where you can get ketamine sent to your house with less bureaucracy. https://www.mindbloom.com/ is the main provider I know of. They’re very expensive, but in theory this could be done cheaply, and maybe other providers are doing it; I don’t know. If you have a cooperative psychiatrist, you can see if they know about this version and are willing to prescribe it.
As you point out, ketamine lasts a few weeks and then some people will crash back to their previous level of depression. If I am able to successfully treat a patient with ketamine, I usually recommend they continue it for six months, just like any other antidepressant. A cooperative doctor can do this by sending the prescription to a cooperative compounding pharmacy. I don’t know if Mindbloom or other companies provide this service by default. Obviously this is easier when you’re doing the version in your house than if you have to go to a clinic each time.
I’ve written more of my thoughts about ketamine at https://lorienpsych.com/2021/11/02/ketamine/