Epistemic status: the stories here are all as true as possible from memory, but my memory is so-so.
This is going to be big
It’s late Summer 2017. I am on a walk in the Mendip Hills. It’s warm and sunny and the air feels fresh. With me are around 20 other people from the Effective Altruism London community. We’ve travelled west for a retreat to discuss how to help others more effectively with our donations and careers. As we cross cow field after cow field, I get talking to one of the people from the group I don’t know yet. He seems smart, and cheerful. He tells me that he is an AI researcher at Google DeepMind. He explains how he is thinking about how to make sure that any powerful AI system actually does what we want it to. I ask him if we are going to build artificial intelligence that can do anything that a human can do. “Yes, and soon,” he says, “And it will be the most important thing that humanity has ever done.”
I find this surprising. It would be very weird if humanity was on the cusp of the most important world-changing invention ever, and so few people were seriously talking about it. I don’t really believe him.
This is going to be bad
It is mid-Summer 2018 and I am cycling around Richmond Park in South West London. It’s very hot and I am a little concerned that I am sweating off all my sun cream.
After having many other surprising conversations about AI, like the one I had in the Mendips, I have decided to read more about it. I am listening to an audiobook of Superintelligence by Nick Bostrom. As I cycle in loops around the park, I listen to Bostrom describe a world in which we have created superintelligent AI. He seems to think the risk that this will go wrong is very high. He explains how scarily counterintuitive the power of an entity that is vastly more intelligent than a human is. He talks about the concept of “orthogonality”; the idea that there is no intrinsic reason that the intelligence of a system is related to its motivation to do things we want (e.g. not kill us). He talks about how power-seeking is useful for a very wide range of possible goals. He also talks through a long list of ways we might try to avoid it going very wrong. He then spends a lot of time describing why many of these ideas won’t work. I wonder if this is all true. It sounds like science fiction, so while I notice some vague discomfort with the ideas, I don’t feel that concerned. I am still sweating, and am quite worried about getting sunburnt.
It’s a long way off though
It’s still Summer 2018 and I am in an Italian restaurant in West London. I am at an event for people working in policy who want to have more impact. I am talking to two other attendees about AI. Bostrom’s arguments have now been swimming around my mind for several weeks. The book’s subtitle is “Paths, Dangers, Strategies” and I have increasingly been feeling the weight of the middle one. The danger feels like a storm. It started as vague clouds on the horizon and is now closing in. I am looking for shelter.
“I just don’t understand how we are going to set policy to manage these things” I explain.
I feel confused and a little frightened.
No-one seems to have any concrete policy ideas. But my friend chimes in to say that while yeah there’s a risk, it’s probably pretty small and far away at this point.
“Experts think it’ll take at least 40 more years to get really powerful AI” she explains, “there is plenty of time for us to figure this out”.
I am not totally reassured, but the clouds retreat a little.
This is fine
It is late January 2020 and I am at after-work drinks in a pub in Westminster. I am talking to a few colleagues about the news. One of my colleagues, an accomplished government economist, is reflecting on the latest headlines about a virus in China.
“We are overreacting,” he argues, “just as we did with SARS. The government feels like it has to look like it’s doing something, but it’s too much.”
He goes on to talk about how the virus is mostly just media hype. Several other colleagues agree.
I don’t respond to this. I have a half-formed thought along the lines of “even if there is only a small chance that this becomes a big pandemic, it might be worth reacting strongly. The cost of underreacting is much greater than the cost of overreacting”. But I don’t say this. I am a little bit reassured by his confidence that all will be well.
A few weeks later, I come home to see my housemate carrying crates of canned food to his room. He has spent a lot of time on internet rationality forums and at their suggestion has started stockpiling. His cupboard is filled with huge quantities of soup. I think that this is pretty ridiculous. My other housemates and I watch him as he lugs the crates up the stairs to his room; we look at each other and roll our eyes.
It’s probably something else in your life
It’s Spring 2021. I am at a dinner hosted by a friend in south London. I am talking to an acquaintance, a PhD student studying machine learning at a London university. I ask him how he has been doing.
“Not very good to be honest,” he says glumly. “I am pretty worried about AI and no one in my lab seems to take this seriously. They all think it’s science fiction. It makes me feel really lonely and a bit crazy.”
He looks dejected.
“It’s not just the ideas though. I think this thing is actually going to kill me! It’s horrifying”.
I commiserate with him. It must be hard to be surrounded by people who dismiss your ideas all day. I wonder if his anxiety has more to do with lock-down-exacerbated loneliness than with the whole AI apocalypse thing.
A year later we will see each other at another party and he will tell me that he has now quit the PhD and is feeling much better. He will say that he still thinks it’s likely that AI will kill him.
No-one in my org puts money in their pension
It’s Summer 2022 and I am at a party in central London. I am talking to an AI alignment researcher in the kitchen. He is upbeat and is excitedly talking about some ideas that might ensure the alignment of superintelligent AI with human values.
“We just need to create this small AI that we can fully understand and verify is aligned” he explains “and then we use that AI to validate that the super huge massive superintelligence is aligned with our values…. It’s a bit like creating Godzilla to fight Mega-Godzilla!”.
“That is not a reassuring metaphor,” I say.
“Oh yeah, we’re not super hopeful, but we’re trying” he states, somehow still cheerfully.
Later in the conversation, he will explain that despite the government incentive systems, no-one at his place of work takes anything above the minimum possible pension. They all doubt they will need it.
Over the next few weeks, I will wonder if I should be putting less money into my pension.
Doom-vibes
It is February 2023. I now have a vague sense of impending doom most mornings. I’m not sure why. It might be the combination of my pre-existing beliefs about the difficulty of AI alignment and the holy-shit-rapid-advances vibe that OpenAI has been producing. It might be that I have been reading post after post talking about the timelines to transformative AI and the probability of doom. Or all the people talking about how AI alignment seems really hard and how there is now a Molochian race-to-the-bottom between AI labs to produce the biggest intelligence as quickly as possible.
The doom feeling might also be the mid-winter darkness and cold, alongside my general tendency towards anxious thoughts and feelings. It could also be that I am yet to secure future funding for the project that is my current livelihood. All I know is that I regularly feel hopeless.
Each morning, I go to the window of my East London flat. Many days I look out over the rooftops and picture the swarms of kill-drones on the horizon. Maybe that’s how it will happen? That, or a mysterious and terrifying pandemic? I will see my friends freaking out about it on social media one day and then a day later my partner and I will be coughing up blood. Or maybe it’ll be quicker. I’ll be blinded by the initial flash of the bomb, then a fraction of a second of extreme heat before the end. The fear isn’t sharp, just a dull empty sense that there is no future.
Maths might help
It’s March 2023 and I am in my flat, talking to a friend. We have agreed to meet to spend some time figuring out our thoughts on the actual risk from AI. We have both been reading a lot about it, but still feel very confused so wanted to be more deliberate.
We spend some time together writing about and discussing our thoughts. I am still very confused. I mostly seem to be dancing between two ideas:
On one side there is the idea that the base-rate for catastrophic risk is low. New technology is usually good on balance and no new tech has ever killed humanity before. Good forecasters should need a lot of evidence to update away from that very low prior probability of doom. There isn’t much hard evidence that AI is actually dangerous, and it seems very possible that we just won’t be able to create superintelligence for some reason anyway.
On the other side is the idea that intelligence creation is just categorically different from other technology. Intelligence is the main tool for gaining power in the world. This makes the potential impact of AI completely historically unprecedented. And if you are able to take the perspective of smaller groups of humans (e.g. many groups of indigenous people), powerful agents unexpectedly causing sudden doom is actually very very precedented. Oh, and the power of these things is growing extremely fast.
“So what do you think?” my friend asks, “What’s your probability of doom?”
“I really don’t know” I sigh, “one part of me is saying this is all going to be fine, things are usually fine, and another part of me is saying that this is definitely going to be terrible.” I pause….
“So maybe like 20%?”
A problem shared is…
It is Easter 2023 and I am at my aunt’s house. We have just finished a large lunch and I am sitting at the dinner table with my parents, aunts, uncles and cousins. Someone asks me what I am working on at the moment. I explain the personal career review I am doing.
“I want my career to be impactful, I want to help others” I explain, “And in terms of the positive impact I could have on the world, I am really worried about the risks from advanced AI”.
My uncle asks me why I’m so worried.
I respond:
“These orgs have many billions of dollars in funding and are building things that are unimaginably powerful, on purpose. If something much smarter than a human can be built, and there doesn’t seem to be a reason why it won’t be, it will be massively powerful. Intelligence is what allowed humanity to become the dominant species on this planet. It’s the reason we exterminate ants, not the other way around. They are building something that, in terms of raw power, could be to us as we are to ants.”
I go on…
“Lots of people think that this might be enough; that the main risk is that we will build a thing, it will have goals that are different to ours, and then game over. That seems possible and scary. But humans don’t even need to lose control for this to go very badly. Superintelligence is a superpower, and big changes to the global structure of power can be unpredictable and terrifying. So I don’t know what happens next. It might be the worst wars imaginable or AI-powered global totalitarianism. But whatever does happen seems like it has a decent chance of killing us or making life for most or all of humanity terrible”.
My family shows concern, maybe some confusion, but definitely concern. It feels relieving to express. I have always been stoical about my pain and anxieties. As a child and teenager, I never wanted to bother others with my stuff. It’s nice to be able to express to them that I am scared about something. Talking about the risk of AI doom feels easier than discussing my career worries.
Hope
It is June 2023. In the months prior, the heads of the leading AI labs have been talking to world leaders about the existential risks from AI. They, along with many prominent AI researchers and tech leaders have signed a statement saying that “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war”.
Today I am hosting an event in East London on AI governance. I have lined up 8 speakers from the UK government, think tanks and academia, and around 70 people have turned up to watch. I’m proud of my work organising the event, but am feeling pretty nervous and hoping it goes well.
To inspire discussion, I have created a graph on the wall by the entrance to collect and present attendees’ views on AI. The two axes are “how long it will be until we have Artificial General Intelligence?” and “the probability of AGI catastrophe”. As people come through the door they add a post-it note to the place on the graph that approximates their viewpoint. Most people have put their post-its in a cluster centered at a 25% chance of catastrophe in around 20 years. I stare at the second largest cluster in the top left of the graph. Several people here think there is around a 75% chance of everyone being killed in around 7 years. I think about the kill-drones. I think about the flash of the bomb. I think about whether my pension is a waste of money. I think about the unreal shock of the first days of lockdown in 2020. I feel a pang of fear in my stomach.
I look around at the crowded room of people. They are snacking on the fruit and hummus I bought and are talking eagerly. I look with some trepidation at the lectern where I will soon be standing in front of everyone to introduce the first speaker. They will be talking positively and insightfully about how we could govern advanced AI and what policies and institutions we need to ensure AI safety. Right now I focus on my breath and try to let go of the fear for a moment. I start to walk towards the front of the room. I really, really, really hope that this goes well.