I think that we could increase the proportion of LWers actually doing something about this via positive social expectation: peer-centric goal-setting and feedback. Positive social expectation (as I’ve taken to calling it) is what happens when you agree to meet a friend at the gym at 5 - you’re much more likely to honor a commitment to a friend than one to yourself. I founded a student group to this effect at my undergrad and am currently collaborating with my university (I highly recommend reading the writeup) to implement it on a larger scale.
Basically, we could have small groups of people checking in once a week for half an hour. Each person briefly summarizes their last week and what they want to do in the next week; others can share their suggestions. Everyone sets at least one habit goal (e.g., stop checking email more than once a day) and one performance goal (e.g., read x chapters of set theory, perhaps made easier by improved methodology suggested by more- or differently-experienced group members).
I believe that the approach has many advantages over having people self-start:
lowered psychological barrier to getting started on x-risk (all they have to do is join a group; they see other people who aren’t (already) supergeniuses like Eliezer doing work, so they feel better about their own influence)
higher likelihood of avoiding time / understanding sinks (bad / unnecessary textbooks)
increased instrumental rationality
lower likelihood of burnout / negative affect spirals / unsustainable actions being taken
a good way to form friendships in the LW community
a robust way to get important advice (not found in the Sequences) to newer people, since that advice may not be indexed under the keywords people initially think to search for
The downside is the small weekly time commitment.
I’ll probably make a post on this soon, and perhaps even a sequence on Actually Getting Started (as I’ve been reorganizing my life with great success).
“Make a meetup” is indeed one of my favorite rationalsphere hammers. So I’m sympathetic to this approach. But over time I’ve run into a few issues that make me skeptical about this:
1) Social commitment devices are very fragile. In my experience, as soon as one buddy doesn’t show up to the gym once, it rapidly spirals into ineffectiveness. Building an internal locus of control is very important to gaining real habits and skills.
I think the social commitment device can be useful to get started, but I think you should very rapidly try to evolve such that you don’t need it.
2) I think x-risk really desperately needs people who already have the “I can self-start on my own” and “I can think usefully for myself” properties.
The problem is that there is little that needs doing that can be done by people who don’t already have those skills. The AI safety field keeps having people show up who say “I want to help”, and then it turns out not to be that easy to help, so those people sort of shrug and go back to their lives.
And the issue is that the people who are involved do need help, but it requires a lot of context, and giving people context requires a lot of time (i.e. having several lengthy conversations over several weeks, or working together on a project), and that time is precious.
And if it then turns out that the person they’re basically mentoring doesn’t have the “I can self start, self motivate, and think for myself” properties, then the mentor hasn’t gained an ally—they’ve gained a new obligation to take care of, or spend energy checking in on, or they just wasted their time.
I think a group like you describe could be useful, but there are a lot of ways for it to be ineffectual if not done carefully. I may have more thoughts later.
The AI safety field keeps having people show up who say “I want to help”, and then it turns out not to be that easy to help, so those people sort of shrug and go back to their lives.
I think this can be nearly completely solved by using a method detailed in Decisive: expectation-setting. I remember that employers found that when they warned potential employees about the difficulty and frustration involved with the job, retention skyrocketed. People (mostly) weren’t being discouraged from the process; having their expectations set properly actually made them not mind the experience.
I think the social commitment device can be useful to get started, but I think you should very rapidly try to evolve such that you don’t need it.
I agree. At uni, the idea is that it gets people into a framework where they’re able to get started, even if they aren’t self-starters. Here, one of the main benefits would be that people at various stages of the pipeline could share what worked and what didn’t. For example, knowing that one textbook is way easier to understand if you’ve already learned a prerequisite is valuable information, and that doesn’t always seem to be trivially knowable ex ante. The emphasis is less on the social commitment and more on the “team of people working to learn AI Safety fundamentals”.
I think x-risk really desperately needs people who already have the “I can self-start on my own” and “I can think usefully for myself” properties.
Agreed. I’m not looking to make the filter way easier to pass, but rather to encourage people to keep working. “I can self-start” is necessary, but I don’t think we can expect everyone to be able to self-motivate indefinitely in the face of a large corpus of unfamiliar technical material. Sure, a good self-starter will reboot eventually, but it’s better to have lightweight support structures that maintain a smooth rate of progress.
Additionally, my system 1 intuition is that there are people close to the self-starter threshold who are willing to work on safe AI, and that these people can be transformed into grittier, harder workers with the right structure. Maybe that’s not even worth our time, but it’s important to keep in mind the multiplicative benefits possible from actually getting more people involved. I could also be falling prey to the typical mind fallacy, as I only got serious when my worry overcame my doubts about being able to do anything.
And if it then turns out that the person they’re basically mentoring doesn’t have the “I can self start, self motivate, and think for myself” properties, then the mentor hasn’t gained an ally—they’ve gained a new obligation to take care of, or spend energy checking in on, or they just wasted their time.
Perhaps a more beneficial structure than “one experienced person receives lots of obligations” could be “three pairs of people (all working on learning different areas of the syllabus at any given time) share insights they picked up in previous iterations”. Working in pairs could boost efficiency relative to working alone: each person makes different mistakes, so together they can smooth over the rough spots in each other’s learning. I remember this problem being discussed in a post a few years back about how most of the poster’s autodidact problems were due to trivial errors that were hard to fix without someone familiar with that specific material.
I think that we could increase the proportion of LWers actually doing something about this via public discussions on LW about AI-related issues, new developments, and new evidence, and posts inviting readers to think about certain challenges (even if they don’t already think the challenges are critical). That makes sense, right?
If I look for AI-related posts over the last month, I see a few, but a lot of them are about meta issues, and in general I don’t see anything that could force an unconvinced reader to update in either direction.
This is based on the premise that there are many people on LW who are familiar with the basic arguments, and would be able to engage in some meaningful work, but don’t find the arguments all that convincing. Note that you need to convince people not only that AI risk exists (a trivial claim), but that it’s more likely than, e.g., an asteroid impact (honestly, I don’t think I’ve seen an argument in that spirit).
I think that could be a good idea, too. The concern is whether there is substantial meaningful (and non-technical) discussion left to be had. I haven’t been on LW very long, but in my time here it has seemed like most people agree on FAI being super important. This topic seems (to me) to have already been well-discussed, but perhaps that was in part because I was searching out that content.
For many people, it hasn’t sunk in on a gut level that unfriendly AI can plausibly, and probably will (on longer timescales, at the very minimum), annihilate everything we care about, to a greater extent than even nuclear war. That is, if we don’t act. That’s a hell of a claim to believe, and it takes even more to be willing to do something about it.
Yes, it is, and it should take a hell of an argument to make someone believe this. The fact that many people don’t quite believe this (on a gut level, at least) suggests that there are still many arguments to be made (alternatively, it might not be true).
it has seemed like most people agree on FAI being super important.
There are many people here who believe it, who spend a lot of time thinking about it, and who also happen to be active users on LW, which might skew your perception of the average user. I suspect that there are many people who find the arguments a little fishy but don’t quite know what’s wrong with them. At the very least, there is me.
I’m also in this position at the moment, but in part due to this post I now plan to spend significant time (at least 5 days, i.e. 40 hrs, cumulatively) doing deep reflection on timelines this summer, with the goal of making my model detailed enough to make life decisions based on it. (Consider this my public precommitment to do so; I will at minimum post a confirmation of whether or not I have done so on my shortform feed by August 15th, and if it seems useful I may also post a writeup of my reflections).
Suppose the Manhattan Project were currently in progress, meaning we somehow had the internet, mobile phones, etc., but not nuclear bombs. You are a smart physicist who keeps up with progress in many areas of physics, and at some point you realize the possibility of a nuclear bomb. You also foresee the existential risk this poses.
You manage to convince a small group of people of this, but many people are skeptical and point out the technical hurdles that would need to be overcome, and political decisions that would need to be taken, for the existential risk to become reality. They think it will all blow over and work itself out. And most people fail to grasp enough of the details to have a rational opinion about the topic.
How would you (need to) go about convincing enough of the right people that this development poses an existential risk?
Would you subsequently try to convince them we should preemptively push for treaties, and aggressive enforcement of those treaties, to prevent the annihilation of the human species? How would you get them to cooperate? Would you try to convince them to put as much effort as possible into a Manhattan-style project to develop an FAI that can subsequently prevent any other AI from becoming powerful enough to threaten them? Another approach?
I’m probably treading well-trodden ground, but it seems to me that knowledge about AI safety is not what matters. What matters is convincing enough sufficiently powerful people that we need such knowledge before AGI becomes reality. That should result in regulating AI development, or urgently pushing to obtain knowledge on AI safety, or …
Without such people involved, the net effect of the whole FAI community is a best-effort skunkworks project attempting to uncover FAI knowledge, disseminate it as widely as possible, and pray to God that those first achieving AGI will actually make use of that knowledge. Or perhaps attempting to beat Google, the NSA, or China to it. That seems like a hell of a gamble to me, and although it is much more within the comfort zone of the community, it is vastly less likely to succeed than convincing Important People.
But I admit that I am clueless as to how that should be done. It’s just that it makes “set aside three years of your life to invest in AI safety research” ring pretty desperate and suboptimal to me.
But I admit that I am clueless as to how that should be done. It’s just that it makes “set aside three years of your life to invest in AI safety research” ring pretty desperate and suboptimal to me.
I think this sentence actually contains my own answer, basically. I didn’t say “invest three years of your life in AI safety research.” (I realize looking back that I didn’t clearly *not* say that, so this misunderstanding is on me and I’ll consider rewriting that section). What I meant to say was:
Get three years of runway (note: this does not mean you’re quitting your job for three years, it means that you have 3 years of runway so you can quit your job for 1 or 2 years before starting to feel antsy about not having enough money)
Quit your job or arrange your life such that you have time to think clearly
Figure out what’s going on (this involves keeping up on industry trends and understanding them well enough to know what they mean, keeping up on AI safety community discourse, and following relevant bits of politics in government, corporations, etc.)
Figure out what to do (including what skills you need to gain in order to be able to do it)
Do it
i.e., the first step is to become not clueless. And then step 2 depends a lot on your existing skillset. I specifically am not saying to go into AI safety research (although I realize it may have looked that way). I’m asserting that some minimum threshold of technical literacy is necessary to make serious contributions in any domain.
Do you want to persuade powerful people to help? You’ll need to know what you’re talking about.
Do you want to direct funding to the right places? You’ll need to understand what’s going on well enough to know what needs funding.
Do you want to be a cog in an organization where you mostly work like a normal person but help move progress forward? You’ll need to know enough about what’s going on to pick an organization where you’ll be a marginally beneficial cog.
The question isn’t “what is the optimal thing for AI risk people collectively to do”. It’s “what is the optimal thing for you in particular to do, given that the AI risk community exists.” In the past 10 years, the AI risk community has gone from a few online discussion groups to a collection of orgs with millions of dollars in current funding; funders who have millions or billions more; and, as of this week, Henry Kissinger endorsing AI risk as important.
In that context, “what is the best marginal contribution you personally can make to one of the most important problems humanity will face?” is a difficult question.
The thesis of this post is that taking that question seriously requires a lot of time to think, and that because money is less of a limiting bottleneck now, you are more useful on the margin as a person who has carved out enough time to think seriously than as an Earning-to-Give person.
If you’re not saying to go into AI safety research, what non-business-as-usual course of action are you expecting? Is your premise that everyone taking this seriously should figure out their comparative advantage within an AI risk organization, since such organizations contain many non-researcher roles, or are you imagining some potential course of action outside of “give your time/money to MIRI/CHAI/etc”?
Is your premise that everyone taking this seriously should figure out their comparative advantage within an AI risk organization, since such organizations contain many non-researcher roles
Yes, basically. One of the specific possibilities I alluded to was taking on managerial or entrepreneurial roles, here:
So people like me can’t just hand complicated assignments off and trust they get done competently. Someone might understand the theory but not get the political nuances they need to do something useful with the theory. Or they get the political nuances, and maybe get the theory at-the-time, but aren’t keeping up with the evolving technical landscape.
The thesis of the post is intended to be “donating to MIRI/CHAI etc. is not the most useful thing you can be doing”.
FWIW, relatedly on the object-level, there’s already a weekly AI safety reading group which people can join.
Maybe this post by Nate Soares? (Regarding the autodidact post mentioned above.)