In my view, the main good outcomes of the AI transition are: 1) we luck out, and AI x-safety turns out to be pretty easy across all the subproblems, or 2) there’s an AI pause, humans get smarter via things like embryo selection, and then solve all the safety problems.
I’m mainly pushing for #2, but also don’t want to accidentally make #1 less likely. It seems like one of the main ways in which I could end up having a negative impact is to persuade people that the problems are definitely too hard and hence not worth trying to solve, when it turns out they could have been solved with a little more effort.
“it doesn’t seem like you have answers to (or even a great path forward on) these questions either despite your great interest in and effort spent on them, which bodes quite terribly for the rest of us” is a bit worrying from this perspective, and also because my “effort spent on them” isn’t that great. As I don’t have a good approach to answering these questions, I mainly just have them in the back of my mind while my conscious effort is mostly on other things.
BTW I’m curious what your background is and how you got interested/involved in AI x-safety. It seems rare for newcomers to the space (like you seem to be) to quickly catch up on all the ideas that have been developed on LW over the years; many of those recently drawn to AGI instead appear to be stuck on positions/arguments from decades ago. For example, r/Singularity has 2.5M members and seems to be dominated by accelerationism. Do you have any insights about this? (How were you able to do this? How to help others catch up? Intelligence is probably a big factor, which is why I’m hoping that humanity will automatically handle these problems better once it gets smarter, but many seem plenty smart and still stuck on primitive ideas about AI x-safety.)
It seems like one of the main ways in which I could end up having a negative impact is to persuade people that the problems are definitely too hard and hence not worth trying to solve, when it turns out they could have been solved with a little more effort.
This isn’t unreasonable, but in order for this to be a meaningful concern, you would probably need to be close to enough people working on this topic that misleading them would have a nontrivial impact. And I guess this… just doesn’t seem to be the case (at least to an outsider like me)? Even otherwise interested and intelligent people are focusing on other stuff, and while I suppose there may be some philosophy PhDs at OpenPhil or the Future of Humanity Institute (RIP) who are thinking about such matters, they seem few and far between.
That is to say, it sure would be nice if we got to a point where the main concern was “Wei Dai is unintentionally misleading people working on this issue” instead of “there are just too few people working on this to produce useful results even absent any misleadingness”.
My path to getting here is essentially the following:
reading Ezra Klein and Matt Yglesias because they seem saner and more policy-focused than other journalists →
Yglesias writes an interesting blog post in defense of the Slate Star Codex, which I had heard of before but had never really paid much attention to →
I start reading the SSC out of curiosity, and I am very impressed by how almost every post is interesting, thoughtful, and gives me insights which I had never considered but which seemed obvious in retrospect →
Scott occasionally mentions this “rationality community” kinda-sorta centered around LW →
I start reading LW in earnest, and I enjoy the high quality of the posts (and especially of the discussions in the comments), but I mostly avoid the AI risk stuff because it seems scary and weird; I also read HPMOR, which I find to be very fun but not necessarily well-written →
I mess around with the beta version of ChatGPT around September 2022 and I am shocked by how advanced and coherent the LLM seems →
I realize the AI stuff is really important and I need to get over myself and actually take it seriously
If I had to say what allowed me to reach this point, I would say the following properties, listed in no particular order, were critical (actually writing this list out feels kinda self-masturbatory, but oh well):
I was non-conformist enough to not immediately bounce off Scott’s writing or off of LW (which contains some really strange, atypical stuff)
I loved mathematics so much that I wasn’t thrown off by everything on this site (and in fact embraced the applications of mathematical ideas in everything)
I cared about philosophy, so I didn’t shy away from meta discussions and epistemology and getting really into the weeds of confusing topics like personal identity, agency, values and preferences etc
I had enough self-awareness to not become a parody of myself and to figure out what the important topics were instead of circlejerking over rationality or getting into rational fiction or other stuff like that
I was sufficiently non-mindkilled that I was able to remain open to fundamental shifts in understanding and to change my opinion in crucial ways
My sanity was resilient enough that I didn’t just run away or veer off in crazy directions when confronted with the frankly really scary AI stuff
I was intelligent enough to understand (at least to some extent) what is going on and to think critically about these matters
I was obsessive and focused enough to read everything available on LW (and on other related sites) about AI without getting bored or tired over time.
The problem is that (at least from my perspective) all of these qualities were necessary in order for me to follow the path I did. I believe that if any single one of them had been removed while the other 7 remained, I would not have ended up caring about AI safety. As a sample illustration of what can go wrong when points 1 and 6 aren’t sufficiently satisfied but everything else is, you can check out what happened with Qiaochu (also, you probably knew him personally? so there’s that too).
As you can imagine, this strongly limits the supply of people that can think sanely and meaningfully about AI safety topics. The need to satisfy points 1, 5, and 7 above already lowers the population pool tremendously, and there are still all the other requirements to get through.
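To make the conjunctive-filter point concrete, here is a back-of-the-envelope sketch in Python. Every base rate below is a made-up placeholder chosen purely for illustration, and the independence assumption is obviously wrong in detail (several of these traits correlate), so treat the output as showing the shape of the bottleneck rather than an actual estimate.

```python
# Back-of-the-envelope sketch: when several traits are all required at once,
# the fraction of people passing every filter shrinks multiplicatively.
# Every base rate below is a made-up illustrative placeholder, not an estimate.
from math import prod

base_rates = {
    "non-conformist enough": 0.20,
    "comfortable with heavy math": 0.20,
    "cares about philosophy": 0.20,
    "self-aware about what matters": 0.30,
    "non-mindkilled": 0.20,
    "resilient sanity": 0.50,
    "sufficiently intelligent": 0.15,
    "obsessive/focused reader": 0.10,
}

# Rough independence assumption (clearly false in detail; many of these correlate).
joint = prod(base_rates.values())
print(f"Fraction passing all eight filters: {joint:.1e}")       # ~3.6e-06
print(f"Out of ~8 billion people: ~{joint * 8e9:,.0f} people")  # ~28,800
```

With realistic correlations between the traits the pool would be larger than this, but the multiplicative character of the bottleneck is the point the paragraph above is making.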
For example, r/Singularity has 2.5M members and seems to be dominated by accelerationism.
Man, these guys… I get the impression that they are mindkilled in a very literal political sense. They seem to desperately await the arrival of the glorious Singularity that will free them from the oppression and horrors of Modernity and Capitalism. Of course, the fact that they are living better material lives than 99.9% of humans that have ever existed doesn’t seem to register, and I guess you can call this the epitome of entitlement and privilege.
But I don’t think most of them are bad people (insofar as it even makes sense to call a person bad). They just live in a very secure epistemic bubble that filters everything they read and think about and prevents them from ever touching reality. I’ve written similar stuff about this before:
society has always done, is doing, and likely will always do a tremendous job of getting us to self-sort into groups of people that are very similar to us in terms of culture, social and aesthetic preferences, and political leanings. This process has only been further augmented by the rise of social media and the entrenchment of large and self-sustaining information bubbles. In broad terms, people do not like to talk about the downsides of their proposed policies or general beliefs, and even more importantly, they do not communicate these downsides to other members of the bubble. Combined with the present reality of an ever-growing proportion of the population that relies almost entirely on the statements and attitudes of high-status members of the in-group as indicators of how to react to any piece of news, this leads to a rather remarkable equilibrium, in which otherwise sane individuals genuinely believe that the policies and goals they propose have 100% upside and 0% downside, and the only reason they don’t get implemented in the real world is because of wicked and stupid people on the other side who are evil or dumb enough to support policies that have 100% downside and 0% upside.

Trade-offs are an inevitable consequence of any discussion about meaningful changes to our existing system; simple Pareto improvements are extremely rare. However, widespread knowledge or admission of the existence of trade-offs does not just appear out of nowhere; in order for this reality to be acknowledged, it must be the case that people are exposed to counterarguments (or at the very least calls for caution) to the most extreme versions of in-group beliefs by trusted members of the in-group, because everyone else will be ignored or dismissed as bad-faith supporters of the opposition. Due to the dynamic mentioned earlier, this happens less and less, and beliefs get reinforced into becoming more and more extreme. Human beings thus end up with genuine (although self-serving and biased) convictions and beliefs that a neutral observer could nonetheless readily identify as irrational or nonsensical.

In the past, there used to be a moderating effect due to the much more shared nature of pop culture and group identity: if you were already predisposed to adopt extreme views, it was unlikely for you to find other similarly situated people in your neighborhood or coalition, as most groups you could belong to were far more mainstream and thus moderate. But now the Internet allows you to turn all that on its head with just a few mouse clicks; after all, no matter what intuition you may have about any slightly popular topic, there is very likely some community out there ready to take you in and tell you how smart and brave you are for thinking the right thoughts and not being one of the crazy, bad people who disagree.
The sanity waterline is very low and only getting lower. As lc has said, “the vast majority of people alive today are the effective mental subjects of some religion, political party, national identity, or combination of the three”. I would have hoped that CFAR was trying to solve that, but that apparently was not close to being true even though it was repeatedly advertised as aiming to “help people develop the abilities that let them meaningfully assist with the world’s most important problems, by improving their ability to arrive at accurate beliefs, act effectively in the real world, and sustainably care about that world” by “widen[ing] the bottleneck on thinking better and doing more.” I guess the actual point of CFAR (there was a super long twitter thread by Qiaochu on this at some point) was to give the appearance of being about rationality while the underlying goal was to nerd-snipe young math-inclined students to go work on mathematical alignment at MIRI? Anyway, I’m slightly veering off-topic.
How to help others catch up?
I don’t have a good answer to this question. Due to the considerations mentioned earlier, the most effective short-term way to get people who could contribute anything useful into AI safety is through selective rather than corrective or structural means, but there are just too few people who fit the requirements for this to scale nicely.
Over the long term, you can try to reverse the trends in general societal inadequacy and sanity, but this seems really hard, it should have been started 20 years ago, and in any case it requires actual decades before you get meaningful outputs.
I’ll think about this some more and I’ll let you know if I have anything else to say.
Thanks for your insightful answers. You may want to make a top-level post on this topic to get more visibility. If only a very small fraction of the world is likely to ever understand and take into account many important ideas/considerations about AI x-safety, that changes the strategic picture considerably, and people around here may not be sufficiently “pricing it in”. I think I’m still in the process of updating on this myself.
Having more intelligence seems to directly or indirectly improve at least half of the items on your list. So doing an AI pause and waiting for (or encouraging) humans to become smarter still seems like the best strategy. Any thoughts on this?
And I guess this… just doesn’t seem to be the case (at least to an outsider like me)?
I may be too sensitive about unintentionally causing harm, after observing many others do this. I was also just responding to what you said earlier, where it seemed like I was maybe causing you personally to be too pessimistic about contributing to solving the problems.
you probably knew him personally?
No, I never met him and didn’t interact online much. He does seem like a good example of what you’re talking about.
Could that just shift the problem a bit? If we get a class of really smart people, they could subjugate everyone else pretty easily too, perhaps even better than some AGI, since they start with a really good understanding of human nature, cultures, and failings, and of how to exploit them for their own purposes. Or they could simply be better suited to taking advantage of, and surviving with, a more dangerous AI on the loose. We end up in some hybrid world where humanity is not extinct but most people’s lives are pretty poor.
I suppose one might say that the speed and magnitude of the advances here might be such that we get to corrigible AI before we get incorrigible superhumans.
I’m curious about your thoughts.
Quick caveat: I’m not trying to say that all futures are bleak and no efforts lead where we want. I’m actually pretty positive about our future, even with AI (perhaps naively). We clearly already live in a world where the most intelligent could be said to “rule”, yet the rest of us average Joes are not slaves or serfs everywhere. Where problems exist, it’s more a matter of cultural and legal failings than of outright subjugation by the brighter bulbs. But going back to the darker side here, the ones that tend to successfully exploit, game, or ignore the rules are the smarter ones in the room.
If governments subsidize embryo selection, we should get a general uplift of everyone’s IQ (or at least of everyone who decides to participate), so the resulting social dynamics shouldn’t be too different from today’s. Repeat that for a few generations, then build AGI (or debate/decide what else to do next). That’s the best scenario I can think of (aside from the “we luck out” ones).
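As a rough illustration of the kind of per-generation effect this scenario leans on, here is a minimal Monte Carlo sketch of a single round of embryo selection on a polygenic predictor. Every parameter (number of embryos, predictor accuracy, cohort size) is a hypothetical placeholder rather than a claim from this thread, and the model deliberately ignores complications such as reduced within-family variance, so it sketches the shape of the argument rather than making a forecast.

```python
# Minimal Monte Carlo sketch of one round of embryo selection on a polygenic
# predictor. All parameters are hypothetical placeholders, not empirical claims,
# and the model ignores complications such as reduced within-family variance.
import numpy as np

rng = np.random.default_rng(0)

N_EMBRYOS = 10       # hypothetical: viable embryos to choose among per couple
PREDICTOR_R = 0.3    # hypothetical: correlation of the polygenic score with adult IQ
IQ_SD = 15.0         # IQ standard deviation (by construction of the IQ scale)
N_COUPLES = 200_000  # Monte Carlo sample size

# Standardized "true" IQ deviations of each embryo, and noisy score estimates
# that correlate with them at PREDICTOR_R.
true_dev = rng.standard_normal((N_COUPLES, N_EMBRYOS))
noise = rng.standard_normal((N_COUPLES, N_EMBRYOS))
score = PREDICTOR_R * true_dev + np.sqrt(1 - PREDICTOR_R**2) * noise

# Pick the embryo with the highest score in each family and look up its true deviation.
best_idx = score.argmax(axis=1)[:, None]
chosen_dev = np.take_along_axis(true_dev, best_idx, axis=1)

gain = IQ_SD * chosen_dev.mean()
print(f"Expected gain per generation under these assumptions: ~{gain:.1f} IQ points")
```

Under these made-up numbers the expected gain comes out to roughly 7 IQ points per generation, which fits the “repeat that for a few generations” framing above.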