People like Ezra Klein are hearing Eliezer and rolling his position into their own, more palatable takes. I really don’t think everyone needs to play that game; it seems genuinely good to have someone out there just speaking honestly, even if they’re far out on the pessimistic tail, so others can see what’s possible. 4D chess here seems likely to fail.
https://steno.ai/the-ezra-klein-show/my-view-on-ai
Also, there’s a sentiment going around that normies who hear this are actually way more open to the simple AI Safety case than you’d expect; we’ve been extrapolating too much from current critics. Tech people have had years to formulate rationalizations and reassure one another that they’re clever skeptics for dismissing this stuff. Meanwhile, regular folks will often casually proclaim that the world is likely ending due to climate change or social decay or whatever; they seem to err on the side of doomerism as often as the opposite. The fact that Eliezer got published in TIME is already a huge point in favor of his strategy working.
EDIT: Case in point! I met a person tonight, a completely offline, rural, anti-vax, astrology, doesn’t-follow-the-news type of person. I said the word “AI” and she immediately said she thinks “robots will eventually take over”. I understand this might not be the level of sophistication we’d desire, but at least be aware that the raw material is out there. No idea how it’ll play out, but 4D chess still seems like a mistake; let Yud speak his truth.
This is not a good thing, under my model, given that I don’t agree with doomerism.
You disagree with doomerism as a mindset, as a factual likelihood, or both?
I think doomerism as a mindset isn’t great, but in terms of likelihood, there are ~3 things likely to kill humanity at the moment, with AI being the first.
Both as a mindset and as a factual likelihood.
On mindset, I agree that doomerism isn’t good, primarily because it can close your mind off to real solutions to a problem and make you over-update toward the overly pessimistic view.
As a factual matter, I also disagree with high p(Doom) estimates; mine is at most 10%, if not lower.
As for object-level arguments for why I disagree with the doom take, here they are:
I disagree with the Yudkowskian assumption that certain abstractions just don’t scale well when we crank up capabilities. I remember a post that did interpretability on AlphaZero and found that it has essentially human-interpretable abstractions, which, at least for the case of Go, disproved that Yudkowskian notion.
I am quite a bit more optimistic about scalable alignment than many in the LW community, and recent work showed that as AI got more data, it became more aligned with human goals. There are many other results in that recent work, but the fact that alignment scaled up as a certain capability scaled up means the trend for alignment is positive, and more capable models will probably be more aligned.
Finally, trend lines. There’s a saying inspired by Atomic Habits: the trend line matters more than how much progress you make in a single sitting. In the case of alignment, that trend line is positive but slow, which means we are in an extremely good position to speed it up. It also means we should be far less worried about doom, as we just have to improve the trend line of alignment progress and wait.
Edit: My first point is, at best, partially correct, and may need to be removed altogether due to a new paper called Adversarial Policies Beat Superhuman Go AIs.
Link below:
https://arxiv.org/abs/2211.00241
All other points stand.
The recent Adversarial Policies Beat Superhuman Go AIs paper seems to cast doubt on how well those abstractions generalize in the case of Go.
I’ll admit, that is a fairly big blow to my first point, though the rest of my points stand. I’ll edit the comment to mention your debunking of it.
I think that calling a mindset ‘poor’ implies that it causes one to arrive at false conclusions more often.
If doomerism isn’t a good mindset, then besides making one depressed, fearful, and pessimistic about the future, it should also be contradicted by empirical data and by how events have actually unfolded over time.
Personally, I think it’s pretty easy to show that pessimism (the belief that certain objectives are impossible, or doomed to cause catastrophic, unrecoverable failure) is wrong. Even more easily argued is that believing one’s objective is unlikely or impossible cannot make one more likely to achieve it. I would define ‘poor’ mindsets as those equivalent to the latter to some significant degree.