As a layperson, the problem has been that my ability to figure out what’s true relies on being able to evaluate subject-matter experts’ respective reliability on the technical elements of alignment. I’ve lurked in this community a long time; I’ve read the Sequences and watched the Robert Miles videos. I can offer a passing explanation of what the corrigibility problem is, or why ELK might be important.
None of that seems to count for much. Yitz made what I thought was a very lucid post from a similar level of knowledge, trying to bridge that gap, and got mostly answers that didn’t tell me (or, as best I can tell, them) anything in concept I wasn’t already aware of, plus Eliezer himself being kind of hostile in response to someone trying to understand.
So here I find myself in the worst of both worlds; the apparent plurality of the LessWrong commentariat says I’m going to die, and that to maximise my chance to die with dignity I should quit my job and take out a bunch of loans to try to turbo through an advanced degree in machine learning, and I don’t have the tools to evaluate whether they’re right.
I agree. I find myself in an epistemic state somewhat like: “I see some good arguments for X. I can’t think of any particular counter-argument that makes me confident that X is false. If X is true, it implies there are high-value ways of spending my time that I am not currently pursuing. Plenty of smart people I know/read believe X, but plenty do not.”
It sounds like that should maybe be enough to coax me into taking action about X. But the problem is that I don’t think it’s that hard to put me in this kind of epistemic state. E.g., if I were to read the right blogs, I think I could be brought into that state for a bunch of different values of X. A few off the top of my head that seem plausible:
Climate change
Monetary policy/hyperinflation
Animal suffering
So I don’t feel super trusting of my epistemic state. I guess I feel a sort of epistemic learned helplessness, where I am suspicious of smart bloggers’ ability to get me to think an issue is important and worth dedicating my life to.
Not totally sure how to resolve this, though I suppose it would involve some sort of “going off on my own and actually thinking deeply about what it should take to convince me”
I feel the same. I think there are just a lot of problems one could try to solve that would increase the good in the world. The difference between alignment and the rest seems to be that the probability of humans going extinct is much higher.
That part really shouldn’t be necessary (even if it may be rational, conditional on some assumptions). In the event that you do decide to devote your time to helping, whether for dignity or whatever else, you should be able to get funding to cover most reasonable forms of upskilling and/or a seeing-if-you-can-help trial period.
That said, I think step one would be to figure out where your comparative advantage lies (80,000 Hours folk may have thoughts, among others). Certainly some people should be upskilling in ML/CS/Math—though an advanced degree may not be the most efficient route—but there are other ways to help.
I realize this doesn’t address the deciding-what’s-true aspect.
I’d note there that I don’t think much detailed ML knowledge is necessary to follow Eliezer’s arguments on this. Most of the ML-dependent parts can be summarized as [we don’t know how to do X], [we don’t have any clear plan that we expect will tell us how to do X], similarly for Y and Z, and [either X, Y, or Z is necessary for safe AGI].
Beyond that, I think you only need a low prior on our bumping into a good solution while fumbling in the dark and a low prior on sufficient coordination, and things look quite gloomy. Probably you also need to throw in some pessimism on getting safe AI systems to fundamentally improve our alignment research.
Hi Joe! I wonder if you have any pointers as to how to get help? I would like to try to help while being able to pay for rent and food. I think right now I may not be articulate enough to write grant proposals and get funding, so I think I could also use somebody to talk to in order to figure out the highest-impact thing I could do.
I wonder if you’d be willing to chat / know anybody who is?
Something like the 80,000 Hours career advice seems like a good place to start—or finding anyone who has a good understanding of the range of possibilities (mine is a bit too narrowly slanted towards technical AIS).
If you’ve decided on the AIS direction, then AI Safety Support is worth a look—they do personal calls for advice, and have many helpful links.
That said, I wouldn’t let the idea of “grant proposals” put you off. The forms you’d need to fill in for the LTFF are not particularly complicated, and they do give grants for e.g. upskilling—you don’t necessarily need a highly specific/detailed plan.
If you don’t have a clear idea where you might fit in, then the advice links above should help.
If/when you do have a clear idea, don’t worry about whether you can articulate it persuasively. If it makes sense, then people will be glad to hear it—and to give you pointers (e.g. fund managers).
E.g. there’s this from Evan Hubinger (who helps with the LTFF):
> if you have any idea of any way in which you think you could use money to help the long-term future, but aren’t currently planning on applying for a grant from any grant-making organization, I want to hear about it. Feel free to send me a private message on the EA Forum or LessWrong. I promise I’m not that intimidating :)
Also worth bearing in mind as a general principle that if almost everything you try succeeds, you’re not trying enough challenging things. Just make sure to take negative outcomes as useful information (often you can ask for specific feedback too). There’s a psychological balance to be struck here, but trying at least a little more than you’re comfortable with will generally expand your comfort zone and widen your options.
Thank you so much! I didn’t know 80k does advising! In terms of people with knowledge on the possibilities… I have a background and a career path that doesn’t end up giving me a lot of access to people who know, so I’ll definitely try to get help at 80k.
This was very encouraging! Thank you.
This is probably pretty tangential to the overall point of your post, but you definitely don’t need to take loans for this, since you could apply for funding from Open Philanthropy’s early-career funding for individuals interested in improving the long-term future or the Long-Term Future Fund.
You don’t have to have a degree in machine learning. Besides machine learning engineering or machine learning research there are plenty of other ways to help reduce existential risk from AI, such as:
software engineering at Redwood Research or Anthropic
independent alignment research
operations for Redwood Research, Encultured AI, Stanford Existential Risks Initiative, etc.
community-building work for a local AI safety group (e.g., at MIT or Oxford)
AI governance research
or something part-time like participating in the EA Cambridge AGI Safety Fundamentals program and then facilitating for it
Personally, my estimate of the probability of doom is much lower than Eliezer’s, but in any case, I think it’s worthwhile to carefully consider how to maximize your positive impact on the world, whether that involves reducing existential risk from AI or not.
I’d second the recommendation for applying for career advising from 80,000 Hours or scheduling a call with AI Safety Support if you’re open to working on AI safety.
I can’t help with the object level determination, but I think you may be overrating both the balance and import of the second-order evidence.
As far as I can tell, Yudkowsky is a (perhaps dramatically) pessimistic outlier among the class of “rationalist/rationalist-adjacent” SMEs in AI safety, and probably even more so relative to aggregate opinion without an LW-y filter applied (cf.). My impression of the epistemic track record is that Yudkowsky has a tendency to stake out positions (both within and outside AI) with striking levels of confidence but not commensurately striking levels of accuracy.
In essence, I doubt there’s much epistemic reason to defer to Yudkowsky more (or much more) than to folks like Carl Shulman or Paul Christiano, and maybe not much more than to “a random AI alignment researcher” or “a superforecaster making a guess after watching a few Rob Miles videos” (although these comparisons carry a few implied premises around difficulty curves and subject-matter expertise being relatively uncorrelated with judgemental accuracy).
I suggest ~all reasonable attempts at an idealised aggregate wouldn’t take a hand-brake turn to extreme pessimism on finding out how pessimistic Yudkowsky is. My impression is that the plurality LW view has shifted more from “pretty worried” to “pessimistic” (e.g. p(screwed) > 0.4) than towards agreement with Yudkowsky, but in any case I’d attribute large shifts in this aggregate mostly to Yudkowsky’s cultural influence on the LW community, plus some degree of internet cabin fever (and selection) distorting collective judgement.
None of this is cause for complacency: even if p(screwed) isn’t ~1, > 0.1 (or 0.001) is ample cause for concern, and resolution on values between (say) [0.1, 0.9] is informative for many things (like personal career choice). I’m not sure whether you get more yield for marginal effort on object-level or second-order uncertainty (e.g. my impression is that the ‘LW cluster’ trends towards pessimism, so adjudicating whether this cluster should be over- or under-weighted could be more informative than trying to get up to speed on ELK). I would guess, though, that whatever distils out of LW discourse in 1-2 months will be much more useful than what you’d get right now.
> the class of “rationalist/rationalist-adjacent” SMEs in AI safety,
What’s an SME?
It stands for “subject-matter expert”.
If Yudkowsky is needlessly pessimistic, I guess we get an extra decade of time. How are we going to use it? Ten years later, will we feel just as hopeless as today, and hope that we get another extra decade?
This phrasing bothers me a bit. It presupposes that it is only a matter of time; that there’s no error about the nature of the threat AGI poses, and no order-of-magnitude error in the timeline. The pessimism is basically baked in.
Fair point. We might get an extra century. Until then, it may turn out that we can somehow deal with the problem, for example by having a competent and benevolent world government that can actually prevent the development of superhuman AIs (perhaps by using millions of exactly-human-level AIs who keep each other in check and together endlessly scan all computers on the planet).
I mean, a superhuman AI is definitely going to be a problem of some kind, at least economically and politically. But in the best case, we may be able to deal with it, either because we somehow get more competent quickly, or because we have enough time to become more competent gradually.
Maybe even this is needlessly pessimistic, but if so, I don’t see how.
I’m sympathetic to the position you feel you’re in. I’m sorry it’s currently like that.
I think you should be quite convinced by the point at which you’re taking out loans to study, and the apparent plurality of the LessWrong commentariat is unlikely to be sufficient evidence to reach that level of conviction – just my feeling.
I’m hoping some more detailed arguments for doom will be posted in the near future and that will help many people reach their own conclusions not based on information cascades, etc.
Lastly, I do think people should be more “creative” in finding ways to boost the log odds of survival. Direct research might make sense for some, but if you’d need to go back to school for it, there are maybe other things you should brainstorm and consider.
Sorry if this is a silly question, but what exactly are “log odds” and what do they mean in this context?
Odds are an alternative way of presenting probabilities. 50% corresponds to 1:1, 66.66..% corresponds to 1:2, 90% corresponds to 1:9, etc. 33.33..% corresponds to 2:1 odds, or, with the first number as a 1, 1:0.5 odds.
Log odds are the logarithm of the odds (the x in 1:x); taking the logarithm base 2 expresses them in bits. In some cases, they can be a more natural way of thinking about probabilities (see e.g., here.)
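If it helps to see the arithmetic, here’s a minimal Python sketch of the conversion described above (the function names are my own, purely illustrative): probability → 1:x odds → log odds in bits.

```python
import math

def prob_to_odds(p: float) -> float:
    """Return x such that probability p is written as 1:x odds (e.g. 0.9 -> 9.0)."""
    return p / (1 - p)

def odds_to_prob(x: float) -> float:
    """Inverse conversion: 1:x odds back to a probability (e.g. 9.0 -> 0.9)."""
    return x / (1 + x)

def prob_to_log_odds_bits(p: float) -> float:
    """Log odds in bits: log base 2 of the odds. 50% -> 0 bits, 2/3 -> 1 bit."""
    return math.log2(prob_to_odds(p))

for p in (0.5, 2 / 3, 0.75, 0.9):
    print(f"p = {p:.4f}   odds = 1:{prob_to_odds(p):.2f}   "
          f"log odds = {prob_to_log_odds_bits(p):+.2f} bits")
```

Under this convention, each extra bit of log odds doubles the x in 1:x, which is part of why “boosting log odds of survival” is a natural way to talk about incremental improvements.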
I think 75% is 1:3 rather than 1:2.
Whoops, changed
(A confusing way of writing “probability”)
Well, at least one of us thinks you’re going to die and to maximize your chance to die with dignity you should quit your job, say bollocks to it all, and enjoy the sunshine while you still can!
Don’t look at opinions; look for data and facts. Speculations, opinions, or beliefs cannot be the basis on which you make decisions or update your knowledge. It’s better to know a few things, but with high confidence.
Ask yourself: what hard data points are there in favour of doom-soon?
Facts and data are of limited use without a paradigm to conceptualize them. If you have some you think are particularly illuminating, though, by all means share them here.
My main point is that there is not enough evidence for a strong claim like doom-soon. In the absence of hard data, anybody is free to cook up arguments for or against doom-soon.
You may not like my suggestion, but I would strongly advise getting deeper into the field and understanding it better yourself before taking important decisions.
In terms of paradigms, you may have a look at why AI software development is hard (easy to get to 80% accuracy, hellish to get to 99%), AI winters and hype cycles (the disconnect between claims/expectations and reality), and the development of dangerous technologies (nuclear, biotech) and how stability has been achieved.