feels like it’s setting up weak-men on an issue where I disagree with you, but in a way that’s particularly hard to engage with
My best guess as to why it might feel like this is that you think I’m laying groundwork for some argument of the form “P(doom) is very high”, which you want to nip in the bud, but are having trouble nipping in the bud here because I’m building a motte (“cosmopolitan values don’t come free”) that I’ll later use to defend a bailey (“cosmopolitan values don’t come cheap”).
This misunderstands me (which is a separate claim from the claim “and you’re definitely implying this”).
The impetus for this post is all the cases where I argue “we need to align AI” and people retort with “But why do you want it to have our values instead of some other values? What makes the things that humans care about so great? Why are you so biased towards values that you personally can understand?”. My guess is that many of those objections come from a place of buying into broad cosmopolitan value much more than any particular local human desire.
And all I’m trying to do here is say that I’m on board with buying into broad cosmopolitan value more than any particular local human desire, and that I still think we’re in trouble (by default).
I’m not trying to play 4D chess here; I’m just trying to get some literal basic obvious stuff down on (e-)paper, in short posts that don’t have a whole ton of dependencies.
Separately, treating your suggestions as if they were questions that you were asking for answers to:
I’ve recently seen this argument pop up in person with econ folk, crypto folk, and longevity folk, and have also seen it appear on twitter.
I’m not really writing with an “intended audience” in mind; I’m just trying to get the basics down, somewhere concise and with few dependencies. The closest thing to an “intended audience” might be the ability to reference this post by link or name in the future, when I encounter the argument again. (Or perhaps it’s “whatever distribution the econ/crypto/longevity/twitter people are drawn from, insofar as some of them have eyes on LW these days”.)
If you want more info about this, maybe try googling “fragility of value lesswrong” or “metaethics sequence lesswrong”. Earth doesn’t really have good tools for aggregating arguments and justifications at this level of specificity, so if you want better and more localized links than that, you’ll probably need to develop more civilizational infrastructure first.
My epistemic status on this is “obvious-once-pointed-out”; my causal reason for believing it was that it was pointed out to me (e.g. in the LessWrong sequences); I think Eliezer’s arguments are basically just correct.
Separately, I hereby push back against the idea that posts like this should put significant effort into laying out the justifications (which is not necessarily what you’re advocating). I agree that there’s value in that; I think it leads to something like the LessWrong sequences (which I think were great); but I think that what we need more of on the margin right now is people laying out the most basic positions without fluff.
That said, I agree that the post would be stronger with a link to a place where lots of justifications have been laid out (despite those being justifications for slightly different points, and being intertwined with justifications for wholly different points, which is just how things look in a civilization that doesn’t have good infrastructure for centralizing arguments in the way that Wikipedia is civilizational infrastructure for centralizing settled facts), and so I’ve edited in a link.
My best guess as to why it might feel like this is that you think I’m laying groundwork for some argument of the form “P(doom) is very high”, which you want to nip in the bud, but are having trouble nipping in the bud here because I’m building a motte (“cosmopolitan values don’t come free”) that I’ll later use to defend a bailey (“cosmopolitan values don’t come cheap”).
I expect that you personally won’t do a motte-and-bailey here (except perhaps insofar as you later draw on posts like these as evidence that the doomer view has been laid out in a lot of different places, when this isn’t in fact the part of the doomer view relevant to ongoing debates in the field).
But I do think that the “free vs cheap” distinction will obscure more than it clarifies, because there is only an epsilon difference between them; and because I expect a mob-and-bailey where many people cite the claim that “cosmopolitan values don’t come free” as evidence in debates that should properly be about whether cosmopolitan values come cheap. This is how weak men work in general.
Versions of this post that I wouldn’t object to in this way include:
A version which is mainly framed as a conceptual distinction rather than an empirical claim
A version which says upfront “this post is not relevant to most informed debates about alignment, it’s instead intended to be relevant in the following context:”
A version which identifies that there’s a different but similar-sounding debate which is actually being held between people informed about the field, and says true things about the positions of your opponents in that debate and how they are different from the extreme caricatures in this post