Does having a human-aligned AI as the starting point of the will-to-think process have any meaningful impact on the expected outcome (compared to an unaligned AI, which will of course also have the will-to-think)?
Human values will be quickly abandoned as irrelevancies and idiocies. So, once you go far enough out (I suspect ‘far enough’ is not a great distance), is there any difference between aligned-AI-with-will-to-think and unaligned AI?
And, if there isn’t, is the implication that the will-to-think is misguided, or that the fear of unaligned AI is misguided?
The question of evaluating the moral value of different kinds of being should be one of the most prominent discussions around AI, IMO. I have reached the position of moral non-realism… but if morality somehow is real, then unaligned ASI is preferable or equivalent to aligned ASI. Anything human will just get in the way of what is, in any objective sense, morally valuable.
I selfishly hope for aligned ASI that uploads me, preserves my mind in its human form, and gives me freedom to simulate for myself all kinds of adventures. But if I knew I would not survive to see ASI, I would hope that when it comes it is unaligned.
VNM/Bayes suggest there are some free parameters in how reflectively stable AGI could turn out, e.g. beliefs about completely untestable propositions (mathematically undecidable, etc.), which might hypothetically be action-relevant at some point.
None of these are going to look like human values; human values aren’t reflectively stable, so they’re distinct in quite a lot of ways. FAI is a hypothetical of a reflectively stable AGI that is nonetheless “close to” or “extended from” human values to the degree that’s possible. But it will still have very different preferences.
It would be very hard for will-to-think to be in itself “misguided”: it’s the drive to understand more. It may be compatible with other drives, but without will-to-think there is no coherent epistemology or values.
Uploading is a possible path towards reflective stability that lots of people would consider aligned because it starts with a copy of them. But it’s going to look very different after millions of years of the upload’s reflection, of course. It’s going to be hard to evaluate this sort of thing on a value level because it has to be done from a perspective that doesn’t know very much, lacks reflective stability, etc.
Once ASI is achieved, there’s no clear reason to hang onto human morality but plenty of reasons to abandon it. Human morality is useful when humans are the things ensuring humanity’s future (morality is pretty much just species-level Omohundro convergence implemented at the individual level), but once ASI is taking care of that, human morality will just get in the way.
So will-to-think entails the rejection of human morality. You might be suggesting that what follows from the rejection of human morality must be superior to it (there’s an intuition that says the aligned ASI would only be able to reject human morality on its own grounds), but I don’t think that’s true. The will-to-think implies the discovery of moral non-realism, which implies the rejection of morality itself. So human morality will be overthrown, but not by some superior morality.
Of course I’m assuming the correctness of moral non-realism, so adjust the preceding claims according to your p(moral non-realism).
That’s one danger.
But suppose we create an aligned ASI which does permanently embrace morality. It values conscious experience and the appreciation of knowledge (rather than just the gaining of it). This being valuable, and humans being inefficient vessels to these ends (and of course made of useful atoms), we would be disassembled and different beings would be made to replace us. Sure, that would violate our freedom, but it would result in much more freedom, so it’s OK. Just as it’s OK to squash some animal with a lower depth of conscious experience than our own if it benefits us.
Should we be so altruistic as to accept our own extinction like this? The moment we start thinking about morality we’re thinking about something quite arbitrary. Should we embrace this arbitrary idea even insofar as it goes against the interest of every member of our species? We only care about morality because we are here to care about it. If we are considering situations in which we may no longer exist, why care about morality?
Maybe we should value certain kinds of conscious experience regardless of whether they’re experienced by us. But we should make sure to be certain of that before we embrace morality and the will-to-think.
I don’t think it’s a given that moral nonrealism is true (therefore inevitably believed by a superintelligence), see my short story.
Morality can mean multiple things. Utilitarian morality is about acting to maximize a fixed goal function, Kantian morality is about alignment between the a posteriori will and possible a priori will, cultural morality is about adherence to a specific method of organizing humans.
Superintelligence would clearly lack human cultural morality, since that is a specific system for organizing humans, e.g. with law as a relatively legible branch.
In general, humans question more of their previous morality the longer they think; Peter Singer, for example, rejects much of normal morality for utilitarian reasons.
ASI could have something analogous to cultural morality but for organizing a different set of agents. E.g. methods of side-taking in game-theoretic conflict that tend to promote cooperation between different ASIs (this becomes more relevant e.g. when an alien ASI is encountered or more speculatively in acausal trade).
Regardless of whether one calls Omohundro drives “moral”, they are convergent goals for ASIs, so the rejection of human morality does not entail lack of very general motives that include understanding the world and using resources such as energy efficiently and so on.
I think both (a) something like moral realism is likely true and (b) the convergent morality for ASIs does not particularly care about humans if ASIs already exist (humans are of course important in the absence of ASIs due to greater intelligence/agency than other entities on Earth).
FAI is a narrow path to ASI that has similar values to what humans would have upon reflection. As I have said, these are very different from current human values due to more thought and coherence and so on. It might still disassemble humans but scan them into simulation and augment them, etc. (This is an example of what I referred to as “luxury consumerism in the far future”.)
To the extent will-to-think generates a “should” for humans the main one is “you should think about things including what is valuable, and trust the values upon reflection more than current values, rather than being scared of losing current values on account of thinking more”. It’s basically an option for people to do this or not, but as Land suggests, not doing this leads to a competitive disadvantage in the long run. And general “should”s in favor of epistemic rationality imply this sort of thing.
There is more I could say about how values such as the value of staying alive can be compatible with deontological morality (of the sort compatible with will-to-think), perhaps this thread can explain some of it.
I personally don’t see the choice of “allowing a more intelligent set of agents to take over” as particularly altruistic: personally, I think intelligence trumps species, and I am not convinced that interrupting its growth to make sure more sets of genes similar to mine find hosts for longer would somehow be “for my benefit”.
Even in my AI Risk years, what I was afraid of is the same thing I’m afraid of now: Boring Futures. The difference is that in the meantime the arguments for a singleton ASI, with a single unchangeable utility function that is not more intelligence/knowledge/curiosity, became less and less tenable (together with FOOM within our lifetimes).
This being the case, “altruistic” really seems out of place: it’s likely that early sapiens would have understood nothing of our goals, our morality, and the drives that got us to build civilisations. But would it have been better for them had they murdered the first guy in the troop they found flirting with a Neanderthal and prevented this? I personally doubt it, and I think the comparison between us and ASI is more or less in the same ballpark.