It seems like this example would in some ways work better if the model organism were mice rather than bacteria, because bacteria probably do not even have values to begin with (so inconsistency isn’t the issue), nor any internal experience.
With mice, though (or perhaps even roundworms, since it’s at least more conceivable that they could actually have preferences), the answer to how to satisfy their values is almost certainly just wireheading, since they don’t have minds complex enough to have preferences about the world distinct from their experiences.
So I’m not sure whether this type of approach works, because you probably need more intelligent, social animals before satisfying their preferences amounts to anything other than wireheading.
Still, I suppose this does raise the question of how one might best satisfy the preferences/values of animals like corvids or primates, which lack some of the more complex human values but still share the most basic ones, like being socially validated and caring about the mental states of other animals (which rules out experience-machine-like solutions).
I’m assuming you think wireheading is a disastrous outcome for a superintelligent AI to impose on humans. I’m also assuming you think if bacteria somehow became as intelligent as humans, they would also agree that wireheading would be a disastrous outcome for them, despite the fact that wireheading is probably the best solution that can be done given how unsophisticated their brains are. I.e. the best solution for their simple brains would be considered disastrous by our more complex brains.
This suggests the possibility that the best solution that can be applied to human brains would be considered disastrous by a more complex brain, and by humans too, if we imagine that humans somehow became as intelligent as it.
While I consider wireheading only marginally better than oblivion, the more general issue is the extent to which you can really call something alignment if it leads to behavior that the overwhelming majority of people consider egregious and terrible in every way. It doesn’t really make sense to talk about there being a “best” solution here anyway, because that basically begs the question with respect to certain positions in moral philosophy.
>I’m also assuming you think if bacteria somehow became as intelligent as humans, they would also agree that wireheading would be a disastrous outcome for them, despite the fact that wireheading is probably the best solution that can be done given how unsophisticated their brains are. I.e. the best solution for their simple brains would be considered disastrous by our more complex brains.
This assumption doesn’t hold, and it misses my point entirely. As I talked about in my comment, bacteria don’t seem to meaningfully have thoughts or preferences, so the idea of making a super-smart bacterium is rather like making a superintelligent rock. I can remove those surface-level issues by replacing “bacteria” with, say, “mice”, in which case there’s a different misunderstanding involved here.
The main issue here is that you seem to be massively anthropomorphizing animals. If a species of animal doesn’t have a certain degree of intelligence, it’s unlikely to have a value system that actually cares about the external world. However, it would be a form of anthropocentrism to expect that an “uplifted” version of an animal would necessarily start gaining certain terminal human values just because it’s smarter.
So my point, more generally, is that (in natural life at least) you seem to need a degree of intelligence and sociality both to be capable of, and to have evolved, a mind design that cares about the external world. Most animals can therefore have their values easily and completely encompassed by wireheading, so there’s no reason not to do that to them, and that doesn’t really generalize to aligning AI for smarter, more social species.