I understand where you’re coming from, and I think that you correctly highlight a potential source of concern, and one which my comment didn’t adequately account for. However:
I’m skeptical that it’s possible to create an AI based on mathematical logic at all. Even if an AI with many interacting submodules is dangerous, it doesn’t follow that working on AI safety for an AI based on mathematical logic is promising.
Eliezer’s position is that the default mode for an AGI is failure; i.e., if an AGI is not provably safe, it will almost certainly go badly wrong. In that context, if you accept that “an AI with many interacting submodules is dangerous,” then that’s more or less equivalent to believing that one of the horribly wrong outcomes will almost certainly occur if an AGI with many submodules is created.
Humans can impose selective pressures on emergent AIs so as to mimic the process of natural selection that humans experienced.
Humans are not Friendly. They don’t even have the capability under discussion here, the ability to preserve their values under self-modification; a human-esque singleton would likely be a horrible, horrible disaster.