If it helps, I have some discussion on this topic here (Section 8.3 and especially 8.3.3.1).
This is a nice post and I was mostly nodding along.
I expect the whole scenario is moot, though, because of the training competitiveness issue.
I also happen to believe that this evolutionary scenario would only count as “success” if we set a very, very low bar for what constitutes successful alignment (e.g. “not worse than a hot-tempered psycho human who grew up on an alien planet”). And if we set the bar for “success” that low, then I’m actually pretty optimistic about our prospects for non-evolutionary alignment “success” too.
I also think I’m less optimistic than you about the simulated evolved aliens creating unaligned AGIs (and/or blowing each other to smithereens in other ways). Your Section 5 arguments are not convincing to me because (1) this could happen after they break out of the simulation into the real world, and (2) competition could favor AGIs that lack social instincts and the other things that make for a good life worth living; if so, it doesn’t matter whether they build such AGIs from scratch or self-modify into them. Or something like that, I guess.