I agree this is a big blind spot. My take on the intellectual history here is that (crudely speaking) MIRI et al. have mostly pursued a ‘top-down’ approach to agency: start with agents such as AIXI that represent the limit of unbounded rationality and compute, then attempt to ‘downsize’ them so that they can actually fit in our universe (e.g. logical inductors merely need ridiculously large amounts of compute rather than hypercomputers). This seems like a reasonable strategy a priori: there’s already a well-developed theory of idealized rationality that you can start from and try to ‘perturb’ down to fit in the actual universe, and it’s plausible that a superintelligence will bear a closer resemblance to such idealized agents than to amoebae.

The ‘amoeba-first’ strategy is difficult in that a naïve approach will just teach you a bunch of irrelevant details about amoebae that don’t generalize usefully to higher intelligences; a large part of the problem is figuring out what about amoebae (or whatever other system) you actually want to study, which is somewhat nebulous compared to the idealized-agents-first approach.

Nevertheless, the idealized-agents program does seem to have stalled out to a certain degree in recent years, and MIRI (e.g. the finite factored sets work) and other alignment researchers (e.g. johnswentworth’s natural abstraction work) have shifted more towards the amoeba side of things. I think the ‘amoeba approach’ has some big advantages: you can more readily test your ideas, or get new ones, by examining natural systems, and physics seems to be the only part of the universe that really cleanly obeys mathematical laws, so a concept of agency built up from physics seems more likely to generalize to arbitrarily powerful intelligences.