I think your error is treating the “RL algorithm” as the policy network encoded in a specific creature. A human wants to have children; a bacterium wants to find food and replicate itself the moment certain thresholds are reached. There is a physical mechanism that causes these policies to be enacted.
That is not the RL algorithm. The RL algorithm of evolution doesn’t “exist” anywhere physical; it just happens to prefer outcomes where creatures cause other creatures to exist, and those creatures do not have to share remotely the same code. For a concrete example, evolution ranks: build an AI successor >>>>>>>> father 1000 children >> father 1 child, and it prefers them in that order.
Or another example: you think individual creatures would prefer their own genes to be propagated*. That preference is a policy. If, hypothetically, you could go to a biotech clinic and have your genetic code upgraded (junk cleaned out, and all of your genes replaced with superior AI-designed versions or the best versions found in the human gene pool), your policy network as a human being may not prefer that outcome, but evolution DOES.
Isn’t your list an “any_of”?
and culture and everything else that @Zvi values.