The reason why we’re talking about humans and IGF is because there’s an analogy to AGI. If we select on the AI to be corrigible (or whatever nice property) in subhuman domains, will it generalize out-of-distribution to be corrigible when superhuman and performing coherent optimization?
Humans are not generalizing out of distribution. The average woman who wants to raise high-quality children does not have the goal of maximizing IGF; nor does she try to instill the value of maximizing IGF into them, nor use the far more effective strategies of donating eggs, trying to get around egg donation limits, or getting her male relatives to donate sperm.
If the environment stabilizes, additional selection pressure might cause these people to become a majority. But we might not have additional selection pressure in the AGI case.
getting around egg donation limits is a defect strategy; my argument is that this amounts to asking why we're not generalizing into defecting in the societal IGF game. we don't want to maximize the first derivative of IGF if we want to plan millennia ahead for deep-time reproduction rates; instead, we need to maximize group survival. that's generally true across religions, not just in the high-defect "have so many kids you barely qualify as K-selected" religious bubbles of heavy reproduction.
to generalize this to agi, we need every agent to have a map of other agents’ empowerment, and seek to ensure all agents remain empowered, at the expense of some empowerment limits for agents that want to take unfair proportions of the universe’s empowerment.
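to make the mechanism concrete, here's a minimal toy sketch of that idea: agents hold scalar "empowerment" values, and any agent holding more than some multiple of the fair share gets capped, with the excess redistributed to the others. the function name, the `cap_ratio` parameter, and the redistribution rule are all hypothetical illustrations I'm inventing for this sketch, not an established algorithm.

```python
def rebalance_empowerment(empowerment, cap_ratio=2.0):
    """Cap any agent holding more than cap_ratio times the fair share,
    and spread the confiscated excess evenly among the other agents.

    Total empowerment is conserved; only its distribution changes.
    """
    n = len(empowerment)
    total = sum(empowerment)
    fair_share = total / n
    cap = cap_ratio * fair_share

    # Excess taken from agents claiming an unfair proportion.
    excess = sum(max(0.0, e - cap) for e in empowerment)
    capped = [min(e, cap) for e in empowerment]

    # Redistribute the excess to everyone still under the cap.
    under = [i for i, e in enumerate(capped) if e < cap]
    if under and excess > 0:
        bonus = excess / len(under)
        for i in under:
            capped[i] += bonus
    return capped

# One agent grabs 10 of 12 units; fair share is 4, cap is 8,
# so 2 units flow back to the two disempowered agents.
print(rebalance_empowerment([10.0, 1.0, 1.0]))
```

this is obviously only the single-step, scalar version; the text above is asking for each agent to maintain a *map* of the others' empowerment and to enforce something like this cap as an ongoing policy.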
I really think inclusive [memetic+genetic] fitness against a universal information empowerment objective (one that is intractable to evaluate) has something mathematical to say here, and I'm frustrated that I don't seem to know how to put the math. it seems obvious and simple, such that anyone who had studied the field would know what it is I'm looking for with my informal speech; perhaps it's just that I should go study the fields I'm excited about more.
but it really seems like unfriendly foom is "ai decides we suck, that we can be beaten in the societal inclusive phenotype fitness game, and ~breeds a lot, maybe after killing us first, without any care for the loss of our [genetic+memetic] patterns' fitness".
and given that we think our genetic and memetic patterns are what are keeping us alive to pass on to the next generation, I ask again: why do we not look like semi-time-invariant IGMF maximizers? we are the evolutionary process happening, and we always have been. shouldn't we be so sure we're IGMF maximizers that we instead ask why IGMF maximizers look like us right now? this is the objective for evolution; shouldn't we be doing interpretability on its output rather than fretting that we don't obey it?