Hi, for some reason I didn’t see this reply until recently.
metaethical.ai is the most sophisticated sketch I’ve seen of how to make human-friendly AI. In my personal historiography of “friendliness theory”, the three milestones so far are Yudkowsky 2004 (Coherent Extrapolated Volition), Christiano 2016 (alignment via capability amplification), and June Ku 2019 (“AIXI for Friendliness”).
To me, it’s conceivable that the metaethical.ai schema is sufficient to solve the problem. It is an idealization (“we suppose that unlimited computation and a complete low-level causal model of the world and the adult human brains in it are available”), but surely a bounded version that uses heuristic models could be realized.
Thanks! FWIW your high opinion of the project counts for a lot with me; I will allocate more attention to it and seriously consider donating.