I am generally skeptical of this as an approach to AI alignment: it feels like you are shooting yourself in the foot by restricting yourself to only those interventions that could be implemented in capitalism. Interventions in capitalism must treat humans as black boxes; AI alignment has no analogous restriction. Some examples of things you can do in AI alignment but not in capitalism:
- Inspect weights at a deep enough level that you understand exactly what algorithm they implement (in capitalism, you never know exactly what the individual humans will do and you must design around them)
- Make copies of AIs, e.g. to check whether they use concepts consistently; you can never put humans in exactly the same situation (if nothing else they’ll remember previous situations). A toy sketch of both points follows this list.
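To make the contrast concrete, here is a minimal sketch (assuming PyTorch, with a toy network standing in for an AI system) of both operations: reading off the weights directly, and running an exact copy on the same input.

```python
import copy

import torch
import torch.nn as nn

# A tiny stand-in network; in practice this would be a trained model whose
# weights you want to understand.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Inspecting the weights: every parameter is directly readable, which has no
# analogue for the humans participating in an economic system.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

# Making copies: a deep copy is an exact duplicate with no memory of any
# inputs the original has seen.
clone = copy.deepcopy(model)

# The copy behaves identically on the same input, something you could never
# arrange with two humans (or with the same human twice).
x = torch.randn(1, 4)
with torch.no_grad():
    assert torch.equal(model(x), clone(x))
```

Nothing in this sketch tells you *what* algorithm the weights implement, of course; the point is only that the weights and exact copies are available to analyze at all, whereas the humans in a market are not.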
I don’t know of much previous work that is public, but there is Incomplete Contracting and AI Alignment (though I don’t think it’s that closely related).
The other direction (using AI alignment to improve capitalism) seems more plausible but I have much less knowledge about that.