I find the idea of the AI Objectives Institute really interesting. I’ve read their website and watched their kick-off call, and I would be interested in how promising people in the AI safety space think the general approach is, how much we might be able to learn from it, and how much solutions to the AI alignment problem will resemble a competently regulated competitive market between increasingly competent companies.
I’d really appreciate pointers to previous discussions and papers on this topic, too.
I am generally skeptical of this as an approach to AI alignment: it feels like you are shooting yourself in the foot by restricting yourself to only those interventions that could be implemented within capitalism. Interventions in capitalism must treat humans as black boxes; AI alignment has no analogous restriction. Some examples of things you can do in AI alignment but not in capitalism:
Inspect weights at a deep enough level that you understand exactly what algorithm they implement (in capitalism, you never know exactly what the individual humans will do and you must design around them)
Make copies of AIs, e.g. to check whether they use concepts consistently; you can never put humans in exactly the same situation (if nothing else they’ll remember previous situations)
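To make the copying point concrete, here is a toy sketch (not a real alignment tool; the `TinyModel` class and its weights are invented for illustration): a small deterministic "model" whose exact copy can be probed independently, with fully inspectable weights. Unlike a human, the copy can be placed in exactly the same situation and has no memory of prior queries, so consistency probes are repeatable.

```python
import copy
import numpy as np

rng = np.random.default_rng(0)

class TinyModel:
    """A fixed two-layer network; all behavior is determined by its weights."""
    def __init__(self):
        self.w1 = rng.normal(size=(4, 8))
        self.w2 = rng.normal(size=(8, 2))

    def forward(self, x):
        return np.tanh(x @ self.w1) @ self.w2

original = TinyModel()
clone = copy.deepcopy(original)  # an exact copy, weights and all

x = np.array([1.0, -0.5, 0.25, 2.0])

# The copy answers identically, so a "concept consistency" probe on the
# clone tells you exactly how the original would behave in that situation.
assert np.array_equal(original.forward(x), clone.forward(x))

# And the weights themselves are open to direct inspection,
# unlike the inside of a human mind.
print(clone.w1.shape, clone.w2.shape)
```

Neither of these operations has a market analogue: you cannot deep-copy an employee, and you cannot read off the "weights" that produce their decisions.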
I don’t know of much previous work that is public, but there is Incomplete Contracting and AI Alignment (I don’t think it’s that related though).
The other direction (using AI alignment to improve capitalism) seems more plausible but I have much less knowledge about that.