Your claims about markets seem just wrong to me. Markets generally do what their consumers want, and their failures are largely the result of transaction costs. Some of these transaction costs have to do with information asymmetry (which needs to be solved), but many others that show up in the real world (related to standard problems like negative externalities etc.) can just be removed by construction in virtual markets.
> Markets are fundamentally driven by the pursuit of defined rewards or currencies, so in such a system, how do we ensure that the currency being optimized for truly captures what we care about?
By having humans be the consumers in the market. Yes, it is possible to “trick” the consumers, but the idea is that if any oversight protocol is possible at all, then the consumers will naturally buy information from it, and the AIs will learn to expect this changing reward function.
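To make the mechanism concrete, here is a toy sketch of my own (not anything from an existing proposal; the names, payoffs, and learning rule below are all assumptions chosen for illustration). A seller can make honest or deceptive offers, consumers sometimes pay for oversight information that exposes deception, and the seller's learned policy adapts to the reward function that the consumers' buying behavior creates:

```python
import random

random.seed(0)

HONEST_VALUE = 1.0        # reward for an honest offer (consumer is satisfied)
APPARENT_VALUE = 1.2      # what an undetected deceptive offer earns
AUDIT_PROBS = (0.0, 0.8)  # how often consumers buy oversight information

class Seller:
    """A seller (stand-in for an AI) that adjusts how often it deceives."""

    def __init__(self, lr=0.05):
        self.p_deceive = 0.5
        self.lr = lr

    def offer(self):
        return "deceptive" if random.random() < self.p_deceive else "honest"

    def update(self, offer, reward):
        # Bandit-style update: deceive more when deception paid off relative
        # to the honest baseline, less when it did not.
        if offer == "deceptive":
            self.p_deceive += self.lr * (reward - HONEST_VALUE)
            self.p_deceive = min(max(self.p_deceive, 0.0), 1.0)

def run_market(audit_prob, rounds=2000):
    """Consumers buy oversight information with probability `audit_prob`."""
    seller = Seller()
    for _ in range(rounds):
        offer = seller.offer()
        audited = random.random() < audit_prob
        if offer == "honest":
            reward = HONEST_VALUE
        elif audited:
            reward = 0.0             # deception exposed; consumer walks away
        else:
            reward = APPARENT_VALUE  # deception pays while it goes undetected
        seller.update(offer, reward)
    return seller.p_deceive

if __name__ == "__main__":
    for p in AUDIT_PROBS:
        print(f"audit probability {p}: final deception rate {run_market(p):.2f}")
```

With no auditing the seller drifts toward full deception; once consumers buy oversight information often enough, the expected payoff of deception falls below the honest baseline and the deception rate collapses. That is the sense in which the reward function the AI faces is shaped by what the human consumers choose to buy.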
> MIRI has been talking about it for years; the agent foundations group has many serious open problems related to it.
Can you send me a link? The only thing I’ve found from the MIRI side on “markets in an alignment context” is the Wentworth-Soares discussion, but that seems like a very different issue.
> it can be confidently known now that the design you proposed is catastrophically misaligned
Can you send me a link to where this was confidently shown? This is a very strong claim to make; nobody even makes this claim in the context of backprop.