that is able to give “better” feedback than the human could – feedback that considers more consequences more accurately in more complex decision situations, that has spent more effort introspecting
This is roughly how I would run ALBA in practice, and why I said it was better in practice than in theory. I'd work with the considerations I mentioned in this post and try to formalise how to extend utilities/rewards to new settings.
If I read Paul’s post correctly, ALBA is supposed to do this in theory—I don’t understand the theory/practice distinction you’re making.
I disagree. I’m arguing that the concept of “aligned at a certain capacity” makes little sense, and this is key to ALBA in theory.