Stuart_Armstrong comments on ALBA: can you be “aligned” at increased “capacity”?

Stuart_Armstrong 17 Apr 2017 14:49 UTC
AF

“It seems like that can be cleanly factored out of the kind of reward engineering I’m discussing in the ALBA post. Does that seem right?”

That doesn’t seem right to me. If there isn’t a problem with the reward function, then ALBA seems unnecessarily complicated. Conversely, if there is a problem, we might be able to use something like ALBA to try to fix it (this is why I was more positive about it in practice).