I agree that you almost certainly can’t get an optimal predictor. For similar reasons, you can’t train a supervised learner using any obvious approach. This is the reason that I am pessimistic about this kind of “abstract” goal.
That said, I’m not as pessimistic as you are.
Suppose that I define a very elaborate reflective process, which would be prohibitively complex to simulate and whose behavior is probably not constrained in any meaningful way by any short proofs.
I think that a human can in fact try to maximize the output of such a reflective process, “to the best of their abilities.” And this seems good enough for value alignment.
It’s not important that we actually achieve optimality, except on shorter-term, instrumentally important problems such as gathering resources (for which we can in fact expect the abstractly motivated algorithm to converge to optimality).