As I understand it—MCTS is used to maximize a given computable utility function, and so it is non alignment-preserving in the general sense that a sufficiently strong optimization of a non-perfect utility function is non alignment-preserving.
Current theme: default
Less Wrong (text)
Less Wrong (link)
As I understand it—MCTS is used to maximize a given computable utility function, and so it is non alignment-preserving in the general sense that a sufficiently strong optimization of a non-perfect utility function is non alignment-preserving.