Interesting. I can’t recall if I commented on the Alignment as Translation post about this, but I think this is in fact the key obstacle to addressing alignment, and I put together a formal model that identified exactly this as the problem: how do you ensure that two minds agree about a preference ordering, or really even about the statements being ordered?
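To make the two-part difficulty concrete, here is a minimal sketch (the function and the example statements are purely illustrative, not drawn from the formal model mentioned above). Note that agreement can fail in two ways, and the second is the subtler one:

```python
def agree(mind_a: list[str], mind_b: list[str]) -> bool:
    """True only if both minds rank the exact same statements identically.

    Illustrative only: treats a preference ordering as a ranked list of
    statement strings.
    """
    if set(mind_a) != set(mind_b):
        # The minds aren't even ranking the same statements, so
        # comparing their orderings is not yet meaningful.
        return False
    return mind_a == mind_b

human = ["no deception", "be helpful", "be concise"]
model = ["be helpful", "no deception", "be concise"]              # same statements, different order
other = ["avoid stating falsehoods", "be helpful", "be concise"]  # similar intent, different statement

print(agree(human, model))  # False: the orderings disagree
print(agree(human, other))  # False: the statement sets disagree -- the harder problem
```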