Probably sometime last year, I posted something on Twitter along the lines of “agent values are defined on agent world models”, with a link to a LessWrong post (I think the author was John Wentworth).
I’m now looking for that LessWrong post.
My Twitter account is private, and search is broken for private accounts, so I haven’t been able to track down the tweet. If anyone has a guess as to which post I was referring to, please send it my way.
The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables
That was it, thanks!