It’d be nice if “optimization objective” split out into:
1) Internally represented goal which an intelligent entity optimizes towards (e.g. a person’s desire to get rich), and
2) Signal which is used to compute local parameter updates (e.g. the loss gradient used by SGD).
There are more possible senses of the phrase, but I think these two are commonly conflated. E.g., “The optimization objective of RL is the reward function” should, by default, mean sense 2) and not sense 1). But because we use similar words for both, claims become muddled, and it’s not even clear if, when, or who is making this mistake (see the sketch below).
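To make the distinction concrete, here is a minimal REINFORCE-style sketch (assuming PyTorch; the toy two-action policy and the `reinforce_step` helper are hypothetical, not anyone's actual setup). In sense 2), the reward is just a scalar that scales the parameter update; nothing in the code requires the policy network to internally represent “maximize reward” as a goal in sense 1).

```python
import torch
import torch.nn as nn

# Hypothetical toy policy: 4-dim observation -> logits over 2 actions.
policy = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(policy.parameters(), lr=1e-2)

def reinforce_step(observation, action, reward):
    """One policy-gradient update; `reward` enters only as a multiplier."""
    logits = policy(observation)
    log_prob = torch.log_softmax(logits, dim=-1)[action]
    loss = -reward * log_prob  # sense 2): reward shapes the update signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Dummy usage: the "optimization objective" here is the quantity whose
# gradient moves the parameters, not a goal the network "wants".
reinforce_step(torch.randn(4), action=1, reward=0.7)
```

Whether the trained policy ends up with anything like an internally represented goal is a separate, empirical question.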
(Relatedly, “Think carefully before calling RL policies ‘agents’” tries to clarify language here.)