Just a tweet I saw:
Yann LeCun
Doomers: OMG, if a machine is designed to maximize utility, it will inevitably diverge
Engineers: calm down, dude. We only design machines that minimize costs. Cost functions have a lower bound at zero. Minimizing costs can't cause divergence unless you're really stupid.
Some commentary:
I think Yann LeCun is being misleading here. People intuitively treat maximization and minimization as different, but the real distinction is between convex optimization (where, e.g., every local optimum is a global optimum) and non-convex optimization. The problems people hope an AGI would solve are typically non-convex.
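To make the convex/non-convex distinction concrete, here is a minimal sketch (the functions and numbers are my own illustrative choices, not anything from the tweet or the post): gradient descent on a convex cost reaches the same global minimum from any starting point, while on a non-convex cost where you end up depends on where you start.

```python
# Illustrative sketch only: convex vs. non-convex cost minimization.
import numpy as np

def gradient_descent(grad, x0, lr=0.01, steps=5000):
    """Plain gradient descent on a 1-D cost function, given its gradient."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Convex cost: f(x) = (x - 2)^2. Every starting point converges to the unique
# global minimum at x = 2.
convex_grad = lambda x: 2.0 * (x - 2.0)
print([round(gradient_descent(convex_grad, x0), 3) for x0 in (-10.0, 0.0, 10.0)])
# -> [2.0, 2.0, 2.0]

# Non-convex cost: f(x) = sin(3x) + 0.1 x^2. Different starting points land in
# different local minima, none of which need be the global one.
nonconvex_grad = lambda x: 3.0 * np.cos(3.0 * x) + 0.2 * x
print([round(gradient_descent(nonconvex_grad, x0), 3) for x0 in (-10.0, 0.0, 10.0)])
```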
Translating back to practical matters, you are presumably going to end up with some cost functions that never reach the lower bound of zero, simply because some desirable outcomes require tradeoffs, run into resource limitations, or similar. If you backchain these costs through the causal structure of the real world, that gives you instrumental convergence for the standard reasons, just as you get when backchaining utilities.
Very many things wrong with all of that:
RL algorithms don't minimize costs; they maximize expected reward, which can perfectly well be unbounded, so it's wrong to say that the ML field only minimizes costs.
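For reference, the standard RL objective (textbook notation, not a quote from the post) is the expected discounted return, and nothing in it imposes an upper bound unless the per-step rewards themselves are bounded:

```latex
% Standard RL objective: maximize the expected discounted return.
% Nothing here bounds J(\pi) above unless the rewards r_t themselves are bounded.
J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_{t}\right],
\qquad 0 \le \gamma < 1 .
```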
LLMs minimize the expected negative log probability of the correct token, which is indeed bounded below by zero, but achieving zero in that case means perfectly predicting every single token on the internet.
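As a toy illustration (the distribution and numbers are made up for the example): the per-token loss is the negative log of the probability the model assigned to the actual next token, which is strictly positive whenever that probability is below 1.

```python
# Toy next-token cross-entropy: loss = -log p(correct token). Numbers are made up.
import math

predicted = {"cat": 0.70, "dog": 0.25, "car": 0.05}  # model's distribution over the next token
correct_token = "cat"

loss = -math.log(predicted[correct_token])
print(loss)  # ~0.357: strictly positive whenever p(correct token) < 1

# The loss only reaches its lower bound of 0 when p(correct token) == 1,
# i.e. the model predicts the actual next token with certainty, for every token.
```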
The boundedness of the thing you're optimizing is totally irrelevant, since maximizing f(x) is exactly the same as maximizing g(f(x)) where g is a strictly increasing function. You can trivially turn an unbounded function into a bounded one (or vice versa) without changing the solution set at all.
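A quick sketch of that point (illustrative code, not anything LeCun or the post wrote): squashing unbounded scores through a strictly increasing function such as tanh bounds the values in (-1, 1) but leaves the ranking of options, and hence the optimizer's choice, untouched.

```python
# Bounding an objective with a strictly increasing squashing function (here tanh)
# changes its values but not which option is optimal. Illustrative sketch only.
import math

candidate_actions = {"a": -3.2, "b": 0.4, "c": 7.9, "d": 2.1}  # unbounded "utility" scores

best_raw = max(candidate_actions, key=lambda a: candidate_actions[a])
best_squashed = max(candidate_actions, key=lambda a: math.tanh(candidate_actions[a]))

print(best_raw, best_squashed)  # -> c c : same argmax, even though tanh is bounded in (-1, 1)
```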
Even if utility is bounded between 0 and 1, an agent maximizing expected utility will still never stop, because you can always further decrease the probability that you got it wrong. Quadruple-check every single step and turn the universe into computronium to make sure you didn't make any errors.
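Spelled out with standard expected-utility arithmetic (my formulation, not a quote from the post):

```latex
% With U(success) = 1 and U(failure) = 0, expected utility is just the success
% probability p, so any further verification that raises p has positive expected
% value, no matter how close p already is to 1.
\mathbb{E}[U] \;=\; p \cdot 1 + (1 - p) \cdot 0 \;=\; p,
\qquad
\Delta\,\mathbb{E}[U] \;=\; \Delta p \;>\; 0 \quad \text{whenever a check raises } p .
```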
This is very dumb; LeCun should know better, and I'm sure he *would* know better if he spent 5 minutes thinking about any of this.
Yann LeCun's proposals are based on cost-minimization.
Do you expect LeCun to have been assuming that the entire field of RL stops existing in order to focus on his specific vision?
I'm not sure he has coherent expectations, but I'd expect his vibe is some combination of "RL doesn't currently work" and "fields generally implement safety standards".
Another objection is that you can minimize the wrong cost function. Making "cost" go to zero could mean making "the thing we actually care about" go to (negative huge number).
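A toy illustration of that objection (the functions and numbers are hypothetical, not from the thread): an optimizer that drives a proxy cost to its floor of zero can simultaneously push the quantity we actually care about far into the negatives.

```python
# Toy illustration of minimizing the wrong cost function; everything here is hypothetical.
# x = how aggressively the system games the measurement (0 = not at all).

def proxy_cost(x):
    """The cost we actually minimize (e.g. reported errors); floored at zero."""
    return max(0.0, 10.0 - x)

def true_value(x):
    """The thing we actually care about, invisible to the optimizer."""
    return -x ** 2

for x in (0, 5, 10, 20):
    print(f"x={x:2d}  proxy_cost={proxy_cost(x):5.1f}  true_value={true_value(x):7.1f}")
# The proxy cost reaches its floor of zero by x = 10, while the true value has
# already been driven to -100, and nothing in the proxy discourages going lower.
```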
I donāt think this objection lands unless one first sees why the safety guarantees we usually associate with cost minimization donāt apply to AGI. Like what sort of mindset would hear Yann LeCunās objection, go āah, so weāre safeā, and then hear your objection, and go āoh I see, so Yann LeCun was wrongā?
One way to minimize costs is to kill all humans; then money loses all meaning, and the cost of anything is zero.
Dear Yann LeCun, dear all,
Time to reveal myself: I'm actually just a machine designed to minimize cost. It's a sort of weighted cost of deviation from a few competing aims I harbor.
And, dear Yann LeCun, while I wish it were true, it's absolutely laughable to claim I'd be unable to implement things none of you like, if you gave me enough power (i.e. intelligence).
I mean to propose this as a trivial proof by contradiction against his proposition. Or am I overlooking something?? I guess 1. I can definitely be implemented by what we might call cost minimization[1], and sadly, however benign my aims today may be in theory, 2. I really don't think anyone can fully trust me or the average human if any of us got infinitely powerful.[2] So, it suffices to think about us humans to see the supposed "Engineers'" (euhh) logic falter, no?
Whether or not there's a strange loop making me sentient (or, if you prefer, making it appear to myself that I am) doesn't even matter for the question.
Say, I'd hope I'd do great stuff, be a huge savior, but who really knows, and, either way, it's still rather plausible that I'd do things a large share of people might find rather dystopian.