No, I was talking about the results. lsusr seems to use the term in a different sense than Scott Alexander or Yann LeCun. In their sense it’s not an alternative to backpropagation, but a way of constantly predicting future experience and constantly updating a world model depending on how far off those predictions are. This is somewhat analogous to conditionalization in Bayesian probability theory.
I haven’t watched the LeCun interview you reference (it is several hours long, so relevant time-stamps to look at would be appreciated), but this still does not make sense to me—backprop already seems like a way to constantly predict future experience and update, particularly as it is employed in LLMs. Generating predictions first and then updating based on error is how backprop works. Some form of closeness measure is required, just like you emphasize.
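To make the "predict first, then update based on error" loop concrete, here is a minimal sketch. It is an illustration under stated assumptions, not how any LLM is actually trained: the function name `train_next_step_predictor`, the toy sequence, and the learning rate are all made up for this example, and the model is a single linear unit, so the gradient is written out by hand. Backprop is the general procedure for computing that same kind of gradient through many layers. The closeness measure assumed here is squared error.

```python
def train_next_step_predictor(sequence, lr=0.01, epochs=5000):
    """Fit x[t+1] ~ w * x[t] + b by per-sample gradient descent.

    Hypothetical toy model: one weight, one bias, squared-error loss.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x_now, x_next in zip(sequence, sequence[1:]):
            pred = w * x_now + b   # 1. predict the next observation
            err = pred - x_next    # 2. measure how far off the prediction was
            # 3. step down the gradient of the loss 0.5 * err**2
            w -= lr * err * x_now
            b -= lr * err
    return w, b


# Toy "experience": each value is the previous value plus one.
sequence = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
w, b = train_next_step_predictor(sequence)
print(f"learned rule: x_next ~ {w:.3f} * x_now + {b:.3f}")
```

On this noiseless sequence the learned parameters should approach w = 1 and b = 1. The point of the sketch is only that prediction, error measurement, and update already form one loop in ordinary gradient-based training.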
Well, backpropagation alone wasn’t even enough to make efficient LLMs feasible. It took decades, until the invention of transformers, to make them work. Similarly, knowing how to make LLMs is not yet sufficient to implement predictive coding. LeCun talks about the problem in a short section here from 10:55 to 14:19.