I would be interested to see a sketch of how you mathematize “agent” such that gradient descent could be said to not have a utility function. As best I can tell, “having a utility function” is an uninteresting property that everything has—a sort of panagentism implied by trivial utility functions. Nontriviality of utility functions might be able to capture what you’re talking about, though, and I can imagine some nontriviality definitions that do exclude gradient descent over boxed parameters, e.g., requiring that there be no time $t_d$ at which the utility function becomes indifferent. Any utility function that only cares about the weights becomes indifferent in finite time, I think (once training converges, the weights stop changing, so any futures that diverge afterwards look identical to it), so this should exclude the “just sit here being a table” utility function. Although perhaps this is insufficiently defined, because I haven’t specified what physical mechanism to extract as the preference ordering; in some cases there could totally be agents. I’d be curious how you try to define this sort of thing, anyway.
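To make the “indifferent in finite time” claim concrete, here’s a minimal toy sketch; the loss, learning rate, and weights-only utility are all my own illustrative choices, not anything from your setup:

```python
# Toy gradient descent over "boxed" parameters: a 1-D convex loss,
# so the weights converge to a fixed point.
def loss_grad(w):
    return 2.0 * (w - 3.0)  # gradient of (w - 3)^2

# A hypothetical utility function that only cares about the weights.
def U(w):
    return -((w - 3.0) ** 2)

w, lr = 0.0, 0.1
trajectory = [w]
for _ in range(200):
    w -= lr * loss_grad(w)
    trajectory.append(w)

# After convergence the weights are (numerically) pinned at the optimum,
# so every continuation of the worldline gets the same utility: U is now
# indifferent between any futures that differ only outside the weights.
late = trajectory[-10:]
print(all(abs(U(a) - U(b)) < 1e-12 for a, b in zip(late, late[1:])))  # True
```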
(That is to say: a utility function $U$ such that there are no worldlines $W_a(t)$ and $W_b(t)$ that diverge at $t_d$ with $U(W_a) = U(W_b)$; call that constraint 1, “never being indifferent between timelines”. Though that version of the constraint might demand that the utility function never be indifferent to anything at all, so perhaps a weaker constraint is that there be no time at which the utility function is indifferent between all possible worldlines; i.e., if constraint 1 is that no worldlines diverge and yet get the same order position, constraint 2 is that at every time $t_d$ there is at least one pair $W_a(t)$ and $W_b(t)$ that diverge at $t_d$ with $U(W_a) \neq U(W_b)$. “Diverge” here means $W_a(t_e) = W_b(t_e)$ for all early times $t_e < t_d$ and $W_a(t_l) \neq W_b(t_l)$ for some $t_l > t_d$.)
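A rough way to pin down those quantifiers over discretized worldlines; the tuple representation, helper names, and toy worldline set are all my own invention for illustration:

```python
from itertools import combinations

# A worldline is a tuple of states, one per timestep.
def diverge_at(wa, wb, td):
    # W_a(t_e) == W_b(t_e) for all t_e < t_d, and
    # W_a(t_l) != W_b(t_l) for some t_l > t_d.
    return wa[:td] == wb[:td] and any(
        a != b for a, b in zip(wa[td + 1:], wb[td + 1:])
    )

def violates_constraint_1(U, worldlines):
    # Constraint 1: no diverging pair is ever assigned equal utility.
    return any(
        diverge_at(wa, wb, td) and U(wa) == U(wb)
        for wa, wb in combinations(worldlines, 2)
        for td in range(1, min(len(wa), len(wb)))
    )

def satisfies_constraint_2(U, worldlines, horizon):
    # Constraint 2: at every t_d there is at least one diverging pair
    # that U ranks differently, i.e. U is never *totally* indifferent.
    return all(
        any(
            diverge_at(wa, wb, td) and U(wa) != U(wb)
            for wa, wb in combinations(worldlines, 2)
        )
        for td in range(1, horizon)
    )

worldlines = [(0, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
              (0, 0, 0, 1), (0, 0, 1, 1)]
U_table = lambda w: 0         # "just sit here being a table": constant
U_count = lambda w: sum(w)    # actually sensitive to the trajectory
print(satisfies_constraint_2(U_table, worldlines, 3))   # False
print(satisfies_constraint_2(U_count, worldlines, 3))   # True
print(violates_constraint_1(U_count, worldlines))       # True
```

Note that on this toy set $U_{count}$ satisfies constraint 2 while still violating constraint 1, which matches the worry above: constraint 1 forbids indifference to anything, while constraint 2 only forbids total indifference at any given time.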