Doesn’t sound like it’d meaningfully change the fundamental dynamics. It’s an intervention on the order of things like Momentum or Adam, and those are still “basically just SGD”. Pretty sure the same will hold here: it may introduce some interesting effects, but won’t actually robustly address the greed.
… My current thinking is that the answer to “how can we design a procedure that takes the shortest and not the steepest path to AGI?” is just “design it manually”. I.e., the corresponding “training algorithm” we’ll want to replace SGD with is just “our own general intelligence”.
I’m sorry, but I fail to see the analogy to Momentum or Adam: as far as I can see, neither the vector nor the distance from the current point to the initial point plays any role in either of them. It is also different from regularization schemes that modify the objective function, say by penalizing movement away from the initial point, which would change the location of all minima. The method I propose preserves all minima and just tries to move towards the one closest to the initial point. I have discussed it with some mathematical optimization experts, and they think it’s new.
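To illustrate only the regularization point (this is not the proposed method, whose update rule isn't spelled out in this comment), here is a minimal sketch. Everything in it is an assumption made for illustration: the toy loss f(x) = (x^2 - 1)^2, the initial point `x0`, the penalty weight `lam`, and the helper `grad_descent`. It compares plain gradient descent with gradient descent on the same loss plus a penalty lam * (x - x0)^2.

```python
# Toy 1-D loss f(x) = (x^2 - 1)^2 with minima exactly at x = -1 and x = +1.
# All names and constants here are illustrative assumptions.

def loss_grad(x):
    # Derivative of (x^2 - 1)^2.
    return 4.0 * x * (x * x - 1.0)

x0 = 0.5   # hypothetical initial point
lam = 0.5  # hypothetical penalty weight

def penalized_grad(x):
    # Derivative of f(x) + lam * (x - x0)^2, i.e. the loss plus a
    # penalty for moving away from the initial point.
    return loss_grad(x) + 2.0 * lam * (x - x0)

def grad_descent(grad, start, lr=0.01, steps=5000):
    # Plain fixed-step gradient descent.
    x = start
    for _ in range(steps):
        x -= lr * grad(x)
    return x

x_plain = grad_descent(loss_grad, x0)      # settles at a true minimum of f
x_pen = grad_descent(penalized_grad, x0)   # settles at a shifted point

print("unpenalized:", x_plain, "gradient of f there:", loss_grad(x_plain))
print("penalized:  ", x_pen, "gradient of f there:", loss_grad(x_pen))
```

In this toy case the penalized run stops short of x = 1, at a point where the gradient of the original loss is not zero, so the penalty has moved the minimum rather than merely selecting among the existing ones. A method that preserves all minima would have to end at x = 1 or x = -1 exactly.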