In very high-dimensional spaces, getting stuck in local minima is harder. Do the same results happen in, say, 10,000 dimensions? If so, what’s the relationship between the number of dimensions and the time (or progress in x_0) before getting stuck in a daemon? If not, is there another function that exhibits easily found daemons in 10,000 dimensions?
Bit late, but running the same experiment with 1000 dimensions instead of 16, and 10k steps instead of 5k, gives a trajectory that appears to be on its way to a minimum, though I'm unsure whether I should tweak the hparams when scaling up this much. Trying other optimizers would be interesting too, but I think I've been nerdsniped by this too much already… Code is here.
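For anyone who wants to reproduce the scaled-up run without digging through the linked code, here's a minimal sketch. The loss below is a hypothetical stand-in (a linear descent direction in x_0 plus ripple terms that can couple to it), not the exact function from the original post, and the learning rate is assumed:

```python
# Minimal sketch of the scaled-up experiment; not the linked code.
# The loss is a hypothetical stand-in: x[0] is the intended descent
# direction, and the ripple term lets the other coordinates couple to it.
import numpy as np

DIM = 1000      # 1000 dimensions instead of 16
STEPS = 10_000  # 10k steps instead of 5k
LR = 0.01       # assumed learning rate; the original hparams may differ

def loss(x):
    # Stand-in objective: descend along x[0], with ripples in the
    # remaining coordinates that feed back into the descent direction.
    return x[0] + (x[0] ** 2) * np.mean(np.sin(x[1:]))

def grad(x):
    # Analytic gradient of the stand-in loss above.
    g = np.empty_like(x)
    s = np.mean(np.sin(x[1:]))
    g[0] = 1.0 + 2.0 * x[0] * s
    g[1:] = (x[0] ** 2) * np.cos(x[1:]) / (x.size - 1)
    return g

rng = np.random.default_rng(0)
x = 0.1 * rng.normal(size=DIM)

x0_history = []
for _ in range(STEPS):
    x -= LR * grad(x)
    x0_history.append(x[0])  # track progress along the descent direction

print(f"final x_0: {x0_history[-1]:.3f}")
```

With the analytic gradient each step is O(DIM), so 10k steps at 1000 dimensions runs in seconds; swapping `grad` for an autodiff framework would also make trying other optimizers (momentum, Adam) straightforward.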