I think the term is very reasonable and basically accurate, even more so with regard to most RL methods. It’s a good way of describing a training process without implying that the evolving system will head toward optimality deliberately. I don’t know a better way to communicate this succinctly, especially while not being specific about what local search algorithm is being used.
Also, evolutionary algorithms can be used to approximate gradient descent (with noisier gradient estimates), so it’s not unreasonable to use similar language about both.
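A minimal sketch of that point, assuming a toy quadratic loss (the function name `es_grad` and all parameters here are illustrative, not from any particular library): an evolution-strategies-style estimator perturbs the parameters with Gaussian noise and weights each perturbation by the loss difference it produces, which converges to the analytic gradient as the sample count grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loss with a known analytic gradient: f(x) = ||x||^2, grad f(x) = 2x.
def loss(x):
    return float(np.dot(x, x))

def true_grad(x):
    return 2.0 * x

def es_grad(x, sigma=0.1, n_samples=2000):
    """Evolution-strategies gradient estimate: sample Gaussian perturbations
    eps and average eps * (f(x + sigma*eps) - f(x - sigma*eps)) / (2*sigma).
    Antithetic (+/-) sampling reduces the variance of the estimate."""
    eps = rng.standard_normal((n_samples, x.shape[0]))
    diffs = np.array([loss(x + sigma * e) - loss(x - sigma * e) for e in eps])
    return (eps * diffs[:, None]).sum(axis=0) / (2.0 * sigma * n_samples)

x = np.array([1.0, -2.0, 0.5])
g_true = true_grad(x)   # exact gradient
g_es = es_grad(x)       # noisy estimate of the same vector
print(g_true)
print(g_es)
```

The estimate tracks the analytic gradient up to sampling noise, which is the sense in which "selecting for low loss" describes both kinds of local search.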
I’m not a huge fan of the way you imply that it was chosen for rhetorical purposes.
Without commenting on the rest for now—

> I’m not a huge fan of the way you imply that it was chosen for rhetorical purposes.

To be clear, I’m not alleging ill intent or anything. I’m more pointing out memetic dynamics. The situation can look as innocent as “You genuinely believe X, and think it’s important for people to get X, and so you iterate over explanations until you find an effective one.” And maybe that explanation just happens to involve analogizing that ML “selects for low loss.”