Wouldn’t any moderately complex optimiser be likely to have some model of a probability spread, or distribution, rather than compressing all its knowledge into a single number? Intuitively it seems useful to be able to model the difference between, say, a probability density that’s roughly Gaussian-shaped with a mean of 0.5 and a standard deviation of 0.02 on the one hand, and a delta function at 0.5 on the other. In both cases your single-value estimate of the probability is 0.5, but your certainty in that estimate is very different.
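To make that concrete, here’s a rough Python sketch of what I mean, with a very narrow Gaussian standing in for the delta function (scipy has no true point mass); the names and numbers are just my illustration, not anything from a real system:

```python
from scipy.stats import norm

# Belief A: roughly Gaussian, mean 0.5, standard deviation 0.02.
belief_a = norm(loc=0.5, scale=0.02)

# Belief B: effectively a delta function at 0.5, modelled here as a
# Gaussian with negligible spread.
belief_b = norm(loc=0.5, scale=1e-9)

for name, belief in [("A (spread out)", belief_a), ("B (delta-like)", belief_b)]:
    lo, hi = belief.interval(0.95)  # central 95% credible interval
    print(f"Belief {name}: point estimate {belief.mean():.2f}, "
          f"95% interval [{lo:.4f}, {hi:.4f}]")
```

Both print a point estimate of 0.5, but only belief A has any appreciable width to its interval.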
For such a machine one wouldn’t specify “achieve ‘G’ with probability ‘p’”, but rather “achieve ‘G’ such that the probability mass below ‘p’ is < f”, where f is some specified (small) fraction of the total. That seems rather more straightforward in many ways.
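The check itself would then just be a CDF evaluation on the machine’s belief about its own success probability, something like this (again only an illustrative sketch, not a real specification):

```python
from scipy.stats import norm

def criterion_met(belief, p: float, f: float) -> bool:
    """True if the belief places less than fraction f of its mass below p."""
    return belief.cdf(p) < f

belief = norm(loc=0.5, scale=0.02)  # the Gaussian-shaped belief from above

print(criterion_met(belief, p=0.45, f=0.01))  # True: almost no mass lies below 0.45
print(criterion_met(belief, p=0.50, f=0.01))  # False: half the mass lies below 0.50
```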