Solutions to optimization problems tend to set to extreme values all those variables which aren’t explicitly constrained. The question then is which ideals we’re willing to sacrifice in order to achieve our primary goals.
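To make the failure mode concrete, here is a toy sketch (invented foods and numbers, using scipy’s linprog) of a diet optimization with only one explicit constraint: every variable the objective doesn’t reward is set to zero, and the remaining one is pushed to an extreme.

```python
import numpy as np
from scipy.optimize import linprog

foods = ["vinegar", "rice", "spinach"]            # made-up menu
cost_per_unit = np.array([0.10, 0.50, 2.00])      # dollars per unit
calories_per_unit = np.array([150.0, 200.0, 25.0])

# Only explicit constraint: at least 2000 calories/day, in A_ub @ x <= b_ub form.
A_ub = -calories_per_unit.reshape(1, -1)
b_ub = np.array([-2000.0])

result = linprog(c=cost_per_unit, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
for name, amount in zip(foods, result.x):
    print(f"{name}: {amount:.1f} units/day")
# Everything the objective doesn't reward is driven to zero, and the cheapest
# source of calories is driven to an extreme (~13 units of vinegar per day).
```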
No, the question is why you’re employing algorithms that routinely tell you to drink 500 gallons of vinegar per day, sterilize the poor, or take other obviously ridiculous actions.
It is probably usually better to just use probabilistic constraint methods, in which solutions that meet your constraints better are more likely—but all other variables are allowed to vary randomly subject to the minimum necessary causal constraints—and sample until you find a satisfactory solution.
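To make the proposal concrete, here is a minimal sketch of that sampling loop, under the assumption that “solutions that meet your constraints better are more likely” cashes out as soft rejection sampling; the variables, constraints, and “satisfactory” test are all invented placeholders.

```python
import math
import random

def sample_candidate():
    # All variables vary randomly within broad, physically plausible ranges.
    return {"x": random.uniform(0, 100), "y": random.uniform(0, 100)}

def constraint_violation(c):
    # Toy constraints: x + y should be near 100, and y should stay below 60.
    return abs(c["x"] + c["y"] - 100) + max(0.0, c["y"] - 60)

def satisfactory(c):
    # Stand-in for the human 'good enough' judgment.
    return constraint_violation(c) < 1.0

def sample_until_satisfactory(max_tries=100_000, temperature=5.0):
    for _ in range(max_tries):
        c = sample_candidate()
        # Candidates that meet the constraints better are kept more often.
        if random.random() < math.exp(-constraint_violation(c) / temperature):
            if satisfactory(c):
                return c
    return None

print(sample_until_satisfactory())
```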
This objection reads as both rude and wrong to me. First, it seems rude to link to a paper to explain a term when that paper neither explains nor embodies that term; it reads as just trying to browbeat your audience into submission.
Second, it seems wrong because the underlying issue that you’re unhappy with—a constraint or objective that we consider important is left out—is not solved by sampling from the solution space instead of deterministically finding optimal points. There’s no morality from randomness! The methodology of ‘preferentially sample until you find a satisfactory solution’ is not appreciably better than ‘iteratively add constraints until you find a satisfactory solution,’ and to my eyes it’s considerably worse. (Specifically, what groups of optimization methods are you comparing, and what feature of the latter group makes them superior on what metric?)
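For contrast, a sketch of the ‘iteratively add constraints until you find a satisfactory solution’ loop, with made-up candidate plans and a stand-in for the human review step.

```python
candidates = [
    {"name": "plan_a", "score": 9.0, "gallons_of_vinegar": 500},
    {"name": "plan_b", "score": 7.0, "gallons_of_vinegar": 0},
    {"name": "plan_c", "score": 5.0, "gallons_of_vinegar": 1},
]

def review(plan):
    # Stand-in for a human noticing that an implicit constraint was violated.
    # Returns None if the plan is acceptable, otherwise a new constraint
    # (a predicate) that rules out what went wrong.
    if plan["gallons_of_vinegar"] > 2:
        return lambda p: p["gallons_of_vinegar"] <= 2
    return None

constraints = []
while True:
    feasible = [c for c in candidates if all(ok(c) for ok in constraints)]
    best = max(feasible, key=lambda c: c["score"])
    new_constraint = review(best)
    if new_constraint is None:
        break
    constraints.append(new_constraint)

print(best["name"])  # plan_b, once the vinegar constraint has been added
```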
And if your complaint really is “maximization has problems,” then say that maximization has problems, not that sampling from a distribution (which, in the limit, maximizes) solves those problems.
Sorry for any apparent rudeness.
“There is no morality from randomness” is not exactly dealing with the point under contest. I am effectively claiming that one should treat selection of social policies as a constraint-satisfaction problem, precisely because treating it as an optimization problem throws out subconscious constraints by default, which makes optimization methods mostly useless when we can’t directly write down precisely the one and only objective function we care about.
So, there’s a class of problems where the hard part is finding a solution that satisfies all the constraints (i.e. a feasible solution). “Is it possible to pack the boxes on this list into a truck following these rules?” For those, it’s generally better to use optimization methods than generic sampling or satisfiability methods, because they can provide near-feasible solutions (“this is the plan that gets the most boxes on the truck”) and can be much faster.
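A brute-force sketch of that point, with made-up box sizes: frame the packing question as “maximize the number of boxes loaded” and you get a useful near-feasible answer even when the full set doesn’t fit.

```python
from itertools import combinations

box_volumes = [4, 8, 1, 4, 2, 1]   # hypothetical boxes
truck_capacity = 10

# Try the largest plans first; the first plan that fits is a max-count plan.
best_plan = ()
for r in range(len(box_volumes), 0, -1):
    for plan in combinations(range(len(box_volumes)), r):
        if sum(box_volumes[i] for i in plan) <= truck_capacity:
            best_plan = plan
            break
    if best_plan:
        break

print(f"Feasible to pack everything? {len(best_plan) == len(box_volumes)}")
print(f"Best plan packs boxes {best_plan}, "
      f"volume {sum(box_volumes[i] for i in best_plan)}/{truck_capacity}")
```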
But I don’t think that’s the problem class under discussion, which is some mixture of “what social policies should we support / what should we do with our charitable energy.” If someone says, “I want to reduce the damage done by blindness, please advise,” they’re talking about a maximization problem with many feasible solutions, not a feasibility problem, because it’s easy to come up with a very broad range of things they could do to reduce the damage done by blindness.
The approach you’re recommending seems like it would cash out as “well, I got a list of charities with ‘blind’ in their name, and randomly sampled five from that list. Maybe you should donate to one of them!”
Another way to look at this is ‘absolute constraints’ vs. ‘relative constraints.’ The absolute constraints are the same regardless of what solutions exist (or don’t), while the relative constraints are defined only in terms of other solutions. The core insight of EA is that it makes sense to take relative constraints into account when doing charitable donations: it does more good to donate to more effective charities. If we discover that one health charity generates a QALY for a thousand dollars, then we can implicitly add the constraint that all health charities have to generate at least one QALY for every thousand dollars we give them.
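A worked version of that with invented numbers: the best known cost-effectiveness becomes an implicit constraint that screens everything else.

```python
charities = {              # hypothetical dollars per QALY
    "charity_a": 1_000,
    "charity_b": 4_000,
    "charity_c": 50_000,
}

best_cost_per_qaly = min(charities.values())          # $1,000 per QALY
# Implicit relative constraint: at least 1 QALY per $1,000 donated, i.e.
# cost per QALY no worse than the best option we currently know about.
still_worth_considering = {
    name: cost for name, cost in charities.items()
    if cost <= best_cost_per_qaly
}
print(still_worth_considering)   # only charity_a survives the constraint
```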
I agree that there’s reason to be suspicious of automatically generated relative constraints, but I think that there are better approaches to take to resolving that suspicion than moving to pure sampling.
I’ve only glanced at the paper, but it looks to me like it does embody the term (“probabilistic constraint methods”). What am I missing?
So, “probabilistic constraint methods” is a trigram that I don’t think has an established unique technical meaning, and the various bigrams that can be formed from it don’t seem quite correct to me. I suppose the thing I’m objecting to the most is the absence of ‘sampling,’ but even then it has to be clear that the thing being sampled is points from the distribution, rather than constraints from the constraint set.
At least in the plate example, they aren’t “probabilistic constraints” in the sense that the constraints are satisfied only with some probability.* The way the method works is that they put a ‘uniform’ prior over ‘all’ layouts (both those that satisfy the constraints and those that don’t), then update on the fact that the layout satisfied the constraints to get a posterior distribution. (The thing that seems to be specific to that paper is that they have a reasonable prior distribution over open layouts, which allows them to extend the method to that domain.)
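Here’s roughly how I read that conditioning step, as a rejection-sampling sketch (the layout representation and constraints are toys, not the paper’s): draw layouts from a constraint-ignorant prior and keep only the draws that satisfy the constraints, which leaves you with samples from the posterior.

```python
import random

def sample_layout_from_prior():
    # 'Uniform' prior over all layouts, feasible or not: three plates on a 1x1 table.
    return [(random.random(), random.random()) for _ in range(3)]

def satisfies_constraints(layout, min_gap=0.3):
    # Toy constraint: no two plates closer than min_gap.
    return all(
        (ax - bx) ** 2 + (ay - by) ** 2 >= min_gap ** 2
        for i, (ax, ay) in enumerate(layout)
        for (bx, by) in layout[i + 1:]
    )

# Condition on the constraints by discarding draws that violate them.
posterior_samples = []
while len(posterior_samples) < 100:
    layout = sample_layout_from_prior()
    if satisfies_constraints(layout):
        posterior_samples.append(layout)

print(len(posterior_samples), "layouts sampled from the posterior")
```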
(A related idea is also a thing in optimization—start with a prior distribution over feasible solutions, update with a likelihood function that’s the utility function to get a posterior, and then try to estimate the mode of the distribution using standard statistical tools like MCMC.)
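A minimal sketch of that idea, with an invented one-dimensional utility: treat exp(utility) as the likelihood, run random-walk Metropolis, and report the best point visited as a crude estimate of the mode.

```python
import math
import random

def utility(x):
    # Toy objective with its maximum at x = 3.
    return -(x - 3.0) ** 2

def metropolis_mode_estimate(steps=10_000, proposal_scale=0.5):
    x = random.uniform(-10, 10)   # initial draw; flat prior, bounds ignored for simplicity
    best_x, best_u = x, utility(x)
    for _ in range(steps):
        proposal = x + random.gauss(0, proposal_scale)
        du = utility(proposal) - utility(x)
        # Accept with probability min(1, exp(du)), i.e. the likelihood ratio
        # when the likelihood is taken to be exp(utility).
        if du >= 0 or random.random() < math.exp(du):
            x = proposal
            if utility(x) > best_u:
                best_x, best_u = x, utility(x)
    return best_x

print(metropolis_mode_estimate())   # should land near 3
```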
eli_sennesh’s primary point, as far as I can tell, is that we should sample from the feasible region and then use our human judgment on a population of candidates, rather than trying to optimize using machine judgment and then only consider the one candidate it produces. But none of the three words of the trigram deal directly with that claim!
*Though the method can handle that gracefully, for the obvious reasons.
For what it’s worth, that’s my impression too: “probabilistic constraint methods” doesn’t seem to have an established technical meaning. So I take it Eli is coining his own term; I don’t see anything wrong with that.
I take it you mean that you’d like the 3-word description to include “sampling”, rather than that the 3-word description implies sampling that isn’t being done (which is how I first misinterpreted your comment!). I agree that a description with the word “sampling” in it might have been more informative, though probably necessarily longer too.
I was parsing the phrase as (probabilistic (constraint methods)) rather than ((probabilistic constraint) methods) and therefore wasn’t expecting to see the constraints being satisfied only with some probability.
Anyway: It’s possible that Eli didn’t choose the best possible 3-word description for the class of methods he had in mind. But that seems a quite different complaint than that the paper doesn’t embody the term as Eli meant it.