I repeat the construction using each element of R in turn in place of ⊥.
For each such u′, ψ(u′) spits out a set. I pick the u′ producing the set which is closer to Xlegal.
For finite sets, I could look at the size of the intersection; more in general I’ll need some metric on P(X) to say what it means to be close to Xlegal.
Even with finite sets, it doesn’t work because the idea to look for “closest to Xlegal” is not what we looking for.
Let X={Alice;Bob;Charlie} a class of students, scoring within a range R={0;1;2;3;4;5}, u:X→R.
Let ψ the (uniform) better-than-average optimizer, standing for the professor picking any student who scores better than the mean.
Let Xlegal={Alice;Bob} (the professor ψ despises Charlie and ignores him totally).
If u(Alice) = 5 and u(Bob) = 4, their average is 4.5 so only Alice should be picked by the constrained optimisation.
Howewer, with your proposal, you run into trouble with u’ defined as u’(Alice) = u(Alice), u’(Bob) = u(Bob), and u’(Charlie) = 0.
The average value for this u’ is (5+4+0)/3=3, and both Alice and Bob scores better than 3: ψ(u′)={Alice;Bob}=Xlegal. The size of the intersection with Xlegal is then maximal, so your proposal suggests to pick this set as the result. But the actual result should be {Alice}, because the score of Charlie is irrelevant to constrained optimisation.
If I understand correctly, what you want must be more like “restrict the domain of the task before plugging it into the optimiser,” and less like “restrict the output of the optimiser.”
I don’t know how to do that agnostically, however, because optimisers in general have the domain of the task baked in. Indeed the expression for a class of optimisers is JP(X,R), with X in it.
Considering better-than-average optimisers from your example, they are a class with a natural notion of “domain of the task” to tweak, so I can naturally map any initial optimiser to a new one with a restricted task domain: JP(X,R)→JP(Xlegal,R), by taking the mean over Xlegal.
But given a otherwise unspecified ψ∈JP(X,R), I don’t see a natural way to define a ψ′∈JP(Xlegal,R).
Assuming there’s no more elegant answer than filtering for that (ψ′(u)=ψ(u)∩Xlegal), then the question must be: is there another minimally restrictive class of optimisers with such a natural notion, which is not the one with the “detested element” ⊥ already proposed by the OP?
Try 1: consequentialist optimisers, plus the assumption u(Xlegal)=u(X), i.e., the legal moves do not restrict the possible payoffs. Then, since the optimiser picks actions only through u−1(r), for each r I can delete illegal actions from the preimage, without creating new broken outputs. However, this turns out to be just filtering, so it’s not an interesting case.
Try 2: the minimal distill of try 1 is that the output either is empty or contains legal moves already, and then I filter, so yeah not an interesting idea.
Try 3: invariance under permutation of something? A task invariant under permutation of x is just a constant task. An optimiser “invariant under permutation of X” does not even mean something.
Try 4: consider a generic map X→JP(X,R). This does not say anything, it’s the baseline.
Try 5: analyse the structure of a specific example. The better-than-average class of optimisers is ψ(u:X→R)={x:u(x)≥∑r∈Rr/|R|}. It is consequentialist and context-independent. I can’t see how to generalize something mesospecific here.
Exercise 6:
I repeat the construction using each element of R in turn in place of ⊥.
For each such u′, ψ(u′) spits out a set. I pick the u′ producing the set which is closer to Xlegal.
For finite sets, I could look at the size of the intersection; more in general I’ll need some metric on P(X) to say what it means to be close to Xlegal.
Does that make sense?
Even with finite sets, it doesn’t work because the idea to look for “closest to Xlegal” is not what we looking for.
Let X={Alice;Bob;Charlie} a class of students, scoring within a range R={0;1;2;3;4;5}, u:X→R. Let ψ the (uniform) better-than-average optimizer, standing for the professor picking any student who scores better than the mean.
Let Xlegal={Alice;Bob} (the professor ψ despises Charlie and ignores him totally).
If u(Alice) = 5 and u(Bob) = 4, their average is 4.5 so only Alice should be picked by the constrained optimisation.
Howewer, with your proposal, you run into trouble with u’ defined as u’(Alice) = u(Alice), u’(Bob) = u(Bob), and u’(Charlie) = 0.
The average value for this u’ is (5+4+0)/3=3, and both Alice and Bob scores better than 3: ψ(u′)={Alice;Bob}=Xlegal. The size of the intersection with Xlegal is then maximal, so your proposal suggests to pick this set as the result. But the actual result should be {Alice}, because the score of Charlie is irrelevant to constrained optimisation.
Ok, take 2.
If I understand correctly, what you want must be more like “restrict the domain of the task before plugging it into the optimiser,” and less like “restrict the output of the optimiser.”
I don’t know how to do that agnostically, however, because optimisers in general have the domain of the task baked in. Indeed the expression for a class of optimisers is JP(X,R), with X in it.
Considering better-than-average optimisers from your example, they are a class with a natural notion of “domain of the task” to tweak, so I can naturally map any initial optimiser to a new one with a restricted task domain: JP(X,R)→JP(Xlegal,R), by taking the mean over Xlegal.
But given a otherwise unspecified ψ∈JP(X,R), I don’t see a natural way to define a ψ′∈JP(Xlegal,R).
Assuming there’s no more elegant answer than filtering for that (ψ′(u)=ψ(u)∩Xlegal), then the question must be: is there another minimally restrictive class of optimisers with such a natural notion, which is not the one with the “detested element” ⊥ already proposed by the OP?
Try 1: consequentialist optimisers, plus the assumption u(Xlegal)=u(X), i.e., the legal moves do not restrict the possible payoffs. Then, since the optimiser picks actions only through u−1(r), for each r I can delete illegal actions from the preimage, without creating new broken outputs. However, this turns out to be just filtering, so it’s not an interesting case.
Try 2: the minimal distill of try 1 is that the output either is empty or contains legal moves already, and then I filter, so yeah not an interesting idea.
Try 3: invariance under permutation of something? A task invariant under permutation of x is just a constant task. An optimiser “invariant under permutation of X” does not even mean something.
Try 4: consider a generic map X→JP(X,R). This does not say anything, it’s the baseline.
Try 5: analyse the structure of a specific example. The better-than-average class of optimisers is ψ(u:X→R)={x:u(x)≥∑r∈Rr/|R|}. It is consequentialist and context-independent. I can’t see how to generalize something mesospecific here.
Time out.