If I understand correctly, what you want must be more like “restrict the domain of the task before plugging it into the optimiser,” and less like “restrict the output of the optimiser.”
I don’t know how to do that agnostically, however, because optimisers in general have the domain of the task baked in. Indeed the expression for a class of optimisers is JP(X,R), with X in it.
Considering better-than-average optimisers from your example, they are a class with a natural notion of “domain of the task” to tweak, so I can naturally map any initial optimiser to a new one with a restricted task domain: JP(X,R)→JP(Xlegal,R), by taking the mean over Xlegal.
But given a otherwise unspecified ψ∈JP(X,R), I don’t see a natural way to define a ψ′∈JP(Xlegal,R).
Assuming there’s no more elegant answer than filtering for that (ψ′(u)=ψ(u)∩Xlegal), then the question must be: is there another minimally restrictive class of optimisers with such a natural notion, which is not the one with the “detested element” ⊥ already proposed by the OP?
Try 1: consequentialist optimisers, plus the assumption u(Xlegal)=u(X), i.e., the legal moves do not restrict the possible payoffs. Then, since the optimiser picks actions only through u−1(r), for each r I can delete illegal actions from the preimage, without creating new broken outputs. However, this turns out to be just filtering, so it’s not an interesting case.
Try 2: the minimal distill of try 1 is that the output either is empty or contains legal moves already, and then I filter, so yeah not an interesting idea.
Try 3: invariance under permutation of something? A task invariant under permutation of x is just a constant task. An optimiser “invariant under permutation of X” does not even mean something.
Try 4: consider a generic map X→JP(X,R). This does not say anything, it’s the baseline.
Try 5: analyse the structure of a specific example. The better-than-average class of optimisers is ψ(u:X→R)={x:u(x)≥∑r∈Rr/|R|}. It is consequentialist and context-independent. I can’t see how to generalize something mesospecific here.
Ok, take 2.
If I understand correctly, what you want must be more like “restrict the domain of the task before plugging it into the optimiser,” and less like “restrict the output of the optimiser.”
I don’t know how to do that agnostically, however, because optimisers in general have the domain of the task baked in. Indeed the expression for a class of optimisers is JP(X,R), with X in it.
Considering better-than-average optimisers from your example, they are a class with a natural notion of “domain of the task” to tweak, so I can naturally map any initial optimiser to a new one with a restricted task domain: JP(X,R)→JP(Xlegal,R), by taking the mean over Xlegal.
But given a otherwise unspecified ψ∈JP(X,R), I don’t see a natural way to define a ψ′∈JP(Xlegal,R).
Assuming there’s no more elegant answer than filtering for that (ψ′(u)=ψ(u)∩Xlegal), then the question must be: is there another minimally restrictive class of optimisers with such a natural notion, which is not the one with the “detested element” ⊥ already proposed by the OP?
Try 1: consequentialist optimisers, plus the assumption u(Xlegal)=u(X), i.e., the legal moves do not restrict the possible payoffs. Then, since the optimiser picks actions only through u−1(r), for each r I can delete illegal actions from the preimage, without creating new broken outputs. However, this turns out to be just filtering, so it’s not an interesting case.
Try 2: the minimal distill of try 1 is that the output either is empty or contains legal moves already, and then I filter, so yeah not an interesting idea.
Try 3: invariance under permutation of something? A task invariant under permutation of x is just a constant task. An optimiser “invariant under permutation of X” does not even mean something.
Try 4: consider a generic map X→JP(X,R). This does not say anything, it’s the baseline.
Try 5: analyse the structure of a specific example. The better-than-average class of optimisers is ψ(u:X→R)={x:u(x)≥∑r∈Rr/|R|}. It is consequentialist and context-independent. I can’t see how to generalize something mesospecific here.
Time out.