I had a little tinker with this. It’s straightforward to choose a utility function where maximising it is equivalent to minimising H(Z) - just set U(z) = log p(z).
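A quick numerical check of this direction (a minimal sketch; the pmf here is made up):

```python
import math

# A hypothetical pmf over outcomes z (any finite distribution works).
p = {"a": 0.5, "b": 0.3, "c": 0.2}

# Entropy H(Z) = -sum_z p(z) log p(z)
H_Z = -sum(pz * math.log(pz) for pz in p.values())

# Expected utility with U(z) = log p(z)
E_U = sum(pz * math.log(pz) for pz in p.values())

# E_z U(z) = -H(Z), so maximising E_z U(z) is minimising H(Z)
assert abs(E_U + H_Z) < 1e-12
```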
As far as I can see, the other way round is basically as you suggested, but a tiny bit more fiddly. We can indeed produce a nonpositive U′ (nonpositive so that −U′ can be an entropy) from which we make a new RV Z′ with H(Z′|z) = −U′(z) as you suggested (e.g. with coinflips etc). But a simple shift of U isn’t enough. We need U′(z) = mU(z) − log p(z) + c (for some scalars m > 0 and c) - note the log p(z) term.
We take the z′ outcomes to be partitioned by z, i.e. they’re ‘z happened and also I got z′ coinflip outcome’. Then P(z′) = P(z)P(z′|z) (where z is understood to be the particular z associated with z′). That means H(Z|Z′) = 0, so H(Z′) = H(Z) + H(Z′|Z) (you can check this by spelling things out pointwise and rearranging, but I realised that I was just rederiving the conditional entropy laws).
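A sketch of this construction in code, under the simplifying assumption that each target conditional entropy −U′(z) fits in a single biased coinflip (i.e. lies in [0, log 2]); the pmf and the target entropies are made up:

```python
import math

def binary_entropy(q):
    """Entropy (in nats) of a coin with heads-probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

def bias_for_entropy(h, tol=1e-12):
    """Find q in [0, 0.5] with binary_entropy(q) = h, by bisection.
    Requires 0 <= h <= log 2."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical pmf over z, and target conditional entropies -U'(z).
p = {"a": 0.5, "b": 0.3, "c": 0.2}
neg_U_prime = {"a": 0.2, "b": 0.5, "c": 0.1}  # each in [0, log 2]

# Build Z' = (z, coinflip outcome) with H(Z'|z) = -U'(z).
joint = {}
for z, pz in p.items():
    q = bias_for_entropy(neg_U_prime[z])
    joint[(z, "heads")] = pz * q          # P(z') = P(z) P(z'|z)
    joint[(z, "tails")] = pz * (1 - q)

def entropy(dist):
    return -sum(v * math.log(v) for v in dist.values() if v > 0)

H_Z = entropy(p)
H_Zp = entropy(joint)
E_cond = sum(p[z] * neg_U_prime[z] for z in p)  # H(Z'|Z)

# Chain rule with H(Z|Z') = 0: H(Z') = H(Z) + H(Z'|Z)
assert abs(H_Zp - (H_Z + E_cond)) < 1e-6
```

Targets outside [0, log 2] would need more than one coinflip per z, but the identity being checked is the same.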
Then

−H(Z′) = −H(Z) − H(Z′|Z) = −H(Z) + E_z U′(z) = −H(Z) + m E_z U(z) − E_z log p(z) + c = m E_z U(z) + c

(using E_z log p(z) = −H(Z) for the last step),
so minimising H(Z′) is equivalent to maximising E_z U(z) (provided m > 0).
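Checking that chain of equalities numerically (the pmf, U, m and c below are arbitrary made-up choices, with U bounded above and c negative enough that U′ ≤ 0 everywhere):

```python
import math

# Hypothetical pmf and bounded utility; m > 0, c chosen so U'(z) <= 0.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
U = {"a": 1.0, "b": 0.0, "c": -2.0}
m, c = 0.1, -2.0

# U'(z) = m U(z) - log p(z) + c, which must be nonpositive
U_prime = {z: m * U[z] - math.log(p[z]) + c for z in p}
assert all(u <= 0 for u in U_prime.values())

H_Z = -sum(pz * math.log(pz) for pz in p.values())
E_U = sum(p[z] * U[z] for z in p)

# H(Z') = H(Z) + H(Z'|Z), with H(Z'|z) = -U'(z)
H_Zp = H_Z + sum(p[z] * (-U_prime[z]) for z in p)

# The derivation: -H(Z') = m E_z U(z) + c
assert abs(-H_Zp - (m * E_U + c)) < 1e-12
```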
Requiring U′(z) = mU(z) − log p(z) + c to be nonpositive for all z presumably places further constraints on things? Certainly U needs to be bounded above, as you said. It’s also a bit weird and embedded, as you hinted, because this utility function depends on the probability of the outcome, which is exactly the thing being controlled/regulated by the decisioner. I don’t know if there are systems where this might not be well-defined even for bounded U; I haven’t dug into it.