Gurkenglas comments on Contest: $1,000 for good questions to ask to an Oracle AI

Gurkenglas 27 Aug 2019 2:04 UTC
3 points
In the worst case, the UFAIs cooperate and choose their ten list entries adversarially to our protocol. The subspace of Message^100 within which we might as well assume that we get the worst outcome lies in a ball of radius 10, aka the sample lists that can be reached in ten adjacency steps from some list.
My differential privacy knowledge comes from a broader university course. Its idea is that you are managing private data, perhaps medical, and want to be able to answer database queries without compromising individuals. So you randomize query answers, to guarantee that changing an individual entry makes little difference to the probabilities of the possible query outputs. If going one adjacency step multiplies the probability of each possible output by between 0.99 and 1.01, ten steps will stay within [0.99^10, 1.01^10]. The same theory should let you make sure you can’t hear Cthulhu’s murmurs in the noise. If each output probability can be changed only by a factor of two by moving within the ball, that’s one bit of optimization power for the UFAIs.
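To make the arithmetic concrete, here is a minimal sketch of the ten-step bound and the "bits of optimization power" reading. The function names are made up for illustration; the only inputs taken from the comment are the per-step factors 0.99 and 1.01, the radius 10, and the convention that a factor of two counts as one bit.

```python
import math

def group_bounds(lo, hi, steps):
    """Range of multiplicative change in an output probability after
    `steps` adjacency steps, if each step stays within [lo, hi]."""
    return lo ** steps, hi ** steps

def bits_of_optimization(max_factor):
    """Bits of optimization power, reading a factor of two as one bit."""
    return math.log2(max_factor)

# Ten adversarial entries, each worth one adjacency step.
lo10, hi10 = group_bounds(0.99, 1.01, 10)
print(lo10, hi10)
print(bits_of_optimization(hi10))
```

Under these assumptions the ten-step range is roughly 0.904 to 1.105, which corresponds to about 0.14 bits.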
An easy way to guarantee that a query which counts some subset of your list is differentially private is to add noise drawn from a Laplace distribution: with the right scale, every step to the side divides the probability by 1.01.
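A rough sketch of such a noisy counting query follows. It swaps in the discrete analogue of the Laplace distribution (a two-sided geometric distribution), since a count is an integer; that substitution, the function names, and the toy `sounds_panicked` predicate are assumptions for illustration, not part of the original comment. For the continuous Laplace density, the same factor of 1.01 per unit step would correspond to a scale of 1/ln(1.01), about 100.5.

```python
import math
import random

RATIO = 1.01  # per-step factor from the comment

def noise_pmf(k, ratio=RATIO):
    """P(noise = k) for the two-sided geometric distribution:
    each step away from zero divides the probability by `ratio`."""
    alpha = 1.0 / ratio
    return (1 - alpha) / (1 + alpha) * alpha ** abs(k)

def sample_geometric(alpha, rng):
    """Sample G >= 0 with P(G = g) = (1 - alpha) * alpha**g."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
    return int(math.floor(math.log(u) / math.log(alpha)))

def noisy_count(messages, predicate, ratio=RATIO, rng=random):
    """Count the messages matching `predicate`, plus two-sided geometric noise.
    Changing one message moves the true count by at most 1, so each output
    probability changes by a factor of at most `ratio`."""
    true_count = sum(1 for m in messages if predicate(m))
    alpha = 1.0 / ratio
    # The difference of two i.i.d. geometric samples is two-sided geometric.
    noise = sample_geometric(alpha, rng) - sample_geometric(alpha, rng)
    return true_count + noise

# Hypothetical "dumb text processor": a message sounds panicked if it shouts.
def sounds_panicked(message):
    return message.isupper()

sample_list = ["all quiet", "RUN", "nothing to report", "HELP US"]
print(noisy_count(sample_list, sounds_panicked))
```

With a factor this close to 1 the noise is wide (standard deviation on the order of a hundred), so on a list of 100 messages the answer is mostly noise; that is the price of the tight bound.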
I had hoped a reader of the first paragraph would see that there should be safer queries than sampling the distribution, and would perhaps need a google keyword like differential privacy to design one. Then I expected to be asked for more detail, so I added a paragraph that I hoped would be understandable as-is. :(
- Wei Dai 27 Aug 2019 5:29 UTC
  2 points
  Parent
  Ok, I think I understand now. But it seems like imposing a differential privacy constraint on the query makes many desirable uses of counterfactual oracles (such as all of my submissions) impossible. Correct? You gave the example of “asking how many messages sound panicked to some dumb text processor” but that doesn’t seem hugely useful. Do you have any other ideas?
  - Gurkenglas 27 Aug 2019 11:32 UTC
    3 points
    Parent
    We can make any query differentially private given a metric map (one that doesn’t increase any distance) from Message^100 to a space of possible query outputs. Set the probability mass of each query output to 0.99^(steps removed from the default answer). (Then normalize.)
    For the identity metric map of a human just trying to read the sample list, this scrambles it entirely. The metric map image needs to branch less combinatorially for this not to happen.
    One metric d(a,b) on the image of the metric map that comes to mind is one that bounds the utility cost of getting answer b instead of a. For example, we could ask the counterfactual humans to send back stock market prices, and try to calculate a trading policy that is profitable even if some forecasts are fake. And then, whoever is willing to assume the lowest UFAI probability wins the market! x(
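The "weight by 0.99^(steps), then normalize" construction from Gurkenglas's reply can be sketched for a finite output space as below. It resembles what the differential privacy literature calls the exponential mechanism. The function names, the finite `outputs` list, and the reading of "default answer" as the metric map's image of the actual sample list (`center`) are assumptions made for illustration.

```python
import random

def private_distribution(outputs, distance, center, base=0.99):
    """Probability of each output: base**(distance from center), normalized."""
    weights = [base ** distance(center, o) for o in outputs]
    total = sum(weights)
    return [w / total for w in weights]

def sample_output(outputs, distance, center, base=0.99, rng=random):
    """Draw one randomized query answer."""
    probs = private_distribution(outputs, distance, center, base)
    return rng.choices(outputs, weights=probs, k=1)[0]

# Toy output space: a count between 0 and 100, with |a - b| as the metric.
outputs = list(range(101))
count_distance = lambda a, b: abs(a - b)
print(sample_output(outputs, count_distance, center=50))
```

One caveat on the guarantee: moving the input by one step shifts the center by at most one step, which changes each unnormalized weight by a factor between 0.99 and 1/0.99, but the normalizing constant can shift by a similar factor. In general the per-step bound is therefore 0.99^2 to 0.99^-2, and it tightens to a single factor only when the normalizer does not depend on the center.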