jessicat comments on FAI Research Constraints and AGI Side Effects

jessicat 9 Jun 2015 6:15 UTC
3 points
(BTW: here’s a writeup of one of my ideas for writing planning queries that you might be interested in)

Often we want a model where the probability of taking action a is proportional to p(a)e^E[U(x, a)], where p is the prior over actions, x consists of some latent variables, and U is the utility function. The straightforward way of doing this fails:
```
query {
      . a ~ p()
      . x ~ P(x)
      . factor(U(x, a))
}
```
Note that I’m assuming factor takes a log probability as its argument. This fails due to “wishful thinking”: it tends to prefer riskier actions. The problem can be reduced by taking more samples:
```
query {
. a ~ p()
. us = []
. for i = 1 to n
.  .  x_i ~ P(x)
.  .  us.append(U(x_i, a))
. factor(mean(us))
}
```
This does better, because since we took multiple samples, mean(us) is likely to be somewhat accurate. But how do we know how many samples to take? The exact query we want cannot be expressed with any finite n.

It turns out that we just need to sample n from a Poisson distribution and make some more adjustments:
```
query {
. a ~ p()
. n ~ Poisson(1)
. for i = 1 to n
. . x_i ~ P(x)
. . factor(log U(x_i, a))
}
```
Note that U must be non-negative. Why does this work? Consider:
```
P(a)    α     p(a) E[e^sum(log U(x_i, a) for i in range(n))]
               = p(a) E[prod(U(x_i, a) for i in range(n))]
               = p(a) E[ E[prod(U(x_i, a) for i in range(n)) | n] ]
               [here use the fact that the terms in the product are independent]
               = p(a) E[ E[U(x, a)]^n ]
               = p(a) sum(i=0 to infinity) E[U(x, a)]^i / i!
               [Taylor series!]
               = p(a) e^E[U(x, a)]
```
Ideally, this technique would help to perform inference in planning models where we can’t enumerate all possible states.
- [deleted] 9 Jun 2015 23:13 UTC
  0 points
  Parent
  Interesting! How does that compare to the usual implementations of planning as probabilistic inference, as exemplified below?
```
(query
. (define a (prior))
. (define x (lambda (a) (world a)))
. (define r (flip (sigmoid (U (x a)))))
. a
. (= r #t))
```
  - jessicat 10 Jun 2015 2:26 UTC
    3 points
    Parent
    Your model selects an action proportional to p(a) E[sigmoid(U) | a], whereas mine selects an action proportional to p(a) e^E[U | a]. I think the second is better, because it actually treats actions the same if they have the same expected utility. The sigmoid version will not take very high utilities or very low utilities into account much.
    
    Btw it’s also possible to select an action proportional to E[U | a]^n:
    
    query { . a ~ p() . for i = 1 to n . . x_i ~ P(x) . . factor(log U(x, a)) }
    - [deleted] 10 Jun 2015 23:38 UTC
      0 points
      Parent
      Could you explain your syntax here? What probabilistic programming language are you using?
      
      I think the second is better, because it actually treats actions the same if they have the same expected utility.
      
      Well so does the sigmoided version, but you are right that the sigmoid version won’t take very high or very low utilities into account. It’s meant to shoehorn unbounded utility functions into a framework where one normally works only with random variables.
      - jessicat 13 Jun 2015 7:00 UTC
        0 points
        Parent
        It’s not a specific programming language, I guess it’s meant to look like Church. It could be written as:
        
        (query . (define a (p)) . (foreach (range n) (lambda i) . . (define x (x-prior)) . . (factor (log (U x a)))))
        
        Well so does the sigmoided version
        
        It samples an action proportional to p(a) E[sigmoid(U) | a]. This can’t be written as a function of E[U | a].