Note that even in the extreme case where our approach to AI alignment would be completely different for different values of some unknown details, the speedup from knowing them in advance is at most 1/(probability of most likely possibility).
Shouldn’t this be ‘probability of least likely possibility’?
Suppose we have an unknown detail X with n possible values. What fraction of the total effort should we spend on the approach for X=i? I guess Pr(X=i) if we work on all approaches in parallel. For example, if value x_2 has probability 1/3, we should spend 1/3 of the total effort on the approach for value x_2.
What happens when we find out the true value of X? Let the true value be x∗. Then we can concentrate all the effort on the approach for x∗. Whereas previously the fraction of the effort for value x∗ was Pr(X=x∗), it’s now 1. Therefore the speedup is 1/Pr(X=x∗). When is this greatest? When Pr(X=x∗) is least.
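A quick numerical sketch of this reading, with a made-up distribution over X (the numbers and the helper function are just for illustration):

```python
# Parallel-effort reading: effort is split across approaches in proportion to
# Pr(X = i); learning the true value x* lets all effort move onto that approach.
probs = {"x1": 0.6, "x2": 0.3, "x3": 0.1}  # made-up distribution over X

def speedup(x_star):
    """Effort on the correct approach goes from Pr(X = x*) to 1,
    so the speedup factor is 1 / Pr(X = x*)."""
    return 1.0 / probs[x_star]

for x, p in probs.items():
    print(f"Pr(X = {x}) = {p:.1f}  ->  speedup {speedup(x):.2f}")
# The speedup is largest (10x) when the true value is the least likely one.
```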
I’m imagining the first marginal unit of effort, which you’d apply to the most likely possibility. Its expected impact is reduced by that highest probability.
If you get unlucky, then your actual impact might be radically lower than if you had known what to work on.
Thanks for the clarification! Now I understand where the value comes from.
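To check my understanding, here is the marginal-unit version with the same made-up numbers:

```python
# Marginal-unit reading: without knowing X, the first unit of effort goes on the
# most likely possibility, so its *expected* impact is Pr(most likely) * impact.
# Knowing X in advance recovers the full impact, so the expected speedup is
# 1 / Pr(most likely possibility), matching the quoted claim.
probs = [0.6, 0.3, 0.1]        # same made-up distribution over X
impact_if_correct = 1.0        # normalised impact of the unit when the bet pays off

expected_without_knowing = max(probs) * impact_if_correct   # 0.6
expected_knowing = impact_if_correct                        # 1.0
print(expected_knowing / expected_without_knowing)          # ~1.67 = 1 / 0.6
```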
However, in such a situation, would we only work on the most likely possibility? I agree that a single person does better by concentrating. But a group of researchers would work on each of the more likely approaches in order to mitigate risk.
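As a toy illustration of the risk-mitigation point (assuming diminishing, square-root returns to effort, which is my assumption rather than anything from the post):

```python
import math

# Toy model: the group splits one unit of effort across approaches, and the
# payoff from approach i is sqrt(effort_i) if X = i, else 0 (the diminishing
# returns are my assumption). Compare concentrating on the most likely value
# with splitting effort in proportion to the probabilities.
probs = [0.6, 0.3, 0.1]  # made-up distribution over X

concentrate = max(probs) * math.sqrt(1.0)            # 0.60
split = sum(p * math.sqrt(p) for p in probs)         # ~0.66
print(f"concentrate: {concentrate:.2f}, proportional split: {split:.2f}")
# With concave returns, spreading effort over the likelier values has higher
# expected payoff, which is the diversification argument above.
```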