Just want to draw out and highlight something mentioned in passing in the “You want to solve a problem in as much generality as possible...” section. Not only would it be great if you could solve a problem in the worst case, the worst-case assumption is also often radically easier to think about than realistic cases. In some sense the worst-case assumption is the second-simplest assumption you could possibly make about the empirical situation (the simplest being the best-case assumption: “this problem never comes up”). My understanding is that proving theorems about average-case phenomena is a huge pain and often comes long after proofs of worst-case bounds.
I think this probably depends on the field. In machine learning, solving problems under worst-case assumptions is usually impossible because of the no free lunch theorem. You might assume that a particular facet of the environment is worst-case, which is a totally fine thing to do, but I don’t think it’s correct to call it the “second-simplest assumption”, since there are many choices of what facet of the environment is worst-case.
One keyword for this is “partial specification”, e.g. here is a paper I wrote that makes a minimal set of statistical assumptions and worst-case assumptions everywhere else: https://arxiv.org/abs/1606.05313. (Unfortunately the statistical assumptions are not really reasonable so the method was way too brittle in practice.) This kind of idea is also common in robust statistics. But my take would not be that it is simpler—in general it is way harder than just working with the empirical distribution in front of you.
My interpretation of the NFL theorems is that solving the relevant problems under worst-case assumptions is too easy, so easy it’s trivial: a brute-force search satisfies the criterion of worst-case optimality. So, that being settled, in order to make progress, we have to step up to average-case evaluation, which is harder.
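To make the “so easy it’s trivial” point concrete, here is a toy enumeration of my own (not from the comment; it assumes performance means “best value seen after k black-box evaluations” on a tiny domain): averaged over all possible functions, every fixed evaluation order performs identically, and the worst case is identically bad for all of them, so plain brute force already attains worst-case optimality.

```python
# Toy sketch (illustrative assumption: performance = best value seen after k
# black-box evaluations). Over ALL functions on a tiny domain, every fixed
# non-repeating evaluation order has the same average performance (NFL flavor)
# and the same, maximally bad, worst-case performance, so brute force is
# already "worst-case optimal".
from itertools import product

X = range(4)                                     # tiny search domain
Y = range(3)                                     # possible objective values
all_functions = list(product(Y, repeat=len(X)))  # every f: X -> Y, as value tuples

def best_after_k(f, order, k):
    """Best objective value seen after evaluating the first k points of `order`."""
    return max(f[x] for x in order[:k])

def summarize(order, k):
    scores = [best_after_k(f, order, k) for f in all_functions]
    return min(scores), sum(scores) / len(scores)    # worst case, average case

for order in [(0, 1, 2, 3), (3, 1, 0, 2)]:           # two arbitrary fixed search orders
    worst, avg = summarize(order, k=2)
    print(order, "worst:", worst, "average:", round(avg, 3))
# Both orders print identical numbers: worst-case (and all-functions-average)
# performance cannot distinguish search strategies here.
```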
(However, I agree that once we already need to do some averaging, making explicit and stripping down the statistical assumptions and trying to get closer to worst-case guarantees—without making the problem trivial again—is harder than just evaluating empirically against benchmarks.)
Finding the min-max solution might be easier, but what we actually care about is an acceptable solution. My point is that the min-max solution, in most cases, will be unacceptably bad.
And in fact, since $\min_x f(\theta, x) \le \mathbb{E}_x[f(\theta, x)]$, any solution that is acceptable in the worst case is also acceptable in the average case.
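Spelling out that step (my phrasing, under the assumption that “acceptable” means the objective stays at or above some threshold $\tau$):

$$\min_x f(\theta, x) \ge \tau \;\Longrightarrow\; \mathbb{E}_x\big[f(\theta, x)\big] \ge \min_x f(\theta, x) \ge \tau,$$

since an expectation of $f$ over $x$ can never fall below its minimum over the support of $x$.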
Agreed: although optimizing for the worst case is usually easier than optimizing for the average case, satisficing for the worst case is necessarily at least as hard as satisficing for the average case (and, in ML, typically impossible).
Even in ML it seems like it depends on how you formulate your problem/goal. Making good predictions in the worst case is impossible, but achieving low regret in the worst case is sensible. (Though still less useful than just “solve existing problems and then try the same thing tomorrow,” and generally I’d agree “solve an existing problem for which you can verify success” is the easiest thing to do.) Hopefully having your robot not deliberately murder you is a similarly sensible goal in the worst case, though it remains to be seen whether it’s feasible.
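As a concrete instance of the “low regret in the worst case” framing (my own sketch, not from the comment: the standard exponential-weights / Hedge algorithm, with losses assumed to lie in $[0,1]$ and a textbook learning-rate choice), the regret against the best fixed expert is $O(\sqrt{T \log N})$ for every loss sequence, with no distributional assumptions about the environment:

```python
# Minimal sketch of exponential weights / Hedge (assumes full-information
# per-round losses in [0, 1]). Its regret against the best single expert is
# O(sqrt(T log N)) for EVERY loss sequence, i.e. a worst-case guarantee.
import numpy as np

def hedge(losses, eta):
    """losses: (T, N) array of per-round expert losses in [0, 1]."""
    T, N = losses.shape
    log_w = np.zeros(N)                  # log-weights, for numerical stability
    total_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                     # current distribution over experts
        total_loss += p @ losses[t]      # expected loss incurred this round
        log_w -= eta * losses[t]         # exponential-weights update
    return total_loss - losses.sum(axis=0).min()   # regret vs. best fixed expert

T, N = 10_000, 20
rng = np.random.default_rng(0)
losses = rng.integers(0, 2, size=(T, N)).astype(float)  # arbitrary 0/1 losses
eta = np.sqrt(np.log(N) / T)             # standard learning rate (up to constants)
print("regret:", hedge(losses, eta), "sqrt(T log N):", np.sqrt(T * np.log(N)))
```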
To elaborate this formally:
$\max_\theta \max_x f(\theta, x)$ is best-case,
$\max_\theta \min_x f(\theta, x)$ is worst-case,
$\max_\theta \mathbb{E}_x[f(\theta, x)]$ is average-case.
$\max$ and $\min$ are both “easier” monoids than $\mathbb{E}$, essentially because of dominance relations; for any $\theta$, there is going to be a single $x$ that dominates all others, in the sense that all other $x' \neq x$ can be excluded from consideration and have no impact on the outcome. Whereas when calculating $\mathbb{E}$, the only $x'$ that can be excluded are those outside the distribution’s support.
$\max$ is even easier than $\min$ because it commutes with the outer $\max$; not only is there a single $x$ that dominates all others, it doesn’t necessarily even depend on $\theta$ (the problem can be solved as $\max_x \max_\theta f(\theta, x)$ or $\max_{\theta, x} f(\theta, x)$). As a concrete example, the best case for nearly any sorting algorithm is already-sorted input, whereas the worst case depends more on which algorithm is being examined.
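A small numerical sketch of my own (an arbitrary $f$ on finite $\theta$- and $x$-grids, chosen purely for illustration) of both points: the three objectives above, and the fact that the inner $\max$ commutes with the outer $\max$ while the inner $\min$ does not.

```python
# Toy sketch: f[theta, x] on arbitrary finite grids (illustrative assumption).
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=(5, 7))       # 5 theta values, 7 x values
p_x = np.full(7, 1 / 7)           # distribution over x for the average case

best_case    = f.max(axis=1).max()   # max_theta max_x f(theta, x)
worst_case   = f.min(axis=1).max()   # max_theta min_x f(theta, x)
average_case = (f @ p_x).max()       # max_theta E_x[f(theta, x)]
print(best_case, worst_case, average_case)   # worst <= average <= best

# max commutes with the outer max: swapping the order changes nothing,
# so the dominating x need not depend on theta.
assert np.isclose(f.max(axis=1).max(), f.max(axis=0).max())

# min does NOT commute: max_theta min_x f <= min_x max_theta f, usually strictly,
# because the worst x depends on theta.
print(f.min(axis=1).max(), f.max(axis=0).min())
```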