evhub comments on Counting arguments provide no evidence for AI doom

evhub 4 Mar 2024 19:49 UTC
LW: 13 AF: 10
4
AF

I myself expected you to respond to this post with some ML-specific reasoning about simplicity and measure of parameterizations, instead of your speculation about a relationship between the universal measure and inductive biases. I spoke with dozens of people about the ideas in OP’s post, and none of them mentioned arguments like the one you gave. I myself have spent years in the space and am also not familiar with this particular argument about bitstrings.

That probably would have been my objection had the reasoning about priors in this post been sound, but since the reasoning was unsound, I turned to the formalism to try to show why it’s unsound.

If these are your real reasons for expecting deceptive alignment, that’s fine, but I think you’ve mentioned this rather infrequently.

I think you’re misunderstanding the nature of my objection. My claim isn’t that Solomonoff induction is my real reason for believing in deceptive alignment; it’s that the reasoning in this post is mathematically unsound, and I’m using the formalism to show why. If I weren’t responding to this post specifically, I probably wouldn’t have brought up Solomonoff induction at all.

This yields a perfectly well-defined counting argument over $F$.

we can parameterize such functions using the neural network parameter space

I’m very happy with running counting arguments over the actual neural network parameter space; the problem there is just that I don’t think we understand it well enough to do so effectively.

You could instead try to put a measure directly over the functions in your setup, but the problem there is that function space really isn’t the right space to run a counting argument like this; you need to be in algorithm space, otherwise you’ll do things like what happens in this post where you end up predicting overfitting rather than generalization (which implies that you’re using a prior that’s not suitable for running counting arguments on).
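
A toy sketch of the function-space versus parameter-space point, under loudly stated assumptions: the 3-bit target `x[0]`, the four training inputs, and the tiny threshold unit with weights in {-1, 0, 1} are all hypothetical choices made for illustration, not anything taken from the post. Counting uniformly over every Boolean function that fits the training set gives chance-level accuracy on held-out inputs (the overfitting prediction), while counting over the parameterizations of the small model does better than chance. Uniform counting over a weight grid is itself only a stand-in here; it is not a claim about how real networks are distributed.

```python
from itertools import product

# Hypothetical toy setup: all 3-bit inputs, target is the first bit.
INPUTS = list(product([0, 1], repeat=3))
TRAIN = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1)]
HELD_OUT = [x for x in INPUTS if x not in TRAIN]


def target(x):
    return x[0]


def fits_train(predict):
    """True if the predictor has zero error on the training inputs."""
    return all(predict(x) == target(x) for x in TRAIN)


def held_out_accuracy(predict):
    """Fraction of held-out inputs on which the predictor matches the target."""
    return sum(predict(x) == target(x) for x in HELD_OUT) / len(HELD_OUT)


def function_space_count():
    """Uniform count over all Boolean functions (lookup tables) that fit TRAIN."""
    accs = []
    for outputs in product([0, 1], repeat=len(INPUTS)):
        table = dict(zip(INPUTS, outputs))
        if fits_train(table.get):
            accs.append(held_out_accuracy(table.get))
    return len(accs), sum(accs) / len(accs)


def parameter_space_count():
    """Uniform count over a small weight grid for one threshold unit (an assumption)."""
    accs = []
    for w0, w1, w2, b in product([-1, 0, 1], repeat=4):
        def predict(x, w=(w0, w1, w2), b=b):
            return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)
        if fits_train(predict):
            accs.append(held_out_accuracy(predict))
    return len(accs), sum(accs) / len(accs)


if __name__ == "__main__":
    print("function space:", function_space_count())
    print("parameter space:", parameter_space_count())
```

In this setup the function-space count finds 16 consistent functions with mean held-out accuracy exactly 0.5, while the 3 consistent parameterizations average 2/3. The numbers mean nothing beyond the toy; the point is only that the two counts need not agree.
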

What links here?
- Many arguments for AI x-risk are wrong by TurnTrout (5 Mar 2024 2:31 UTC; 165 points)
- evhub's comment on Counting arguments provide no evidence for AI doom by Nora Belrose (5 Mar 2024 2:56 UTC; 5 points)

- TurnTrout 4 Mar 2024 20:47 UTC
  LW: 5 AF: 4
  2
  AF Parent
  I’m very happy with running counting arguments over the actual neural network parameter space; the problem there is just that I don’t think we understand it well enough to do so effectively.
  1. This is basically my position as well.
  2. The cited argument is a counting argument over the space of functions which achieve zero/low training loss.
  You could instead try to put a measure directly over the functions in your setup, but the problem there is that function space really isn’t the right space to run a counting argument like this; you need to be in algorithm space, otherwise you’ll do things like what happens in this post where you end up predicting overfitting rather than generalization (which implies that you’re using a prior that’s not suitable for running counting arguments on).
  Indeed, this is a crucial point that I think the post is trying to make. The cited counting arguments are counting functions instead of parameterizations. That’s the mistake (or at least “a” mistake). I’m glad we agree it’s a mistake, but then I’m confused about why you think that part of the post is unsound.
  (Rereads)
  Rereading the portion in question now, it seems that they have changed it a lot since the draft. Personally, I think their argumentation is now weaker than it was before. The original argumentation clearly explained the mistake of counting functions instead of parameterizations, while the present post does not. It instead abstracts it as “an indifference principle”, where the reader has to do the work to realize that indifference over functions is inappropriate.
  - Nora Belrose 5 Mar 2024 7:00 UTC
    1 point
    −9
    Parent
    I’m sorry to hear that you think the argumentation is weaker now.
    the reader has to do the work to realize that indifference over functions is inappropriate
    I don’t think that indifference over functions in particular is inappropriate. I think indifference reasoning in general is inappropriate.
    I’m very happy with running counting arguments over the actual neural network parameter space
    I wouldn’t call the correct version of this a counting argument. The correct version uses the actual distribution used to initialize the parameters as a measure, and not e.g. the Lebesgue measure. This isn’t appealing to the indifference principle at all, and so in my book it’s not a counting argument. But this could be terminological.
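
A minimal sketch of how the choice of measure over parameters can change the answer, with every specific an assumption made for illustration: a single threshold unit on two binary inputs, a hypothetical initializer (standard normal weights, zero bias) standing in for "the actual distribution used to initialize the parameters", and a uniform distribution on the box [-1, 1]^3 standing in for a normalized Lebesgue measure. It estimates, by Monte Carlo, the probability that the unit computes a constant function under each measure.

```python
import random

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def is_constant(w1, w2, b):
    """True if the threshold unit gives the same output on all four inputs."""
    outputs = {int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in INPUTS}
    return len(outputs) == 1


def sample_init(rng):
    """Hypothetical initializer: N(0, 1) weights, bias fixed at zero (an assumption)."""
    return rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), 0.0


def sample_box(rng):
    """Uniform over the box [-1, 1]^3, a stand-in for normalized Lebesgue measure."""
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))


def estimate(sample_params, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(unit is constant) under the given sampler."""
    rng = random.Random(seed)
    hits = sum(is_constant(*sample_params(rng)) for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    print("P(constant) under the initializer:", estimate(sample_init))
    print("P(constant) under the box measure:", estimate(sample_box))
```

For this toy the two probabilities can also be worked out by hand: 1/4 under the assumed initializer and 13/24 under the box measure, so the estimates should land near 0.25 and 0.54. Same parameter space, same event, different measure, different answer.
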