I definitely thought you were making a counting argument over function space, and AFAICT Joe also thought this in his report.
Sorry about that—I wish you had been at the talk and could have asked a question about this.
You’re making an argument about one type of learning procedure, Solomonoff induction, which is physically unrealizable and AFAICT has not even inspired any serious real-world approximations, and then assuming that the conclusions will somehow transfer over to a mechanistically very different learning procedure, gradient descent.
I agree that Solomonoff induction is obviously wrong in many ways, which is why you want to substitute it out for whatever the prior is that you think is closest to deep learning that you can still reason about theoretically. But that should never lead you to do a counting argument over function space, since that is never a sound thing to do.
> But that should never lead you to do a counting argument over function space, since that is never a sound thing to do.
Do you agree that “instrumental convergence → meaningful evidence for doom” is also unsound, because it’s a counting argument that most functions of shape Y have undesirable property X?
I think instrumental convergence does provide meaningful evidence of doom, and you can make a valid counting argument for it, but as with deceptive alignment you have to run the counting argument over algorithms, not over functions.
It’s not clear to me what an “algorithm” is supposed to be here, and I suspect that this might be cruxy. In particular I suspect (40-50% confidence) that:
1. You think there are objective and determinate facts about what “algorithm” a neural net is implementing, where
2. Algorithms are supposed to be something like a Boolean circuit or a Turing machine rather than a neural network, and
3. We can run counting arguments over these objective algorithms, which are distinct both from the neural net itself and from the function it expresses.
I reject all three of these premises, but I would consider it progress if I got confirmation that you in fact believe in them.
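To make the crux concrete, here is a toy illustration (my own sketch, not an argument made by anyone in the thread): for a tiny network we can exhaustively compare counting over parameter settings with counting over the functions those settings express, and see that the two measures come apart badly.

```python
# Toy illustration: counting over parameter space vs. function space.
# Enumerate every weight setting of a tiny 2-2-1 step-activation network
# (weights and biases in {-1, 0, 1}) and tally which Boolean function of
# two inputs each setting computes.
from itertools import product
from collections import Counter

VALS = (-1, 0, 1)

def step(z):
    return 1 if z > 0 else 0

def net(params, x1, x2):
    w11, w12, b1, w21, w22, b2, v1, v2, c = params
    h1 = step(w11 * x1 + w12 * x2 + b1)
    h2 = step(w21 * x1 + w22 * x2 + b2)
    return step(v1 * h1 + v2 * h2 + c)

counts = Counter()
for params in product(VALS, repeat=9):
    truth_table = tuple(net(params, x1, x2)
                        for x1, x2 in product((0, 1), repeat=2))
    counts[truth_table] += 1

# 3^9 = 19683 parameter settings collapse onto at most 16 functions,
# and very unevenly: a uniform count over functions weights each one
# equally, while the parameter-space count concentrates on a few.
print(len(counts), "distinct functions from", 3 ** 9, "parameter settings")
print("most common:", counts.most_common(3))
```

The parameter-to-function map is many-to-one and highly non-uniform, which is exactly why a counting argument over functions need not say anything about what a counting argument over parameterizations (or "algorithms") would conclude.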
So today we’ve learned that:

- The real counting argument that Evan believes in is just a repackaging of Paul’s argument for the malignity of the Solomonoff prior, and not anything novel.
- Evan admits that Solomonoff is a very poor guide to neural network inductive biases.

At this point, I’m not sure why you’re privileging the hypothesis of scheming at all.
> you want to substitute it out for whatever the prior is that you think is closest to deep learning that you can still reason about theoretically.
I mean, the neural network Gaussian process is literally this, and you can make it more realistic by using the neural tangent kernel to simulate training dynamics, perhaps with some finite width corrections. There is real literature on this.
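For readers unfamiliar with the NNGP referenced above, here is a minimal sketch (my own, not code from the thread or the literature it alludes to) of the standard infinite-width NNGP covariance for a deep fully connected ReLU network, computed by the well-known arccosine recursion:

```python
import numpy as np

def nngp_relu_kernel(X, depth=3, sigma_w=np.sqrt(2.0), sigma_b=0.0):
    """Infinite-width NNGP covariance for a deep fully connected ReLU net.

    Standard arccosine recursion: with (u, v) ~ N(0, K^l), the next
    layer's kernel entry is sigma_b^2 + sigma_w^2 * E[relu(u) relu(v)].
    """
    # First-layer kernel from the raw inputs.
    K = sigma_w**2 * (X @ X.T) / X.shape[1] + sigma_b**2
    for _ in range(depth):
        d = np.sqrt(np.diag(K))
        norm = np.outer(d, d)
        theta = np.arccos(np.clip(K / norm, -1.0, 1.0))
        # Closed form of E[relu(u) relu(v)] for the ReLU nonlinearity.
        K = (sigma_w**2 / (2 * np.pi)) * norm \
            * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) + sigma_b**2
    return K
```

With the He-style scaling sigma_w² = 2 and sigma_b = 0 used as defaults here, the diagonal of the kernel is preserved from layer to layer, which is what keeps the recursion stable at depth.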
> The real counting argument that Evan believes in is just a repackaging of Paul’s argument for the malignity of the Solomonoff prior, and not anything novel.
I’m going to stop responding to you now, because it seems that you are just not reading anything that I am saying. For the last time, my criticism has absolutely nothing to do with Solomonoff induction in particular, as I have now tried to explain to you here and here and here etc.
> I mean, the neural network Gaussian process is literally this, and you can make it more realistic by using the neural tangent kernel to simulate training dynamics, perhaps with some finite width corrections. There is real literature on this.
Yes—that’s exactly the sort of counting argument that I like! Though note that it can be very hard to reason properly about counting arguments once you’re using a prior like that; it gets quite tricky to connect those sorts of low-level properties to high-level properties about stuff like deception.
I’ve read every word of all of your comments. I know that you think your criticism isn’t dependent on Solomonoff induction in particular, because you also claim that a counting argument goes through under a circuit prior. It still seems like you view the Solomonoff case as the central one, because you keep talking about “bitstrings.” And I’ve repeatedly said that I don’t think the circuit prior works either, and why I think that.
At no point in this discussion have you provided any reason for thinking that, in fact, the Solomonoff prior and/or circuit prior do provide non-negligible evidence about neural network inductive biases, despite the very obvious mechanistic disanalogies.
> Yes—that’s exactly the sort of counting argument that I like!
Then make an NNGP counting argument! I have not seen such an argument anywhere. You seem to be alluding to unpublished, or at least little-known, arguments that did not make their way into Joe’s scheming report.
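For concreteness, here is the shape such an argument could take (a hypothetical sketch of my own, not an argument anyone in the thread has actually made): treat the NNGP as a Gaussian prior over functions on a finite input grid, and estimate the prior measure of some high-level property by Monte Carlo sampling. Monotonicity stands in below for whatever property the argument actually cares about.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inputs: a 1-D grid; a "function" is its vector of outputs on the grid.
x = np.linspace(-1.0, 1.0, 16).reshape(-1, 1)

# One-hidden-layer ReLU NNGP covariance via the arccosine closed form.
K0 = x @ x.T + 1.0  # input kernel with a bias term
d = np.sqrt(np.diag(K0))
theta = np.arccos(np.clip(K0 / np.outer(d, d), -1.0, 1.0))
K = np.outer(d, d) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

# Draw function samples from the GP prior and measure how often a
# stand-in "high-level property" (monotonicity on the grid) holds.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))
samples = (L @ rng.standard_normal((len(x), 100_000))).T
frac_monotone = np.mean(np.all(np.diff(samples, axis=1) >= 0, axis=1))
print("prior measure of monotone functions ≈", frac_monotone)
```

The hard part, as noted above, is not drawing samples but connecting a property you can actually check, like monotonicity here, to a high-level property like deception; this sketch only shows where the measure-over-functions would come from.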