I think that’s a pretty simplistic view of the post
To clarify, I wasn’t claiming that the point of this post is to mock neural network proponents. It isn’t; the mockery is just a few paragraphs of the post. I’ve updated my original comment to clarify.
And Eliezer saying he’s “no fan of neurons” is in the context of him responding to a comment by someone with the username Marvin Minsky defending the book Perceptrons (the post is from the Overcoming Bias era, when comments did not have threading or explicit parents).
Can you say more why you think that context is relevant? He says “this may be clearer from other posts”, which implies to me that his “not being a fan of neurons” is not specific to that particular discussion (since I imagine he wrote those other posts independently of Marvin_Minsky’s comment).
(I have more things to say in response to your comment here, but I’d like to hear your answer to the above first!)
Can you say more why you think that context is relevant?
Yeah; from my perspective the main question here is something like “how much nuance does a statement have, and what does that imply about how far you can draw inferences from it?”. I think people are often rounding Eliezer off to a simplified model and then judging the simplified model’s predictions and then attributing that judgment to Eliezer, in a way that I think is probably inaccurate.
For this particular point, there’s also the question of what a “fan of neurons” even is; the sorts you see today are pretty different from the sorts you would see back in 2010, and different from the sort that Marvin Minsky would have seen.
Not as relevant to the narrow point, but worth pointing out somewhere: I’m pretty sure that even if Eliezer had been aware of the potential of modern ANNs ahead of time, he probably would have filtered that out of his public speech because of concerns about the alignability of those architectures, in a way that makes it not obvious how to count predictions. [Of course he can’t get any points for secretly predicting it without hashed comments, but it seems less obvious that he should lose points for not predicting it.]
Thanks for the additional response. I’ve thought through the details here as well. I think that the written artifacts he left are not the kinds of writings left by someone who actually thinks neural networks will probably work, capabilities-wise.
As you read through these collected quotes, consider how strongly “he doesn’t expect ANNs to work” and “he expects ANNs to work” predict each quote:
In Artificial Intelligence, everyone outside the field has a cached result for brilliant new revolutionary AI idea—neural networks, which work just like the human brain! New AI Idea: complete the pattern: “Logical AIs, despite all the big promises, have failed to provide real intelligence for decades—what we need are neural networks!”
This cached thought has been around for three decades. Still no general intelligence. But, somehow, everyone outside the field knows that neural networks are the Dominant-Paradigm-Overthrowing New Idea, ever since backpropagation was invented in the 1970s. Talk about your aging hippies.
...
I’m no fan of neurons; this may be clearer from other posts
...
But there is just no law which says that if X has property A and Y has property A then X and Y must share any other property. “I built my network, and it’s massively parallel and interconnected and complicated, just like the human brain from which intelligence emerges! Behold, now intelligence shall emerge from this neural network as well!” And nothing happens. Why should it?
...
Wasn’t it in some sense reasonable to have high hopes of neural networks? After all, they’re just like the human brain, which is also massively parallel, distributed, asynchronous, and -
Hold on. Why not analogize to an earthworm’s brain, instead of a human’s?
A backprop network with sigmoid units… actually doesn’t much resemble biology at all. Around as much as a voodoo doll resembles its victim. The surface shape may look vaguely similar in extremely superficial aspects at a first glance. But the interiors and behaviors, and basically the whole thing apart from the surface, are nothing at all alike. All that biological neurons have in common with gradient-optimization ANNs is… the spiderwebby look.
And who says that the spiderwebby look is the important fact about biology? Maybe the performance of biological brains has nothing to do with being made out of neurons, and everything to do with the cumulative selection pressure put into the design.
Do these strike you as things which could plausibly be written by someone who actually anticipated the modern revolution?
there’s also the question of what a “fan of neurons” even is; the sorts you see today are pretty different from the sorts you would see back in 2010, and different from the sort that Marvin Minsky would have seen.
If Eliezer wasn’t a fan of those particular ANNs, in 2010, because the specific setups that had already been tried empirically hadn’t yet led to AGI… that’s an uninteresting complaint. It’s trivial. ANN proponents also wouldn’t have anticipated AGI from experiments which had already been tried and failed to produce it.
The interesting version of the claim is the one which talks about research directions, no? About being excited about neural network research in terms of its future prospects?
I’m pretty sure that even if Eliezer had been aware of the potential of modern ANNs ahead of time, he probably would have filtered that out of his public speech because of concerns about the alignability of those architectures
In the world where he was secretly aware, he could have pretended not to expect much of ANNs. In that case, that’s dishonest. It’s also risky; it’s possibly safer to just not bring the topic up at all and not direct even more attention to the matter. If you think that X is a capabilities hazard, then I think a good rule of thumb is: don’t talk about X.
So, even privileging this “he secretly knew” hypothesis by considering it explicitly, it isn’t predicting observed reality particularly strongly, since “don’t talk about it at all” is another reasonable prediction of that hypothesis, and that didn’t happen.
in a way that makes it not obvious how to count predictions.
Let’s consider what incentives we want to set up. We want people who can predict the future to be recognized and appreciated, and we want people who can’t to be taken less seriously in such domains. We do not want predictions to communicate sociohazardous content.
For sociohazards like this kind of problematic prediction, hashed comments should suffice quite well. You can’t fake it if you can’t predict it in advance; if you can predict it in advance, you can still get credit without leaking much information.
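(To be concrete about the mechanism: a minimal sketch of a commit-reveal scheme, assuming SHA-256 and a random salt. The wording of the example prediction and the function names are made up purely for illustration.)

```python
import hashlib
import secrets

def commit(prediction: str) -> tuple[str, str]:
    """Return (digest, salt). Publish the digest now; keep the salt and text private."""
    salt = secrets.token_hex(16)  # random salt so a short prediction can't be brute-forced
    digest = hashlib.sha256((salt + prediction).encode("utf-8")).hexdigest()
    return digest, salt

def verify(prediction: str, salt: str, published_digest: str) -> bool:
    """After the event, reveal the prediction and salt; anyone can recompute the hash."""
    return hashlib.sha256((salt + prediction).encode("utf-8")).hexdigest() == published_digest

# Hypothetical usage: commit publicly in 2010, reveal in 2023.
digest, salt = commit("Scaled-up gradient-trained ANNs will drive the next AI capability jump.")
assert verify("Scaled-up gradient-trained ANNs will drive the next AI capability jump.", salt, digest)
```

The salt matters because without it, anyone could brute-force a small space of plausible predictions against the published digest.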
I am therefore (hopefully predictably) unimpressed by hypotheses around secret correct predictions which clash with his actual public writing, unless he had verifiably contemporary predictions which were secret but correct.
[Of course he can’t get any points for secretly predicting it without hashed comments, but it seems less obvious that he should lose points for not predicting it.]
Conservation of expected evidence. If you would have updated upwards on his predictive abilities if he had made hashed comments and then revealed them, then observing not-that makes you update downwards (eta—on average, with a few finicky details here that I think work out to the same overall conclusion; happy to discuss if you want).
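(For concreteness, the identity I have in mind, writing $H$ for “he had strong foresight here” and $E$ for “a correct hashed prediction was made and later revealed”:)

$$P(H) = P(H \mid E)\,P(E) + P(H \mid \neg E)\,P(\neg E)$$

If $P(H \mid E) > P(H)$, this identity forces $P(H \mid \neg E) < P(H)$; the size of the downward update depends on how probable you thought $E$ was in the first place.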
Do these strike you as things which could plausibly be written by someone who actually anticipated the modern revolution?
I do not think I claimed that Eliezer anticipated the modern revolution, and I would not claim that based on those quotes.
The point that I have been attempting to make since here is that ‘neural networks_2007’, and the ‘neural networks_1970s’ Eliezer describes in the post, did not point to the modern revolution; in fact other things were necessary. I see your point that this is maybe a research taste question (even if it doesn’t point to the right idea directly, does it at least point there indirectly?), and my answer is that I think it’s evidence against Eliezer’s research taste (on what will work, not necessarily on what will be alignable).
[I also have long thought Eliezer’s allergy to the word “emergence” is misplaced (and that it’s a useful word while thinking about dynamical systems modeling in a reductionistic way, which is a behavior that I think he approves of) while agreeing with him that I’m not optimistic about people whose plan for building intelligence doesn’t route thru them understanding what intelligence is and how it works in a pretty deep way.]
Conservation of expected evidence. If you would have updated upwards on his predictive abilities if he had made hashed comments and then revealed them, then observing not-that makes you update downwards (eta—on average, with a few finicky details here that I think work out to the same overall conclusion; happy to discuss if you want).
I agree with regard to Bayesian superintelligences but not bounded agents, mostly because I think this depends on how you do the accounting. Consider the difference between scheme A, where you transfer prediction points from everyone who didn’t make a correct prediction to people who did make correct predictions, and scheme B, where you transfer prediction points from people who make incorrect predictions to people who make correct predictions, leaving untouched people who didn’t make predictions. On my understanding, things like logical induction and infrabayesianism look more like scheme B.
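(To make the bookkeeping concrete, here is a toy sketch of the two schemes; the names, numbers, and payout rule are made up purely for illustration and aren’t meant to capture how logical induction actually does its accounting.)

```python
def settle_scheme_a(predictions, everyone, stake=1.0):
    """Scheme A: everyone without a correct prediction (including people who
    stayed silent) pays into a pot split among the correct predictors."""
    correct = [p for p, ok in predictions.items() if ok]
    payers = [x for x in everyone if x not in correct]
    scores = {x: 0.0 for x in everyone}
    for x in payers:
        scores[x] -= stake
    for x in correct:
        scores[x] += stake * len(payers) / len(correct)
    return scores

def settle_scheme_b(predictions, everyone, stake=1.0):
    """Scheme B: only people who made incorrect predictions pay; people who
    made no prediction are left untouched."""
    correct = [p for p, ok in predictions.items() if ok]
    incorrect = [p for p, ok in predictions.items() if not ok]
    scores = {x: 0.0 for x in everyone}
    for x in incorrect:
        scores[x] -= stake
    for x in correct:
        scores[x] += stake * len(incorrect) / len(correct)
    return scores

# Toy example: Alice predicted correctly, Bob predicted incorrectly, Carol stayed silent.
predictions = {"Alice": True, "Bob": False}
everyone = ["Alice", "Bob", "Carol"]
print(settle_scheme_a(predictions, everyone))  # Carol loses points despite staying silent
print(settle_scheme_b(predictions, everyone))  # Carol is untouched
```

Under scheme B, staying silent costs Carol nothing, which is the sense in which not making a prediction differs from making a wrong one.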
I do not think I claimed that Eliezer anticipated the modern revolution, and I would not claim that based on those quotes.
The point that I have been attempting to make since here is that ‘neural networks_2007’, and the ‘neural networks_1970s’ Eliezer describes in the post, did not point to the modern revolution; in fact other things were necessary.
I apologize if I have misunderstood your intended point. Thanks for the clarification. I agree with this claim (insofar as I understand what the 2007 landscape looked like, which may be “not much”). I don’t think the claim is that interesting, though; this might be coming down to semantics.
The following is what I perceived us to disagree on, so I’d consider us to be in agreement on the point I originally wanted to discuss:
I see your point that this is maybe a research taste question (even if it doesn’t point to the right idea directly, does it at least point there indirectly?), and my answer is that I think it’s evidence against Eliezer’s research taste (on what will work, not necessarily on what will be alignable).
I’m not optimistic about people whose plan for building intelligence doesn’t route thru them understanding what intelligence is and how it works in a pretty deep way
Yeah. I think that in a grown-up world, we would do this, and really take our time.
On my understanding, things like logical induction and infrabayesianism look more like scheme B.
Nice, I like this connection. I’ll think more about this; I don’t want to hastily unpack my thoughts into a response which isn’t true to my intuitions here.
I was recently looking at Yudkowsky’s (2008) “Artificial Intelligence as a Positive and Negative Factor in Global Risk” and came across this passage which seems relevant here:
Friendly AI is not a module you can instantly invent at the exact moment when it is first needed, and then bolt on to an existing, polished design which is otherwise completely unchanged.
The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque—the user has no idea how the neural net is making its decisions—and cannot easily be rendered unopaque; the people who invented and polished neural networks were not thinking about the long-term problems of Friendly AI. Evolutionary programming (EP) is stochastic, and does not precisely preserve the optimization target in the generated code; EP gives you code that does what you ask, most of the time, under the tested circumstances, but the code may also do something else on the side. EP is a powerful, still maturing technique that is intrinsically unsuited to the demands of Friendly AI. Friendly AI, as I have proposed it, requires repeated cycles of recursive self-improvement that precisely preserve a stable optimization target.
The most powerful current AI techniques, as they were developed and then polished and improved over time, have basic incompatibilities with the requirements of Friendly AI as I currently see them. The Y2K problem—which proved very expensive to fix, though not global-catastrophic—analogously arose from failing to foresee tomorrow’s design requirements. The nightmare scenario is that we find ourselves stuck with a catalog of mature, powerful, publicly available AI techniques which combine to yield non-Friendly AI, but which cannot be used to build Friendly AI without redoing the last three decades of AI work from scratch.