There aren’t really any non-extremely-leaky abstractions in big NNs on top of something like a “directions and simple functions on these directions” layer. (I originally heard this take from Buck.)
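(For concreteness, here is a minimal sketch of what a “directions and simple functions on these directions” layer could look like. This is my own hypothetical illustration, in the spirit of linear-probe-style interpretability, with random vectors standing in for real model activations; it is not something pulled from an actual model.)

```python
import numpy as np

# Hypothetical illustration: read "directions and simple functions on these
# directions" as projecting a hidden activation onto a learned direction and
# applying a simple scalar function (here a sigmoid) to the projection.

rng = np.random.default_rng(0)

d_model = 512                            # hidden size of a hypothetical model
activation = rng.normal(size=d_model)    # stand-in for one residual-stream activation
direction = rng.normal(size=d_model)     # stand-in for a learned "feature direction"
direction /= np.linalg.norm(direction)   # normalize so the projection is a clean dot product

projection = activation @ direction      # how strongly the activation points along the direction

def simple_readout(x: float) -> float:
    """A simple function on the direction's projection (sigmoid readout)."""
    return 1.0 / (1.0 + np.exp(-x))

print(f"projection = {projection:.3f}, readout = {simple_readout(projection):.3f}")
```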
Of course this depends on what it’s trained to do? And it’s false for humans and animals and corporations and markets: we have pretty good abstractions that allow us to predict and sometimes modify the behavior of these entities.
I’d be pretty shocked if this statement was true for AGI.
I think this is going to depend on exactly what you mean by non-extremely-leaky abstractions.
For the notion I was thinking of, I think humans, animals, corporations, and markets don’t seem to have this.
I’m thinking of something like “some decomposition or guide which lets you accurately predict all behavior”. And then the question is how good are the best abstractions in such a decomposition.
There are obviously less complete abstractions.
(Tbc, there are abstractions on top of “atoms” in humans and abstractions on top of chemicals. But I’m not sure if there are very good abstractions on top of neurons which let you really understand everything that is going on.)
Ah I see, I was referring to less complete abstractions. The “accurately predict all behavior” definition is fine, but it comes with a scale of how accurate the prediction is. “Directions and simple functions on these directions” probably misses some tiny details like floating-point errors, and if you wanted a human to understand it you’d have to use approximations that lose far more accuracy. I’m happy to lose accuracy in exchange for better predictions about behavior in previously-unobserved situations. In particular, it’s important to be able to work out what sort of previously-unobserved situation might lead to danger. We can do this with humans, animals, etc.; we can’t do it with “directions and simple functions on these directions”.