Seth Herd comments on 10 quick takes about AGI

Seth Herd 23 Jun 2023 4:12 UTC
8 points
0
I love the format!
1. agree completely
2. agree completely
3. Capabilities won’t be limited by compute OR data? So capabilities will be purely limited by algorithmic improvements? This seems extreme; it seems like better (if not more) data is still useful, and more compute seems highly relevant today.
4. I agree that a human supergenius that just happens to be able to duplicate themselves and think faster with more compute could probably rearrange all the matter in its light cone. A village idiot could not. And I agree that humans are way smarter at some tasks than others. I think this makes discussion of how smart machines can get pretty irrelevant, and it makes the discussion of how fast they can get smarter less relevant. They’re almost smart enough to take over, and it will take less than a generation (generously) to get there. We need alignment solutions ASAP. It could already be too late to implement solutions that don’t apply to current directions in AI.
5. Agree. Goal direction and abstract reasoning are highly useful for almost any cognitive task. Goal direction allows you to split the task into simpler components and work on them separately. Abstract reasoning saves tons of computation and corresponds to the structure of the world.
6. I think you need to say complex cognition emerges on difficult tasks with enough effort, enough data variety, and good learning algorithms. How much is enough in each category is debatable, and they clearly trade off against each other.
7. Yes, but I don’t think evolution is a very useful upper bound on difficulty of thinking up algorithms. Evolution “thinks” completely differently than us or any agent. And it’s had an unimaginably vast amount of “compute” in physical systems to work with over time.
8. Yes, but again the comparison is of dubious use. There are a lot of molecules in the brain, and if count each receptor as a calculation (which it is), we’d get a very high compute requirement. It’s clearly lower, but how much lower is debatable and debated. I’d rather reason from GPT4 and other existing deep networks directly. GPT4 is AGI without memory, executive function, or sensory systems. Each of those can be estimated from existing systems. My answer would also be about 5x GPT4, except for the likely algorithmic improvements.
9. Agreed, but every time someone says how we “should” build AGI or what we “should” allow it to do, I tune out if it’s not followed immediately by an idea of how we COULD convince the world to take this path. Having AI try doing everything is tempting from both an economic and curiosity standpoint.
10. Absolutely. This is the sharp left turn logic, nicely spelled out in A Friendly Face (Another Failure Story). Honesty prior to contextual awareness might be encouraging, but I really don’t want to launch any system without strong corrigibility and interpretability. I’m aware we might have to try aligning such a system, but we really should be shooting for better options that align with the practicalities of AI development.
- Max H 23 Jun 2023 5:06 UTC
  2 points
  0
  Parent
  On (3), I’m more saying, capabilities won’t be bottlenecked on more data or compute. Before, say, 2019 (GPT2 release) AI researchers weren’t actually using enough data or compute for many potential algorithmic innovations to be relevant, regardless of what was available theoretically at the time.
  
  But now that we’re past a minimum threshold of using enough compute and data where lots of things have started working at all, I claim / predict that capabilities researchers will always be able to make meaningful and practical advances just by improving algorithms. More compute and more data could also be helpful, but I consider that to be kind of trivial—you can always get better performance from a Go or Chess engine by letting it run for longer to search deeper in the game tree by brute force.
  
  And it’s had an unimaginably vast amount of “compute” in physical systems to work with over time.
  
  A few billion years of very wasteful and inefficient trial-and-error by gradual mutation on a single planet doesn’t seem too vast, in the grand scheme of things. Most of the important stuff (in terms of getting to human-level intelligence) probably happened in the last few million years. Maybe it takes planet-scale or even solar-system scale supercomputers running for a few years to reproduce / simulate. I would bet that it doesn’t take anything galaxy-scale.
  
  On (9): yeah, I was mainly just pointing out a potentially non-obvious use and purpose of some research that people sometimes don’t see the relevance of. Kind of straw, but I think that some people look at e.g. logical decision theory, and say “how the heck am I supposed to build this into an ML model? I can’t, therefore this is not relevant.”
  
  And one reply is that you don’t build it in directly: a smart enough AI system will hit on LDT (or something better) all by itself. We thus want to understand LDT (and other problems in agent foundations) so that we can get out in front of that and see it coming.